Running Apache HBase on Alluxio
This guide describes how to run Apache HBase on Alluxio, so that you can store HBase tables in Alluxio at various storage levels.
- Alluxio has been set up and is running.
- The Alluxio client jar is available. This jar file can be found at /<PATH_TO_ALLUXIO>/client/alluxio-2.3.0-client.jar in the tarball downloaded from Alluxio. Alternatively, advanced users can compile the client jar from the source code by following the instructions.
- HBase has been set up. Please follow the HBase setup guide if it has not.
Apache HBase allows you to use Alluxio through a generic file system wrapper for the Hadoop file system. Therefore, the configuration of Alluxio is done mostly in HBase configuration files.
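Set the following properties in conf/hbase-site.xml and make sure all HBase cluster nodes have the configuration.
Set the hbase.rootdir property to an Alluxio URI of the form alluxio://<ALLUXIO_MASTER_HOSTNAME>:<PORT>/hbase, where the port is the Alluxio master RPC port (19998 in the HA example later in this guide).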
You also need to add the file system implementation classes to the HBase configuration. These classes are provided in the Alluxio client jar.
<property>
<name>fs.alluxio.impl</name>
<value>alluxio.hadoop.FileSystem</value>
</property>
<property>
<name>fs.AbstractFileSystem.alluxio.impl</name>
<value>alluxio.hadoop.AlluxioFileSystem</value>
</property>
Also add the following property to the same hbase-site.xml file:
<property>
<name>hbase.regionserver.hlog.syncer.count</name>
<value>1</value>
</property>
If you are running an HBase version greater than 2.0, add the following property:
<property>
<name>hbase.unsafe.stream.capability.enforce</name>
<value>false</value>
</property>
Distribute the Alluxio Client jar
The Alluxio client jar file must be available to HBase, because it contains the configured alluxio.hadoop.FileSystem class.
Specify the location of the jar file in the $HBASE_CLASSPATH environment variable, and make sure the variable is set on all cluster nodes.
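As a minimal sketch of the $HBASE_CLASSPATH approach, the jar path from the prerequisites can be prepended to the variable, for instance in conf/hbase-env.sh. The ALLUXIO_CLIENT_JAR variable name is introduced here only for illustration, and /<PATH_TO_ALLUXIO> remains a placeholder for your installation path.

```shell
# Illustrative variable; replace /<PATH_TO_ALLUXIO> with your Alluxio install path.
ALLUXIO_CLIENT_JAR="/<PATH_TO_ALLUXIO>/client/alluxio-2.3.0-client.jar"

# Prepend the jar so HBase can load the Alluxio file system classes.
# The ${VAR:+...} form avoids a trailing colon when HBASE_CLASSPATH is unset.
export HBASE_CLASSPATH="${ALLUXIO_CLIENT_JAR}${HBASE_CLASSPATH:+:${HBASE_CLASSPATH}}"
```

Prepending rather than overwriting keeps any entries that are already present in HBASE_CLASSPATH.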
Alternatively, copy the jar into the HBase lib directories, as shown by the cp commands near the end of this guide.
Ensure that the alluxio scheme is recognized before starting HBase. If it is not, follow the Usage FAQs as needed.
Start HBase:
$ ${HBASE_HOME}/bin/start-hbase.sh
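One way to check that the alluxio scheme is recognized, before running start-hbase.sh, is to list the Alluxio root through the Hadoop command line. The sketch below assumes that a hadoop command is on the PATH and that the Alluxio client jar is also on the Hadoop classpath. The variable names and the localhost default are illustrative, not part of this guide.

```shell
# Hypothetical defaults: replace with your Alluxio master host and RPC port.
ALLUXIO_MASTER_HOSTNAME="${ALLUXIO_MASTER_HOSTNAME:-localhost}"
ALLUXIO_MASTER_PORT="${ALLUXIO_MASTER_PORT:-19998}"
ALLUXIO_ROOT_URI="alluxio://${ALLUXIO_MASTER_HOSTNAME}:${ALLUXIO_MASTER_PORT}/"

# The listing fails if Hadoop cannot resolve the alluxio:// scheme
# or cannot reach the Alluxio master.
if command -v hadoop >/dev/null 2>&1; then
  if hadoop fs -ls "${ALLUXIO_ROOT_URI}"; then
    echo "alluxio scheme recognized"
  else
    echo "could not list ${ALLUXIO_ROOT_URI}; see the Usage FAQs" >&2
  fi
else
  echo "hadoop command not found on PATH" >&2
fi
```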
Visit the HBase Web UI at http://<HBASE_MASTER_HOSTNAME>:16010 to confirm that HBase is running on Alluxio (check the HBase Root Directory attribute).
Then visit the Alluxio Web UI at http://<ALLUXIO_MASTER_HOSTNAME>:19999 and click Browse to see the files HBase stores on Alluxio, including data and WALs.
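Create a text file simple_test.txt containing the following HBase shell commands:
create 'test', 'cf'
# Insert 10000 rows named row0 through row9999 into column cf:a.
for i in Array(0..9999)
  put 'test', 'row' + i.to_s, 'cf:a', 'value' + i.to_s
end
list 'test'
scan 'test', {LIMIT => 10, STARTROW => 'row1'}
get 'test', 'row1'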
Run the following command from the top-level HBase project directory:
$ bin/hbase shell simple_test.txt
You should see the output of each command in turn, ending with the cells of row1.
If you have Hadoop installed, you can run a Hadoop utility program through the hbase launcher script to count the rows of the newly created table.
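A minimal sketch of such a row count uses the RowCounter MapReduce utility that ships with HBase. It assumes that HBASE_HOME points at the HBase installation (falling back to the current directory) and that the table is the test table created earlier. The TABLE_NAME and HBASE_BIN variable names are illustrative.

```shell
# Table created by simple_test.txt in this guide.
TABLE_NAME="test"
# Assumption: HBASE_HOME is the HBase installation directory.
HBASE_BIN="${HBASE_HOME:-.}/bin/hbase"

# RowCounter runs a MapReduce job that counts the rows of the given table.
if [ -x "${HBASE_BIN}" ]; then
  "${HBASE_BIN}" org.apache.hadoop.hbase.mapreduce.RowCounter "${TABLE_NAME}"
else
  echo "hbase launcher not found at ${HBASE_BIN}" >&2
fi
```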
After this MapReduce job finishes, its output includes the number of rows counted, which should be 10000 for the table created above.
When Alluxio is running in HA mode, change the hbase.rootdir property in conf/hbase-site.xml to use an HA-style Alluxio authority such as host1:19998,host2:19998,host3:19998 or zk@host1:2181,host2:2181,host3:2181, as in the following example:
<property>
<name>hbase.rootdir</name>
<value>alluxio://master_hostname_1:19998,master_hostname_2:19998,master_hostname_3:19998/hbase</value>
</property>
See the Alluxio documentation on HA authorities for more details.
Add additional Alluxio site properties to HBase
If there are any Alluxio site properties you want to specify for HBase, add them to hbase-site.xml. For example, change alluxio.user.file.writetype.default from the default ASYNC_THROUGH to CACHE_THROUGH:
<property>
<name>alluxio.user.file.writetype.default</name>
<value>CACHE_THROUGH</value>
</property>
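Instead of setting $HBASE_CLASSPATH, you can copy the Alluxio client jar into the HBase lib directories (make sure it is available on all cluster nodes):
$ cp /<PATH_TO_ALLUXIO>/client/alluxio-2.3.0-client.jar /path/to/hbase-master/lib/
$ cp /<PATH_TO_ALLUXIO>/client/alluxio-2.3.0-client.jar /path/to/current/hbase-client/lib/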
Logging Configuration
To change the logging configuration for HBase, modify your installation's log4j.properties file.