Running Apache HBase on Alluxio

    This guide describes how to run Apache HBase on Alluxio, so that you can store HBase tables in Alluxio at various storage levels.

    • Alluxio has been set up and is running.
    • Make sure that the Alluxio client jar is available. It can be found at /<PATH_TO_ALLUXIO>/client/alluxio-2.3.0-client.jar in the tarball downloaded from Alluxio. Alternatively, advanced users can compile the client jar from source by following the Alluxio build instructions.
    • Follow the HBase guides to set up HBase.

    Apache HBase allows you to use Alluxio through a generic file system wrapper for the Hadoop file system. Therefore, the configuration of Alluxio is done mostly in HBase configuration files.

    Set the following properties in conf/hbase-site.xml and make sure all HBase cluster nodes have the configuration.

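    Set the hbase.rootdir property to an Alluxio URI of the form alluxio://<ALLUXIO_MASTER_HOSTNAME>:19998/hbase, where /hbase is the Alluxio directory that will hold the HBase data.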

    You also need to add the file system implementation classes to the HBase configuration. These classes are provided in the Alluxio client jar.

        <property>
          <name>fs.alluxio.impl</name>
          <value>alluxio.hadoop.FileSystem</value>
        </property>
        <property>
          <name>fs.AbstractFileSystem.alluxio.impl</name>
          <value>alluxio.hadoop.AlluxioFileSystem</value>
        </property>

    Also add the following property to the same file hbase-site.xml:

        <property>
          <name>hbase.regionserver.hlog.syncer.count</name>
          <value>1</value>
        </property>

    If you are running an HBase version later than 2.0, add the following property:

        <property>
          <name>hbase.unsafe.stream.capability.enforce</name>
          <value>false</value>
        </property>
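
    To check that HBase picks up these settings, one option is to print the value it resolves for a key. The sketch below wraps HBaseConfTool, a small utility class in the HBase distribution that prints the value of a configuration key. The function name hbase_conf_value is made up for this example, and HBASE_HOME is assumed to point at the HBase installation.

```shell
# Print the value HBase resolves for a configuration key (for example hbase.rootdir).
# hbase_conf_value is a hypothetical helper name; HBASE_HOME must be set.
hbase_conf_value() {
  "${HBASE_HOME:?set HBASE_HOME to the HBase installation directory}/bin/hbase" \
    org.apache.hadoop.hbase.util.HBaseConfTool "$1"
}
```

    Running hbase_conf_value hbase.rootdir on a node should then print the alluxio:// URI set in conf/hbase-site.xml.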

    Distribute the Alluxio Client jar

    We need to make the Alluxio client jar file available to HBase, because it contains the configured alluxio.hadoop.FileSystem class.

    Specify the location of the jar file in the $HBASE_CLASSPATH environment variable (make sure it’s available on all cluster nodes). For example:
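
    A minimal sketch of this setting, assuming it is run in the shell that launches HBase or placed in conf/hbase-env.sh. ALLUXIO_CLIENT_JAR is a helper variable introduced here for readability, and <PATH_TO_ALLUXIO> remains a placeholder for your installation directory.

```shell
# Location of the Alluxio client jar named in the prerequisites.
# Replace <PATH_TO_ALLUXIO> with the actual Alluxio installation directory.
ALLUXIO_CLIENT_JAR="/<PATH_TO_ALLUXIO>/client/alluxio-2.3.0-client.jar"

# Prepend the jar to HBASE_CLASSPATH, keeping any entries that are already set.
export HBASE_CLASSPATH="${ALLUXIO_CLIENT_JAR}${HBASE_CLASSPATH:+:${HBASE_CLASSPATH}}"
```

    The ${HBASE_CLASSPATH:+...} expansion avoids a trailing colon when the variable was previously empty.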

    An alternative way to distribute the jar, copying it into the HBase lib directories, is shown later in this guide.

    Make sure the alluxio scheme is recognized, then start HBase:

        $ ${HBASE_HOME}/bin/start-hbase.sh

    If the scheme is not recognized, follow the Usage FAQs as needed.

    Visit the HBase Web UI at http://<HBASE_MASTER_HOSTNAME>:16010 to confirm that HBase is running on Alluxio (check the HBase Root Directory attribute).

    Then visit the Alluxio Web UI at http://<ALLUXIO_MASTER_HOSTNAME>:19999 and click Browse to see the files HBase stores on Alluxio, including data and WALs.

    [Screenshot: HBase root directory on Alluxio]

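    Create a text file named simple_test.txt with the following HBase shell commands:

        create 'test', 'cf'
        # Insert 10,000 rows; the 'cf:a' qualifier and the values are illustrative assumptions
        for i in Array(0..9999)
          put 'test', 'row' + i.to_s, 'cf:a', 'value' + i.to_s
        end
        list 'test'
        scan 'test', {LIMIT => 10, STARTROW => 'row1'}
        get 'test', 'row1'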

    Run the following command from the top-level HBase project directory:

        $ bin/hbase shell simple_test.txt

    You should see output that lists the test table, prints up to 10 rows starting at row1, and shows the contents of row1.

    If you have Hadoop installed, you can run a MapReduce utility program through the bin/hbase launcher to count the rows of the newly created table, such as the RowCounter job that ships with HBase.
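
    A row count of this kind can be sketched as below, using RowCounter, the MapReduce job bundled with HBase. The function name count_rows is hypothetical, and HBASE_HOME is assumed to point at the HBase installation.

```shell
# Count the rows of an HBase table with the bundled RowCounter MapReduce job.
# count_rows is a hypothetical helper name; $1 is the table name.
count_rows() {
  "${HBASE_HOME:?set HBASE_HOME to the HBase installation directory}/bin/hbase" \
    org.apache.hadoop.hbase.mapreduce.RowCounter "$1"
}
```

    For the table created by simple_test.txt, the call would be count_rows test.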

    After this MapReduce job finishes, you can see a result like this:

    [Screenshot: output of the Hadoop row-count job]

    When Alluxio is running in HA mode, change the hbase.rootdir property in conf/hbase-site.xml to use an HA-style Alluxio authority such as host1:19998,host2:19998,host3:19998 or zk@host1:2181,host2:2181,host3:2181.

        <property>
          <name>hbase.rootdir</name>
          <value>alluxio://master_hostname_1:19998,master_hostname_2:19998,master_hostname_3:19998/hbase</value>
        </property>

    See the Alluxio documentation on HA mode for more details.

    Add additional Alluxio site properties to HBase

    If there are any Alluxio site properties you want to specify for HBase, add them to hbase-site.xml. For example, to change alluxio.user.file.writetype.default from the default ASYNC_THROUGH to CACHE_THROUGH:

        <property>
          <name>alluxio.user.file.writetype.default</name>
          <value>CACHE_THROUGH</value>
        </property>
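
    Alternatively, instead of setting $HBASE_CLASSPATH, copy the Alluxio client jar into the HBase lib directories (make sure it is available on all cluster nodes):

        $ cp /<PATH_TO_ALLUXIO>/client/alluxio-2.3.0-client.jar /path/to/hbase-master/lib/
        $ cp /<PATH_TO_ALLUXIO>/client/alluxio-2.3.0-client.jar /path/to/current/hbase-client/lib/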

    Logging Configuration

    In order to change the logging configuration for HBase, you can modify your installation’s log4j.properties file.