HBase SQL Connector

    The HBase connector allows for reading from and writing to an HBase cluster. This document describes how to set up the HBase connector to run SQL queries against HBase.

    The HBase connector always works in upsert mode when exchanging changelog messages with the external system, using the primary key defined in the DDL. The primary key must be defined on the HBase rowkey field (the rowkey field must be declared). If the PRIMARY KEY clause is not declared, the HBase connector uses the rowkey as the primary key by default.

    All column families in an HBase table must be declared as ROW type: the field name maps to the column family name, and the nested field names map to the column qualifier names. There is no need to declare every family and qualifier in the schema; users can declare only what is used in the query. Apart from the ROW type fields, the single field of atomic type (e.g. STRING, BIGINT) is recognized as the HBase rowkey. The rowkey field can have an arbitrary name, but it must be quoted with backticks if the name is a reserved keyword.

    -- register the HBase table 'mytable' in Flink SQL
    CREATE TABLE hTable (
      rowkey INT,
      family1 ROW<q1 INT>,
      family2 ROW<q2 STRING, q3 BIGINT>,
      family3 ROW<q4 DOUBLE, q5 BOOLEAN, q6 STRING>,
      PRIMARY KEY (rowkey) NOT ENFORCED
    ) WITH (
      'connector' = 'hbase-1.4',
      'table-name' = 'mytable',
      'zookeeper.quorum' = 'localhost:2181'
    );

    -- use the ROW(...) construction function to build column families and write data into the HBase table
    -- assuming the schema of "T" is [rowkey, f1q1, f2q2, f2q3, f3q4, f3q5, f3q6]
    INSERT INTO hTable
    SELECT rowkey, ROW(f1q1), ROW(f2q2, f2q3), ROW(f3q4, f3q5, f3q6) FROM T;

    -- scan data from the HBase table
    SELECT rowkey, family1, family3.q4, family3.q6 FROM hTable;

    -- temporal join the HBase table as a dimension table
    -- assuming "myTopic" has a processing-time attribute "proctime" and a join column "key"
    SELECT * FROM myTopic
    LEFT JOIN hTable FOR SYSTEM_TIME AS OF myTopic.proctime
    ON myTopic.key = hTable.rowkey;
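
    The schema rules described above (a partial declaration of families and qualifiers, and a backtick-quoted rowkey) can be sketched as follows. This is a minimal illustration rather than a recommended layout: the table name hUsers and the rowkey name `user` are hypothetical, chosen only because USER is a reserved keyword, and the connector options are copied from the example above.

```sql
-- Hypothetical table: only family3.q4 is declared, even though the
-- underlying HBase table 'mytable' may hold more families and qualifiers.
CREATE TABLE hUsers (
  `user` STRING,              -- the single atomic field is treated as the rowkey; quoted because USER is reserved
  family3 ROW<q4 DOUBLE>,     -- column family 'family3', qualifier 'q4'
  PRIMARY KEY (`user`) NOT ENFORCED
) WITH (
  'connector' = 'hbase-1.4',
  'table-name' = 'mytable',
  'zookeeper.quorum' = 'localhost:2181'
);

-- Nested qualifiers are addressed as family.qualifier.
SELECT `user`, family3.q4 FROM hUsers;
```

    Declaring only the qualifiers a query needs keeps the DDL short; undeclared families and qualifiers are simply not visible through this table.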

    HBase stores all data as byte arrays. The data needs to be serialized and deserialized during read and write operations.

    The Flink HBase connector encodes null values as empty bytes and decodes empty bytes as null values for all data types except the string type. For the string type, the null literal is determined by the null-string-literal option.
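
    As a sketch of how the null handling above might be configured, the table below sets the null-string-literal option explicitly. The table name hNullable and the literal 'N/A' are assumptions made for illustration; the remaining options are taken from the earlier example.

```sql
-- Hypothetical table illustrating null handling for string vs. non-string qualifiers.
CREATE TABLE hNullable (
  rowkey INT,
  family2 ROW<q2 STRING, q3 BIGINT>,
  PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
  'connector' = 'hbase-1.4',
  'table-name' = 'mytable',
  'zookeeper.quorum' = 'localhost:2181',
  'null-string-literal' = 'N/A'   -- assumed literal: a NULL q2 is stored as the bytes of 'N/A'
);

-- q2 (STRING) is NULL: written using the configured null literal.
-- q3 (BIGINT) is NULL: written as empty bytes.
INSERT INTO hNullable
SELECT 1, ROW(CAST(NULL AS STRING), CAST(NULL AS BIGINT));
```

    A distinct literal is needed for strings because an empty byte array is already a valid encoding of the empty string, so it cannot also stand for NULL.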

    The data type mappings are as follows: