Parquet Format

    The Apache Parquet format allows reading and writing Parquet data.

    Here is an example that creates a table using the Filesystem connector and the Parquet format.

    -- Table name is assumed (taken from the path).
    CREATE TABLE user_behavior (
      item_id BIGINT,
      category_id BIGINT,
      ts TIMESTAMP(3),
      dt STRING
    ) WITH (
      'connector' = 'filesystem',
      'path' = '/tmp/user_behavior',
      'format' = 'parquet'
    )

    Currently, the Parquet format type mapping is compatible with Apache Hive, but differs from Apache Spark:

    • Timestamp: the timestamp type is mapped to int96 regardless of its precision.
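To make the int96 mapping more concrete, the following Python sketch decodes and encodes a single INT96 timestamp value as it is conventionally laid out in Parquet files written by Hive-compatible tools: 8 bytes of nanoseconds within the day followed by 4 bytes of Julian day number, both little-endian. The function names are hypothetical and not part of any library, and interpreting the value as UTC is an assumption; whether a given writer stores UTC or local time depends on its configuration.

```python
import struct
from datetime import datetime, timedelta, timezone

# Julian day number of 1970-01-01, used to convert to the Unix epoch.
JULIAN_DAY_OF_UNIX_EPOCH = 2440588
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_int96_timestamp(raw: bytes) -> datetime:
    """Decode a 12-byte INT96 value into a datetime (hypothetical helper).

    Assumption: the stored value is interpreted as UTC.
    Sub-microsecond digits are dropped because datetime has
    microsecond resolution.
    """
    if len(raw) != 12:
        raise ValueError("INT96 value must be exactly 12 bytes")
    # "<qi": little-endian 8-byte nanos-of-day, then 4-byte Julian day.
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    return UNIX_EPOCH + timedelta(days=days, microseconds=nanos_of_day // 1000)


def encode_int96_timestamp(ts: datetime) -> bytes:
    """Encode a timezone-aware datetime as a 12-byte INT96 value."""
    delta = ts - UNIX_EPOCH
    # timedelta normalizes so that seconds and microseconds are non-negative.
    nanos_of_day = (delta.seconds * 1_000_000 + delta.microseconds) * 1000
    return struct.pack("<qi", nanos_of_day, delta.days + JULIAN_DAY_OF_UNIX_EPOCH)
```

Because the encoding always carries a full nanoseconds-of-day field, the same 12-byte layout is used whether the column is declared as TIMESTAMP(3) or with a higher precision.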