Hive (Beta)
Flink offers a two-fold integration with Hive. The first is to leverage Hive’s Metastore as a persistent catalog for storing Flink-specific metadata across sessions. The second is to offer Flink as an alternative engine for reading and writing Hive tables.
The Hive catalog is designed to be “out of the box” compatible with existing Hive installations. You do not need to modify your existing Hive Metastore or change the data placement or partitioning of your tables.
If you use a different minor Hive version such as 1.2.2 or 2.3.1, it should also be fine to choose the closest supported version, 1.2.1 (for 1.2.2) or 2.3.4 (for 2.3.1), as a workaround. For example, if you want Flink to integrate with a Hive 2.3.1 installation in the SQL Client, just set the hive-version to 2.3.4 in the YAML config. Similarly, pass the version string when creating a HiveCatalog instance via the Table API.
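As a minimal sketch of what such a SQL Client catalog entry might look like, assuming an illustrative catalog name and conf directory:

```yaml
catalogs:
  - name: myhive                    # illustrative catalog name
    type: hive
    hive-conf-dir: /opt/hive-conf   # illustrative path to the directory containing hive-site.xml
    hive-version: 2.3.4             # closest tested version, e.g. when running Hive 2.3.1
```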
Users are welcome to try out different versions with this workaround. Since only 2.3.4 and 1.2.1 have been tested, there might be unexpected issues. We will test and support more versions in future releases.
Connect to an existing Hive installation using the HiveCatalog through the table environment or YAML configuration.
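The following is a minimal sketch of registering a HiveCatalog through the table environment; the catalog name, default database, conf directory, and planner settings are illustrative assumptions, not a prescribed setup:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.catalog.hive.HiveCatalog;

public class HiveCatalogExample {
    public static void main(String[] args) {
        // Blink planner in batch mode; a streaming setup works the same way.
        EnvironmentSettings settings = EnvironmentSettings.newInstance()
                .useBlinkPlanner()
                .inBatchMode()
                .build();
        TableEnvironment tableEnv = TableEnvironment.create(settings);

        // All names and paths below are illustrative.
        String name            = "myhive";
        String defaultDatabase = "default";
        String hiveConfDir     = "/opt/hive-conf"; // directory containing hive-site.xml
        String version         = "2.3.4";          // closest supported Hive version

        HiveCatalog hive = new HiveCatalog(name, defaultDatabase, hiveConfDir, version);
        tableEnv.registerCatalog(name, hive);

        // Make the Hive catalog the current catalog for subsequent queries.
        tableEnv.useCatalog(name);
    }
}
```

Once the catalog is registered and set as the current catalog, the databases and tables in the Hive Metastore become visible to Flink queries.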
Currently HiveCatalog supports most Flink data types with the following mapping:
Limitations
- Hive’s CHAR(p) has a maximum length of 255
- Hive’s MAP only supports primitive key types, while Flink’s MAP can be any data type
- Hive’s UNION type is not supported
- Flink’s INTERVAL type cannot be mapped to Hive’s INTERVAL type
- Flink’s TIMESTAMP_WITH_TIME_ZONE and TIMESTAMP_WITH_LOCAL_TIME_ZONE are not supported by Hive
- Flink’s TIMESTAMP_WITHOUT_TIME_ZONE type cannot be mapped to Hive’s TIMESTAMP type due to the precision difference