DataX doriswriter

    The doriswriter plug-in uses Doris' Stream Load feature to synchronize and import data. It must be used together with the DataX service.
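
    To make the Stream Load mechanism concrete, here is a minimal sketch of the kind of HTTP request it involves. It is not the plug-in's own code. The helper name `build_stream_load_request`, the host, the database and table names, and the label prefix are all hypothetical. The sketch only builds the request and does not send it, because a real load also has to handle the redirect from the frontend to a backend node.

```python
import base64
import urllib.request
import uuid


def build_stream_load_request(fe_host, http_port, database, table,
                              user, password, payload, column_separator=","):
    """Build (but do not send) an HTTP PUT request for a Doris Stream Load.

    All argument values are supplied by the caller; nothing here is taken
    from the doriswriter source code.
    """
    url = f"http://{fe_host}:{http_port}/api/{database}/{table}/_stream_load"
    # Stream Load authenticates with HTTP Basic auth.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Expect": "100-continue",
        # A unique label lets Doris reject an accidental duplicate load.
        "label": f"datax_demo_{uuid.uuid4().hex}",  # hypothetical label prefix
        "column_separator": column_separator,
    }
    return urllib.request.Request(url, data=payload, headers=headers, method="PUT")


if __name__ == "__main__":
    # Hypothetical host, database and table; 8030 is the usual FE HTTP port.
    req = build_stream_load_request(
        "fe.example.com", 8030, "demo_db", "demo_tbl", "root", "", b"1,a\n2,b\n"
    )
    print(req.get_method(), req.full_url)
```

    The writer's job is essentially to batch the rows it receives from a DataX reader into payloads like the one above and submit them to Doris.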

    DataX is the open-source version of Alibaba Cloud DataWorks Data Integration, an offline data synchronization tool/platform widely used within Alibaba Group. DataX implements efficient data synchronization between heterogeneous data sources, including MySQL, Oracle, SQL Server, PostgreSQL, HDFS, Hive, ADS, HBase, TableStore (OTS), MaxCompute (ODPS), Hologres, and DRDS.

    More details can be found at:

    Usage

    The code of DataX doriswriter plug-in can be found here.

    This directory is the doriswriter plug-in development environment of Alibaba DataX.

    The doriswriter plug-in depends on several modules in the DataX code base, and these modules are not published to the official Maven repository. To develop and compile the plug-in, you therefore need to download the complete DataX code base.

    1. init-env.sh

      1. Clone the DataX code base to the local machine with Git.

      2. Soft-link the doriswriter/ directory to DataX/doriswriter.

      3. Add the doriswriter module to the original DataX/pom.xml.

      4. Change the httpclient version from 4.5 to 4.5.13 in DataX/core/pom.xml.

      After that, developers can work inside DataX/. Because of the soft link, changes made in DataX/doriswriter are reflected in the doriswriter/ directory, which makes it convenient to submit code.
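
    The four steps above can be sketched as follows. This is an illustration, not the contents of init-env.sh. The function names are hypothetical, the repository URL is assumed to be the public DataX repository on GitHub, and the `<module>doriswriter</module>` entry is an assumption about what step 3 adds. The clone and link steps are returned as a command plan instead of being executed, so the sketch is safe to run anywhere.

```python
import re

# Assumption: the public DataX repository on GitHub.
DATAX_REPO = "https://github.com/alibaba/DataX.git"


def plan_commands(workdir):
    """Steps 1 and 2: commands to clone DataX and soft-link the plug-in."""
    return [
        ["git", "clone", DATAX_REPO, f"{workdir}/DataX"],
        ["ln", "-s", f"{workdir}/doriswriter", f"{workdir}/DataX/doriswriter"],
    ]


def add_module(pom_text, module="doriswriter"):
    """Step 3: register the module in the parent pom (no-op if present)."""
    entry = f"<module>{module}</module>"
    if entry in pom_text:
        return pom_text
    # Insert the entry just before the first closing </modules> tag.
    return pom_text.replace("</modules>", f"{entry}\n</modules>", 1)


def bump_httpclient(pom_text, old="4.5", new="4.5.13"):
    """Step 4: change the httpclient version in DataX/core/pom.xml."""
    pattern = re.compile(
        r"(<artifactId>httpclient</artifactId>\s*<version>)"
        + re.escape(old)
        + r"(</version>)"
    )
    return pattern.sub(r"\g<1>" + new + r"\g<2>", pom_text)


if __name__ == "__main__":
    for cmd in plan_commands("/path/to/workdir"):  # hypothetical directory
        print(" ".join(cmd))
```

    Both pom edits are written to be idempotent, so running the setup twice leaves the files unchanged the second time.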

    1. Run init-env.sh

    2. Modify the doriswriter code if needed.

    3. Commit the doriswriter code if needed.

    1. Stream reads the data and imports it to Doris

    For instructions on using the doriswriter plug-in, please refer to .

    2. MySQL reads the data and imports it to Doris

    1. MySQL table structure

    2. Doris table structure

    4. Execute the DataX task, refer to the specific
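
    As a rough illustration of what a MySQL-to-Doris job might look like, the sketch below assembles a job description and the command that launches it. The parameter names (`loadUrl`, `selectedDatabase`, and so on) are assumptions based on commonly documented mysqlreader and doriswriter options and should be checked against the plug-in version in use. All hosts, credentials, table names, and column names are hypothetical.

```python
import json


def build_job(mysql_jdbc_url, mysql_auth, source_table,
              doris_jdbc_url, doris_fe_http, doris_auth,
              database, target_table, columns):
    """Return a DataX job dict that reads from MySQL and writes to Doris."""
    reader = {
        "name": "mysqlreader",
        "parameter": {
            "username": mysql_auth[0],
            "password": mysql_auth[1],
            "column": columns,
            "connection": [{"jdbcUrl": [mysql_jdbc_url], "table": [source_table]}],
        },
    }
    writer = {
        "name": "doriswriter",
        "parameter": {
            # Assumed option names; verify against the doriswriter docs.
            "loadUrl": [doris_fe_http],
            "username": doris_auth[0],
            "password": doris_auth[1],
            "column": columns,
            "connection": [{
                "jdbcUrl": doris_jdbc_url,
                "selectedDatabase": database,
                "table": [target_table],
            }],
        },
    }
    return {
        "job": {
            "setting": {"speed": {"channel": 1}},
            "content": [{"reader": reader, "writer": writer}],
        }
    }


def launch_command(job_path, datax_home="."):
    """DataX jobs are started through the bin/datax.py launcher."""
    return ["python", f"{datax_home}/bin/datax.py", job_path]


if __name__ == "__main__":
    job = build_job(
        "jdbc:mysql://mysql.example.com:3306/demo", ("reader", "secret"), "t_src",
        "jdbc:mysql://fe.example.com:9030/demo", "fe.example.com:8030",
        ("root", ""), "demo", "t_dst", ["id", "name"],
    )
    print(json.dumps(job, indent=2))
    print(" ".join(launch_command("mysql_to_doris.json")))
```

    Keeping the column list in one place and passing it to both reader and writer avoids the most common failure mode of such jobs, a mismatch between source and target column order.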