ORC Extension

    • In inputSpec of ioConfig, inputFormat must be changed from "org.apache.hadoop.hive.ql.io.orc.OrcNewInputFormat" to "org.apache.orc.mapreduce.OrcInputFormat"
    • The ‘contrib’ extension supported a typeString property that provided the schema of the ORC file. Essentially only the types in typeString had to be correct; the column names did not, which made column renaming possible. In the ‘core’ extension, column renaming can be achieved with flattenSpec instead. For example, if the actual schema of the file is struct<_col0:string,_col1:string>, preserving the Druid schema would require replacing typeString with a flattenSpec such as:
    "flattenSpec": {
      "fields": [
        {
          "type": "path",
          "expr": "$.nestedData.dim1"
        }
      ]
    }
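
As a fuller sketch of the renaming described above, each column of struct<_col0:string,_col1:string> could be mapped to a Druid column with its own path field. The output names "time" and "name" below are hypothetical placeholders, since the text does not state the original Druid column names; substitute whatever names the old typeString declared.

```json
"flattenSpec": {
  "fields": [
    {
      "type": "path",
      "name": "time",
      "expr": "$._col0"
    },
    {
      "type": "path",
      "name": "name",
      "expr": "$._col1"
    }
  ]
}
```

Each entry reads one ORC column through its JSONPath expression in "expr" and exposes it under "name", which plays the role the column names in typeString used to play.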