Apache Avro
- Avro OCF input format for native batch ingestion.
The Avro Stream Parser is deprecated.
To use the Avro extension, add it to the list of loaded extensions. See Loading extensions for more information.
Avro types
Druid supports most Avro types natively. This section describes some exceptions.
The default mode treats unions as a single value regardless of the type of data populating the union.
If you want to operate on individual members of a union, set extractUnionsByType on the Avro parser. This configuration expands union values into nested objects according to the following rules:
- Primitive types and unnamed complex types are keyed by their type name, such as int and string.
- The Avro null type is elided as its value can only ever be null.
This is safe because an Avro union can contain only a single member of each unnamed type, and duplicates of the same named type are not allowed. For example, a union can contain only a single array, but it can contain multiple records (or other named types) as long as each has a unique name.
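The keying rules above can be illustrated with a minimal Python sketch. This is not Druid's implementation: the `expand_union` helper, the Python-to-Avro type mapping, and the choice to return an empty object for null are all assumptions made for illustration. The `named_type` argument further assumes that named types such as records are keyed by their own name.

```python
from typing import Any, Optional

# Hypothetical mapping from Python value types to Avro unnamed type names.
# Simplification: every Python int is treated as Avro "int" (never "long")
# and every float as "double".
_UNNAMED_TYPE_NAMES = [
    (bool, "boolean"),  # must precede int because bool is a subclass of int
    (int, "int"),
    (float, "double"),
    (str, "string"),
    (bytes, "bytes"),
    (list, "array"),
    (dict, "map"),
]


def expand_union(value: Any, named_type: Optional[str] = None) -> dict:
    """Wrap a union value in a single-key object keyed by its member type."""
    if value is None:
        # The null member is elided: it contributes no key (assumed shape).
        return {}
    if named_type is not None:
        # Assumption: a named type is keyed by its own unique name.
        return {named_type: value}
    for py_type, avro_name in _UNNAMED_TYPE_NAMES:
        if isinstance(value, py_type):
            return {avro_name: value}
    raise TypeError(f"unsupported union member: {type(value).__name__}")


print(expand_union(5))
print(expand_union("five"))
print(expand_union({"x": 1}, named_type="MyRecord"))
```

Because each unnamed type can appear only once in a union and named types must have unique names, no two members can ever map to the same key.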
The extension returns bytes and fixed Avro types as base64-encoded strings by default. To decode these types as UTF-8 strings instead, enable the corresponding option on the Avro parser.
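To make the difference between the two representations concrete, the following Python sketch encodes the same raw bytes both ways. It only illustrates the two string forms; the sample payload is made up, and the sketch does not call Druid or any Avro library.

```python
import base64

# Hypothetical payload of an Avro bytes or fixed field.
raw = b"druid"

# Default behavior described above: the value surfaces as a base64 string.
as_base64 = base64.b64encode(raw).decode("ascii")

# With the UTF-8 option enabled: the bytes are decoded as text instead.
as_utf8 = raw.decode("utf-8")

print(as_base64)  # ZHJ1aWQ=
print(as_utf8)    # druid
```

Decoding as UTF-8 is only meaningful when the field actually holds text; arbitrary binary data may not be valid UTF-8.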
The extension returns enum types as the string value of the enum symbol.
You can ingest record and map types representing nested data with a flattenSpec on the parser.
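As a rough illustration of what flattening does, the sketch below builds a flattenSpec-shaped object as a Python dictionary and applies it to a nested row. The row, the field names, and the `resolve` helper are hypothetical; `resolve` handles only simple dotted paths of the form `$.a.b` and stands in for real JSONPath evaluation.

```python
import json

# Hypothetical nested row, as it might look after Avro decoding:
# "user" plays the role of a record and "tags" the role of a map.
row = {"user": {"id": 7, "tags": {"tier": "gold"}}}

# A flattenSpec-shaped object; the field names and paths are made up.
flatten_spec = {
    "useFieldDiscovery": True,
    "fields": [
        {"type": "path", "name": "user_id", "expr": "$.user.id"},
        {"type": "path", "name": "user_tier", "expr": "$.user.tags.tier"},
    ],
}


def resolve(expr, data):
    """Resolve a simple '$.a.b' path against nested dictionaries."""
    if not expr.startswith("$."):
        raise ValueError(f"unsupported path: {expr}")
    current = data
    for key in expr[2:].split("."):
        current = current[key]
    return current


# Each configured field becomes one top-level column in the flattened row.
flat = {f["name"]: resolve(f["expr"], row) for f in flatten_spec["fields"]}
print(json.dumps(flat))  # {"user_id": 7, "user_tier": "gold"}
```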