SQL metadata tables
Druid Brokers infer table and column metadata for each datasource from segments loaded in the cluster, and use this to plan SQL queries. This metadata is cached on Broker startup and also updated periodically in the background. Background metadata refreshing is triggered by segments entering and exiting the cluster, and can also be throttled through configuration.
Druid exposes system information through special system tables. There are two such schemas available: Information Schema and Sys Schema. Information schema provides details about table and column types. The “sys” schema provides information about Druid internals like segments/tasks/servers.
You can access table and column metadata through JDBC, or through the INFORMATION_SCHEMA tables described below. For example, to retrieve metadata for the Druid datasource “foo”, query INFORMATION_SCHEMA.COLUMNS and filter on the schema and table name.
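A minimal sketch of such a metadata lookup for the datasource “foo” follows. It assumes that INFORMATION_SCHEMA.COLUMNS exposes the same TABLE_SCHEMA and TABLE_NAME columns that the TABLES table documents; adjust the names if your Druid version differs.

```sql
-- Column metadata for the datasource "foo" in the druid schema.
-- Assumption: COLUMNS carries TABLE_SCHEMA and TABLE_NAME, as TABLES does.
SELECT *
FROM INFORMATION_SCHEMA.COLUMNS
WHERE "TABLE_SCHEMA" = 'druid' AND "TABLE_NAME" = 'foo'
```

Filtering on the schema as well as the table name avoids matching a lookup or system table that happens to share the name.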
SCHEMATA table
INFORMATION_SCHEMA.SCHEMATA provides a list of all known schemas, which include druid for standard Druid Table datasources, lookup for lookups, sys for the virtual System metadata tables, and INFORMATION_SCHEMA for these virtual tables. Tables are allowed to have the same name across different schemas, so the schema may be included in an SQL statement to distinguish them, e.g. lookup.table vs druid.table.
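The schema qualification described above might look like the following sketch. The SCHEMA_NAME column is an assumption based on the SQL standard, and “countries” is a hypothetical name used here for both a lookup and a datasource.

```sql
-- List the schemas known to the Broker.
-- Assumption: the column is called SCHEMA_NAME, as in the SQL standard.
SELECT "SCHEMA_NAME" FROM INFORMATION_SCHEMA.SCHEMATA;

-- "countries" is a hypothetical name that exists in two schemas;
-- the schema prefix selects which table is read.
SELECT COUNT(*) FROM lookup."countries";
SELECT COUNT(*) FROM druid."countries";
```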
TABLES table
INFORMATION_SCHEMA.TABLES provides a list of all known tables and schemas.
Column | Notes |
---|---|
TABLE_CATALOG | Always set to druid |
TABLE_SCHEMA | The ‘schema’ which the table falls under, see SCHEMATA table for details |
TABLE_NAME | Table name. For the druid schema, this is the dataSource. |
TABLE_TYPE | “TABLE” or “SYSTEM_TABLE” |
IS_JOINABLE | YES if the table can be joined directly when it appears on the right-hand side of a JOIN statement, without performing a subquery; otherwise NO. Lookups are always joinable because they are globally distributed among Druid query processing nodes. Druid datasources are not, and use a less efficient subquery join. |
IS_BROADCAST | YES if the table is ‘broadcast’ and distributed among all Druid query processing nodes, such as lookups and Druid datasources which have a ‘broadcast’ load rule; otherwise NO. |
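As an illustration of the columns above, the following sketch lists every table that can be joined directly. It uses only the column names and the YES/NO values documented in this table.

```sql
-- Tables that can sit on the right-hand side of a JOIN without a subquery.
SELECT "TABLE_SCHEMA", "TABLE_NAME", "TABLE_TYPE", "IS_JOINABLE", "IS_BROADCAST"
FROM INFORMATION_SCHEMA.TABLES
WHERE "IS_JOINABLE" = 'YES'
```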
COLUMNS table
INFORMATION_SCHEMA.COLUMNS provides a list of all known columns across all tables and schemas.
SELECT "ORDINAL_POSITION", "COLUMN_NAME", "IS_NULLABLE", "DATA_TYPE", "JDBC_TYPE"
FROM INFORMATION_SCHEMA.COLUMNS
SYSTEM SCHEMA
The “sys” schema provides visibility into Druid segments, servers and tasks.
SEGMENTS table
Segments table provides details on all Druid segments, whether they are published yet or not.
Column | Type | Notes |
---|---|---|
segment_id | STRING | Unique segment identifier |
datasource | STRING | Name of datasource |
start | STRING | Interval start time (in ISO 8601 format) |
end | STRING | Interval end time (in ISO 8601 format) |
size | LONG | Size of segment in bytes |
version | STRING | Version string (generally an ISO 8601 timestamp corresponding to when the segment set was first started). A higher version means a more recently created segment. Versions are compared as strings. |
partition_num | LONG | Partition number (an integer, unique within a datasource+interval+version; may not necessarily be contiguous) |
num_replicas | LONG | Number of replicas of this segment currently being served |
num_rows | LONG | Number of rows in the segment. This value can be null if it is unknown to the Broker at query time |
is_published | LONG | Boolean represented as long type where 1 = true, 0 = false. 1 means this segment has been published to the metadata store with used=1. See the Architecture page for more details. |
is_available | LONG | Boolean represented as long type where 1 = true, 0 = false. 1 if this segment is currently being served by any process (Historical or realtime). See the Architecture page for more details. |
is_realtime | LONG | Boolean represented as long type where 1 = true, 0 = false. 1 if this segment is only served by realtime tasks, and 0 if any Historical process is serving this segment. |
is_overshadowed | LONG | Boolean represented as long type where 1 = true, 0 = false. 1 if this segment is published and is fully overshadowed by some other published segments. Currently, is_overshadowed is always false for unpublished segments, although this may change in the future. You can filter for segments that “should be published” by filtering for is_published = 1 AND is_overshadowed = 0. Segments can briefly be both published and overshadowed if they were recently replaced, but have not been unpublished yet. See the Architecture page for more details. |
shard_spec | STRING | JSON-serialized form of the segment ShardSpec |
dimensions | STRING | JSON-serialized form of the segment dimensions |
metrics | STRING | JSON-serialized form of the segment metrics |
last_compaction_state | STRING | JSON-serialized form of the compaction task’s config (compaction task which created this segment). May be null if segment was not created by compaction task. |
For example, to retrieve all segments for datasource “wikipedia”, use the query:
SELECT * FROM sys.segments WHERE datasource = 'wikipedia'
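The “should be published” filter mentioned in the is_overshadowed row of the table can be sketched as follows. The selected columns are simply a convenient subset of those listed above.

```sql
-- Segments that are published and not overshadowed by newer segments.
-- "start" and "end" are quoted because END is a reserved word in SQL.
SELECT "segment_id", "datasource", "start", "end", "version"
FROM sys.segments
WHERE is_published = 1 AND is_overshadowed = 0
```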
Another example retrieves total_size, avg_size, avg_num_rows and num_segments per datasource by aggregating sys.segments and grouping on the datasource column.
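One way to express the per-datasource summary is sketched below. This is an illustrative reconstruction, not necessarily the exact query the documentation originally showed. It relies on the standard behavior that AVG skips null values, which matters because num_rows can be null.

```sql
-- Per-datasource segment summary, largest datasource first.
SELECT
  "datasource",
  SUM("size") AS total_size,
  AVG("size") AS avg_size,
  AVG("num_rows") AS avg_num_rows,  -- null num_rows values are skipped
  COUNT(*) AS num_segments
FROM sys.segments
GROUP BY 1
ORDER BY 2 DESC
```

Adding a WHERE clause on is_available or is_published narrows the summary to segments in a particular state.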
This query goes a step further and shows the overall profile of available, non-realtime segments across buckets of 1 million rows each for the foo datasource:
SELECT ABS("num_rows" / 1000000) as "bucket",
  COUNT(*) as segments,
  SUM("size") / 1048576 as totalSizeMiB,
  MIN("size") / 1048576 as minSizeMiB,
  AVG("size") / 1048576 as averageSizeMiB,
  MAX("size") / 1048576 as maxSizeMiB,
  SUM("num_rows") as totalRows,
  MIN("num_rows") as minRows,
  AVG("num_rows") as averageRows,
  MAX("num_rows") as maxRows,
  (AVG("size") / AVG("num_rows")) as avgRowSizeB
FROM sys.segments
WHERE is_available = 1 AND is_realtime = 0 AND "datasource" = 'foo'
GROUP BY 1
ORDER BY 1
To retrieve the segments that were created by compaction (any compaction), use the query:
SELECT * FROM sys.segments WHERE last_compaction_state is not null
Caveat: A segment can be served by more than one stream ingestion task or Historical process, in which case it has multiple replicas. While a segment is served by multiple ingestion tasks, these replicas are only weakly consistent with each other; once the segment is served by a Historical, it is immutable. The Broker prefers to query a segment from a Historical over an ingestion task. However, if a segment has multiple realtime replicas (for example, Kafka index tasks) and one task is slower than another, sys.segments query results can vary for the duration of the tasks, because the Broker queries only one of the ingestion tasks and is not guaranteed to pick the same task every time. The num_rows column of the segments table can have inconsistent values during this period. There is an open issue about this inconsistency with stream ingestion tasks.
SERVERS table
The servers table lists all discovered servers in the cluster.
To retrieve information about all servers, use the query:
SELECT * FROM sys.servers;
SERVER_SEGMENTS table
SERVER_SEGMENTS is used to join the servers table with the segments table.
Column | Type | Notes |
---|---|---|
server | STRING | Server name in format host:port (Primary key of servers table) |
segment_id | STRING | Segment identifier (Primary key of segments table) |
A JOIN between “servers” and “segments” through server_segments can be used to query the number of segments for a specific datasource, grouped by server. Example query:
SELECT servers.server, count(segments.segment_id) as num_segments
FROM sys.segments as segments
INNER JOIN sys.server_segments as server_segments
  ON segments.segment_id = server_segments.segment_id
INNER JOIN sys.servers as servers
  ON servers.server = server_segments.server
WHERE segments.datasource = 'wikipedia'
GROUP BY servers.server;
TASKS table
The tasks table provides information about active and recently completed indexing tasks.
For example, you can retrieve task information filtered by status by querying this table with a WHERE clause on the task status.
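A minimal sketch of a status filter follows. Because the columns of the tasks table are not listed in this section, the table name sys.tasks, the status column, and the 'FAILED' value are assumptions; check them against your Druid version.

```sql
-- Tasks that ended in failure.
-- Assumptions: the table is sys.tasks, with a "status" column
-- that takes the value 'FAILED' for failed tasks.
SELECT * FROM sys.tasks WHERE "status" = 'FAILED'
```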
SUPERVISORS table
Column | Type | Notes |
---|---|---|
supervisor_id | STRING | Supervisor task identifier |
state | STRING | Basic state of the supervisor. Available states: UNHEALTHY_SUPERVISOR, UNHEALTHY_TASKS, PENDING, RUNNING, SUSPENDED, STOPPING. |
detailed_state | STRING | Supervisor-specific state. See the documentation of the specific supervisor for details, e.g. Kafka or Kinesis. |
healthy | LONG | Boolean represented as long type where 1 = true, 0 = false. 1 indicates a healthy supervisor |
type | STRING | Type of supervisor, e.g. kafka, kinesis or materialized_view |
source | STRING | Source of the supervisor, e.g. Kafka topic or Kinesis stream |
suspended | LONG | Boolean represented as long type where 1 = true, 0 = false. 1 indicates supervisor is in suspended state |
spec | STRING | JSON-serialized supervisor spec |
For example, to retrieve supervisor information filtered by health status, filter on the healthy column (healthy = 0 returns unhealthy supervisors).
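A sketch of the health filter follows, using only columns from the table above. The table name sys.supervisors is assumed from the section heading and the “sys” schema naming used elsewhere on this page.

```sql
-- Supervisors currently reported as unhealthy.
-- "type", "source" and "state" are quoted to avoid keyword clashes.
SELECT "supervisor_id", "type", "source", "state", "detailed_state"
FROM sys.supervisors
WHERE healthy = 0
```

Swapping the condition for suspended = 1 lists suspended supervisors instead.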