Query granularities

    The granularity field determines how data is bucketed across the time dimension, that is, how it is aggregated by hour, day, minute, etc.

    It can be specified either as a string for simple granularities or as an object for arbitrary granularities.

    Simple granularities are specified as a string and bucket timestamps by their UTC time (e.g., days start at 00:00 UTC).

    Supported granularity strings are: all, none, second, minute, fifteen_minute, thirty_minute, hour, day, week, month, quarter and year.

    • all buckets everything into a single bucket
    • none does not bucket data. It uses the granularity of the index; the minimum is none, which means millisecond granularity. Using none in a TimeseriesQuery is currently not recommended, because the system will try to generate 0 values for every millisecond that didn’t exist, which is often a lot.
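
    To make the UTC bucketing concrete, here is a minimal Python sketch of how a timestamp could be mapped to the start of its hour or day bucket. The helper name truncate_utc is hypothetical and only illustrates the idea; it is not Druid's implementation and covers just two of the supported strings.

```python
from datetime import datetime, timezone


def truncate_utc(ts: datetime, granularity: str) -> datetime:
    """Return the start of the UTC bucket containing ts (illustrative helper)."""
    ts = ts.astimezone(timezone.utc)
    if granularity == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if granularity == "day":
        # Days start at 00:00 UTC for simple granularities.
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"granularity not covered by this sketch: {granularity}")


sample = datetime(2013, 9, 2, 23, 32, 45, tzinfo=timezone.utc)
print(truncate_utc(sample, "hour").isoformat())  # 2013-09-02T23:00:00+00:00
print(truncate_utc(sample, "day").isoformat())   # 2013-09-02T00:00:00+00:00
```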

    Example:

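    Suppose you have four rows stored in Apache Druid with millisecond ingestion granularity, with timestamps 2013-08-31T01:02:33Z, 2013-09-01T01:02:33Z, 2013-09-02T23:32:45Z and 2013-09-03T03:32:45Z, each with language en.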

    After submitting a groupBy query with hour granularity,

    {
      "queryType":"groupBy",
      "dataSource":"my_dataSource",
      "granularity":"hour",
      "dimensions":[
        "language"
      ],
      "aggregations":[
        {
          "type":"count",
          "name":"count"
        }
      ],
      "intervals":[
        "2000-01-01T00:00Z/3000-01-01T00:00Z"
      ]
    }

    you will get

    [ {
      "version" : "v1",
      "timestamp" : "2013-08-31T01:00:00.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-01T01:00:00.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-02T23:00:00.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-03T03:00:00.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    } ]

    Note that all the empty buckets are discarded.

    If you change the granularity to day, you will get

    [ {
      "version" : "v1",
      "timestamp" : "2013-08-31T00:00:00.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-01T00:00:00.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-02T00:00:00.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-03T00:00:00.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    } ]

    If you change the granularity to none, you will get the same results as setting it to the ingestion granularity.

    [ {
      "version" : "v1",
      "timestamp" : "2013-08-31T01:02:33.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-01T01:02:33.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-02T23:32:45.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-03T03:32:45.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    } ]

    If you change the granularity to all, you will get everything aggregated in 1 bucket,

    [ {
      "version" : "v1",
      "timestamp" : "2000-01-01T00:00:00.000Z",
      "event" : {
        "count" : 4,
        "language" : "en"
      }
    } ]

    Duration granularities are specified as an exact duration in milliseconds, and timestamps are returned as UTC.

    They also support specifying an optional origin, which defines where to start counting time buckets from (defaults to 1970-01-01T00:00:00Z).

    {"type": "duration", "duration": 7200000}

    This chunks up every 2 hours.

    {"type": "duration", "duration": 3600000, "origin": "2012-01-01T00:30:00Z"}

    This chunks up every hour on the half-hour.
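
    The effect of the origin can be sketched with plain floor arithmetic: count whole durations elapsed since the origin, then step forward from the origin by that many durations. The Python below is a minimal illustration under that assumption; duration_bucket and EPOCH are hypothetical names, not part of Druid.

```python
from datetime import datetime, timedelta, timezone

# Default origin described above: 1970-01-01T00:00:00Z.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def duration_bucket(ts: datetime, duration_ms: int, origin: datetime = EPOCH) -> datetime:
    """Return the start of the fixed-length bucket containing ts (illustrative helper)."""
    step = timedelta(milliseconds=duration_ms)
    # Floor division of two timedeltas yields the number of whole steps since origin.
    whole_steps = (ts - origin) // step
    return origin + whole_steps * step


ts = datetime(2013, 8, 31, 1, 2, 33, tzinfo=timezone.utc)
half_past = datetime(2012, 1, 1, 0, 30, tzinfo=timezone.utc)

print(duration_bucket(ts, 3_600_000).isoformat())             # 2013-08-31T01:00:00+00:00
print(duration_bucket(ts, 3_600_000, half_past).isoformat())  # 2013-08-31T00:30:00+00:00
```

    With the half-hour origin, 01:02:33 falls into the bucket that runs from 00:30 to 01:30, so the bucket start moves back to 00:30.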

    Example:

    Reusing the data in the previous example, after submitting a groupBy query with 24 hours duration,

    {
      "queryType":"groupBy",
      "dataSource":"my_dataSource",
      "granularity":{"type": "duration", "duration": "86400000"},
      "dimensions":[
        "language"
      ],
      "aggregations":[
        {
          "type":"count",
          "name":"count"
        }
      ],
      "intervals":[
        "2000-01-01T00:00Z/3000-01-01T00:00Z"
      ]
    }

    you will get

    [ {
      "version" : "v1",
      "timestamp" : "2013-08-31T00:00:00.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-01T00:00:00.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-02T00:00:00.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-03T00:00:00.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    } ]

    If you set the origin for the granularity to 2012-01-01T00:30:00Z,

    "granularity":{"type": "duration", "duration": "86400000", "origin":"2012-01-01T00:30:00Z"}

    you will get

    [ {
      "version" : "v1",
      "timestamp" : "2013-08-31T00:30:00.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-01T00:30:00.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-02T00:30:00.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-03T00:30:00.000Z",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    } ]

    Note that the timestamp for each bucket starts at the 30th minute.

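    Period granularities are specified as an ISO 8601 period (e.g., P1D or P3M), together with a time zone and an origin. Time zone is optional (defaults to UTC). Origin is optional (defaults to 1970-01-01T00:00:00 in the given time zone).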

    {"type": "period", "period": "P2D", "timeZone": "America/Los_Angeles"}

    This will bucket by two-day chunks in the Pacific timezone.

    {"type": "period", "period": "P3M", "timeZone": "America/Los_Angeles", "origin": "2012-02-01T00:00:00-08:00"}

    This will bucket by 3-month chunks in the Pacific timezone where the three-month quarters are defined as starting from February.

    Example:

    Reusing the data in the previous example, if you submit a groupBy query with 1 day period in Pacific timezone,

    {
      "queryType":"groupBy",
      "dataSource":"my_dataSource",
      "granularity":{"type": "period", "period": "P1D", "timeZone": "America/Los_Angeles"},
      "dimensions":[
        "language"
      ],
      "aggregations":[
        {
          "type":"count",
          "name":"count"
        }
      ],
      "intervals":[
        "1999-12-31T16:00:00.000-08:00/2999-12-31T16:00:00.000-08:00"
      ]
    }

    you will get

    [ {
      "version" : "v1",
      "timestamp" : "2013-08-30T00:00:00.000-07:00",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-08-31T00:00:00.000-07:00",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-02T00:00:00.000-07:00",
      "event" : {
        "count" : 2,
        "language" : "en"
      }
    } ]

    Note that the timestamp for each bucket has been converted to Pacific time. The rows {"timestamp": "2013-09-02T23:32:45Z", "page": "CCC", "language" : "en"} and {"timestamp": "2013-09-03T03:32:45Z", "page": "DDD", "language" : "en"} are put in the same bucket because they fall on the same day in Pacific time.

    Also note that the intervals in a groupBy query are not converted to the specified timezone. The timezone given in the granularity is applied only to the query results.
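
    The grouping above can be reproduced with a small Python sketch that converts each UTC timestamp to local time and truncates it to local midnight. As a simplifying assumption, the code uses a fixed UTC-7 offset (Pacific Daylight Time), which holds for the sample dates only; a real time zone such as America/Los_Angeles also needs daylight-saving rules. The names PDT and local_day_bucket are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC-7 offset, valid for the late-summer sample dates only.
PDT = timezone(timedelta(hours=-7))


def local_day_bucket(ts: datetime, tz: timezone = PDT) -> datetime:
    """Return local midnight of the day containing ts (illustrative helper)."""
    local = ts.astimezone(tz)
    return local.replace(hour=0, minute=0, second=0, microsecond=0)


rows = [
    datetime(2013, 8, 31, 1, 2, 33, tzinfo=timezone.utc),
    datetime(2013, 9, 1, 1, 2, 33, tzinfo=timezone.utc),
    datetime(2013, 9, 2, 23, 32, 45, tzinfo=timezone.utc),
    datetime(2013, 9, 3, 3, 32, 45, tzinfo=timezone.utc),
]

# Count rows per local-day bucket, keyed by the bucket's ISO timestamp.
counts = Counter(local_day_bucket(ts).isoformat() for ts in rows)
for bucket, count in sorted(counts.items()):
    print(bucket, count)
# 2013-08-30T00:00:00-07:00 1
# 2013-08-31T00:00:00-07:00 1
# 2013-09-02T00:00:00-07:00 2
```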

    If you set the origin for the granularity to 1970-01-01T20:30:00-08:00,

    "granularity":{"type": "period", "period": "P1D", "timeZone": "America/Los_Angeles", "origin": "1970-01-01T20:30:00-08:00"}

    you will get

    [ {
      "version" : "v1",
      "timestamp" : "2013-08-29T20:30:00.000-07:00",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-08-30T20:30:00.000-07:00",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-01T20:30:00.000-07:00",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    }, {
      "version" : "v1",
      "timestamp" : "2013-09-02T20:30:00.000-07:00",
      "event" : {
        "count" : 1,
        "language" : "en"
      }
    } ]

    Note that the origin you specified has nothing to do with the timezone; it only serves as a starting point for locating the very first granularity bucket. In this case, the rows {"timestamp": "2013-09-02T23:32:45Z", "page": "CCC", "language" : "en"} and {"timestamp": "2013-09-03T03:32:45Z", "page": "DDD", "language" : "en"} are not in the same bucket.
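
    One way to read this result is that each one-day bucket now starts at 20:30 local wall-clock time instead of local midnight. The Python sketch below follows that reading for the P1D case only. It again assumes a fixed UTC-7 offset for the sample dates, and the names PDT and day_bucket_with_origin are hypothetical rather than anything Druid exposes.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: fixed UTC-7 offset, valid for the late-summer sample dates only.
PDT = timezone(timedelta(hours=-7))


def day_bucket_with_origin(ts: datetime, origin_time: time = time(20, 30),
                           tz: timezone = PDT) -> datetime:
    """Return the start of the one-day bucket that begins at origin_time local time."""
    local = ts.astimezone(tz)
    start = datetime.combine(local.date(), origin_time, tzinfo=tz)
    if local < start:
        # Before today's boundary, so the bucket began on the previous day.
        start -= timedelta(days=1)
    return start


ccc = datetime(2013, 9, 2, 23, 32, 45, tzinfo=timezone.utc)
ddd = datetime(2013, 9, 3, 3, 32, 45, tzinfo=timezone.utc)

print(day_bucket_with_origin(ccc).isoformat())  # 2013-09-01T20:30:00-07:00
print(day_bucket_with_origin(ddd).isoformat())  # 2013-09-02T20:30:00-07:00
```

    The two rows are 16:32:45 and 20:32:45 on the same Pacific day, so the 20:30 boundary falls between them and they land in different buckets.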

    Supported Time Zones