Window and aggregate data with Flux

If you’re just getting started with Flux queries, check out the following:

  • Get started with Flux for a conceptual overview of Flux and the parts of a Flux query.
  • Execute queries to discover a variety of ways to run your queries.

The following example is an in-depth walk-through of the steps required to window and aggregate data. The aggregateWindow() function performs these operations for you, but understanding how data is shaped in the process helps you successfully create your desired output.

For the purposes of this guide, define a variable that represents your base data set. The following example queries the memory usage of the host machine.

This example drops the host column from the returned data since the memory data is only tracked for a single host and dropping it simplifies the output tables. Dropping the host column is optional and is not recommended if you monitor memory on multiple hosts.
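For example, the base data set might be defined as follows. The bucket name ("example-bucket") is a placeholder for illustration; the measurement (mem), field (used_percent), and dropped host column come from the example data.

```js
dataSet = from(bucket: "example-bucket")
  |> range(start: 2018-11-03T17:50:00Z, stop: 2018-11-03T17:55:00Z)
  |> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
  |> drop(columns: ["host"])
```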

dataSet can now be used to represent your base data, which looks similar to the following:

  Table: keys: [_start, _stop, _field, _measurement]
  _start:time _stop:time _field:string _measurement:string _time:time _value:float
  ------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:00.000000000Z 71.11611366271973
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:10.000000000Z 67.39630699157715
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:20.000000000Z 64.16666507720947
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:30.000000000Z 64.19951915740967
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:40.000000000Z 64.2122745513916
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:50.000000000Z 64.22209739685059
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:00.000000000Z 64.6336555480957
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:10.000000000Z 64.16516304016113
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:20.000000000Z 64.18349742889404
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:30.000000000Z 64.20474052429199
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:40.000000000Z 68.65062713623047
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:50.000000000Z 67.20139980316162
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:00.000000000Z 70.9143877029419
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:10.000000000Z 64.14549350738525
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:20.000000000Z 64.15379047393799
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:30.000000000Z 64.1592264175415
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:40.000000000Z 64.18190002441406
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:50.000000000Z 64.28837776184082
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:00.000000000Z 64.29731845855713
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:10.000000000Z 64.36963081359863
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:20.000000000Z 64.37397003173828
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:30.000000000Z 64.44413661956787
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:40.000000000Z 64.42906856536865
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:50.000000000Z 64.44573402404785
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:00.000000000Z 64.48912620544434
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:10.000000000Z 64.49522972106934
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:30.000000000Z 64.49949741363525
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:40.000000000Z 64.4949197769165
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:50.000000000Z 64.49787616729736
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49816226959229

Use the window() function to group your data based on time bounds. The most common parameter passed to window() is every, which defines the duration of time between windows. Other parameters are available, but for this example, window the base data set into one-minute windows.

  dataSet
    |> window(every: 1m)

The every parameter supports all valid duration units, including calendar months (1mo) and years (1y).

window() output tables

When visualized in the InfluxDB UI, each window table is displayed in a different color.

Aggregate functions take the values of all rows in a table and use them to perform an aggregate operation. The result is output as a new value in a single-row table.

Since windowed data is split into separate tables, aggregate operations run against each table separately and output new tables containing only the aggregated value.

For this example, use the mean() function to output the average of each window:

  dataSet
    |> window(every: 1m)
    |> mean()
mean() output tables
  Table: keys: [_start, _stop, _field, _measurement]
  _start:time _stop:time _field:string _measurement:string _value:float
  ------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 65.88549613952637
  Table: keys: [_start, _stop, _field, _measurement]
  _start:time _stop:time _field:string _measurement:string _value:float
  ------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
  2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 65.50651391347249
  Table: keys: [_start, _stop, _field, _measurement]
  _start:time _stop:time _field:string _measurement:string _value:float
  ------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
  2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 65.30719598134358
  Table: keys: [_start, _stop, _field, _measurement]
  _start:time _stop:time _field:string _measurement:string _value:float
  ------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
  2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 64.39330975214641
  Table: keys: [_start, _stop, _field, _measurement]
  _start:time _stop:time _field:string _measurement:string _value:float
  ------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
  2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 64.49386278788249
  Table: keys: [_start, _stop, _field, _measurement]
  _start:time _stop:time _field:string _measurement:string _value:float
  ------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
  2018-11-03T17:55:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 64.49816226959229

Because each data point is contained in its own table, when visualized, data points appear as single, unconnected points.

Aggregated windowed data

Also notice the _start and _stop columns still exist. These represent the lower and upper bounds of the time window.

Many Flux functions rely on the _time column. To further process your data after an aggregate function, you need to re-add _time. Use the duplicate() function to duplicate either the _start or _stop column as a new column.
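Continuing the example, duplicate the _stop column as a new _time column after the aggregation:

```js
dataSet
  |> window(every: 1m)
  |> mean()
  |> duplicate(column: "_stop", as: "_time")
```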

duplicate() output tables
  Table: keys: [_start, _stop, _field, _measurement]
  _start:time _stop:time _field:string _measurement:string _time:time _value:float
  ------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
  2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:51:00.000000000Z 65.88549613952637
  Table: keys: [_start, _stop, _field, _measurement]
  _start:time _stop:time _field:string _measurement:string _time:time _value:float
  ------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
  2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:52:00.000000000Z 65.50651391347249
  Table: keys: [_start, _stop, _field, _measurement]
  _start:time _stop:time _field:string _measurement:string _time:time _value:float
  ------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
  2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:53:00.000000000Z 65.30719598134358
  Table: keys: [_start, _stop, _field, _measurement]
  _start:time _stop:time _field:string _measurement:string _time:time _value:float
  ------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
  2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:54:00.000000000Z 64.39330975214641
  Table: keys: [_start, _stop, _field, _measurement]
  _start:time _stop:time _field:string _measurement:string _time:time _value:float
  ------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
  2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49386278788249
  Table: keys: [_start, _stop, _field, _measurement]
  _start:time _stop:time _field:string _measurement:string _time:time _value:float
  ------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
  2018-11-03T17:55:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49816226959229

Keeping aggregate values in separate tables generally isn’t the format in which you want your data. Use the window() function to “unwindow” your data into a single infinite (inf) window.

  dataSet
    |> window(every: 1m)
    |> mean()
    |> duplicate(column: "_stop", as: "_time")
    |> window(every: inf)

Windowing requires a _time column, which is why it's necessary to recreate the _time column after an aggregation.

Unwindowed output table

With the aggregate values in a single table, data points in the visualization are connected.

You have now created a Flux query that windows and aggregates data. The data transformation process outlined in this guide should be used for all aggregation operations.

The following Flux query, which uses the aggregateWindow() function, returns the same results:

aggregateWindow function

  dataSet
    |> aggregateWindow(every: 1m, fn: mean)