DataSketches HLL Sketch module
To use this aggregator, make sure you include the druid-datasketches extension in your config file (the druid.extensions.loadList property).
HLLSketchBuild Aggregator
{
  "type" : "HLLSketchBuild",
  "name" : <output name>,
  "fieldName" : <metric name>,
  "lgK" : <size and accuracy parameter>,
  "tgtHllType" : <target HLL type>,
  "round": <false | true>
}
It is very common to use HLLSketchBuild in combination with rollup to create a metric on high-cardinality columns. For example, a metric called userid_hll can be included in the metricsSpec. This builds an HLL sketch on the userid field at ingestion time, which allows fast approximate COUNT DISTINCT queries and improves roll-up ratios when userid is then left out of the dimensionsSpec.
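A minimal sketch of the relevant part of an ingestion spec is shown here, assuming that dimensionsSpec and metricsSpec sit side by side in the dataSchema. The country dimension and the lgK and tgtHllType values are illustrative assumptions, not recommendations; the rest of the spec is omitted.

```json
{
  "dimensionsSpec": {
    "dimensions": ["country"]
  },
  "metricsSpec": [
    {
      "type": "HLLSketchBuild",
      "name": "userid_hll",
      "fieldName": "userid",
      "lgK": 12,
      "tgtHllType": "HLL_4"
    }
  ]
}
```

Because userid is not listed as a dimension, rows that differ only in userid can roll up into one row, while userid_hll keeps enough information to estimate the number of distinct users.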
HLLSketchMerge Aggregator
{
  "type" : "HLLSketchMerge",
  "name" : <output name>,
  "fieldName" : <metric name>,
  "lgK" : <size and accuracy parameter>,
  "tgtHllType" : <target HLL type>,
  "round": <false | true>
}
Post Aggregators
Estimate
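As a minimal sketch, assuming the post aggregator type is HLLSketchEstimate and that it takes the same name and field properties as the other HLL post aggregators in this section, a distinct count estimate could be requested as follows. The names unique_users_estimate and userid_hll are placeholders, and the inner fieldAccess post aggregator is assumed to refer to an HLL sketch aggregator defined in the same query.

```json
{
  "type": "HLLSketchEstimate",
  "name": "unique_users_estimate",
  "field": {
    "type": "fieldAccess",
    "fieldName": "userid_hll"
  }
}
```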
Estimate with bounds
Returns a distinct count estimate and error bounds from an HLL sketch. The result will be an array containing three double values: estimate, lower bound and upper bound. The bounds are provided at a given number of standard deviations (optional, defaults to 1). This must be an integer value of 1, 2 or 3 corresponding to approximately 68.3%, 95.4% and 99.7% confidence intervals.
{
"type" : "HLLSketchEstimateWithBounds",
"name": <output name>,
"field" : <post aggregator that returns an HLL Sketch>,
"numStdDev" : <number of standard deviations: 1 (default), 2 or 3>
}
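For illustration, the template above could be filled in as follows to request bounds at two standard deviations (approximately a 95.4% confidence interval). The names unique_users_bounds and userid_hll are placeholders, and the fieldAccess post aggregator is assumed to refer to an HLL sketch aggregator defined in the same query.

```json
{
  "type": "HLLSketchEstimateWithBounds",
  "name": "unique_users_bounds",
  "field": {
    "type": "fieldAccess",
    "fieldName": "userid_hll"
  },
  "numStdDev": 2
}
```

The result for unique_users_bounds would then be an array of three doubles in the order estimate, lower bound, upper bound.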
Union
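As a minimal sketch, assuming the post aggregator type is HLLSketchUnion and that it accepts a fields array of post aggregators returning HLL sketches along with the lgK and tgtHllType parameters used by the aggregators above, two sketches could be combined as follows. The sketch names and parameter values are placeholders.

```json
{
  "type": "HLLSketchUnion",
  "name": "all_users_hll",
  "fields": [
    { "type": "fieldAccess", "fieldName": "web_userid_hll" },
    { "type": "fieldAccess", "fieldName": "mobile_userid_hll" }
  ],
  "lgK": 12,
  "tgtHllType": "HLL_4"
}
```

The output is itself an HLL sketch, so it would typically be wrapped in an estimate post aggregator to obtain a number.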
Sketch to string
{
  "type" : "HLLSketchToString",
  "name": <output name>,
  "field" : <post aggregator that returns an HLL Sketch>
}