DataSketches HLL Sketch module

    To use this aggregator, make sure you include the extension in your config file:

    HLLSketchBuild Aggregator

    {
      "type" : "HLLSketchBuild",
      "name" : <output name>,
      "fieldName" : <metric name>,
      "lgK" : <size and accuracy parameter>,
      "tgtHllType" : <target HLL type>,
      "round": <false | true>
    }

    It is very common to use HLLSketchBuild in combination with rollup to create a metric on high-cardinality columns. For example, a metric called userid_hll can be included in the metricsSpec. This builds an HLL sketch on the userid field at ingestion time, which allows fast approximate COUNT DISTINCT queries and improves rollup ratios when userid is then left out of the dimensionsSpec.
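
    As a minimal sketch of the kind of metricsSpec described above, the fragment below defines userid_hll over the userid field. The "type" and "fieldName" keys are assumed to follow the same pattern as the other aggregators in this module, the lgK and tgtHllType values are illustrative assumptions rather than recommendations, and the outer braces are added only so the fragment is valid JSON on its own.

    ```json
    {
      "metricsSpec": [
        {
          "type": "HLLSketchBuild",
          "name": "userid_hll",
          "fieldName": "userid",
          "lgK": 12,
          "tgtHllType": "HLL_4"
        }
      ]
    }
    ```

    With a spec like this, userid itself would be omitted from the dimensionsSpec so that rows differing only in userid can still roll up together.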

    HLLSketchMerge Aggregator

    {
      "type" : "HLLSketchMerge",
      "name" : <output name>,
      "fieldName" : <metric name>,
      "lgK" : <size and accuracy parameter>,
      "tgtHllType" : <target HLL type>,
      "round": <false | true>
    }
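
    As an illustration, a query-time aggregator that merges previously built sketches stored in a userid_hll metric might look like the following. The output name unique_users is hypothetical, and the lgK and tgtHllType values are assumptions chosen to match a sketch built with the same settings.

    ```json
    {
      "type": "HLLSketchMerge",
      "name": "unique_users",
      "fieldName": "userid_hll",
      "lgK": 12,
      "tgtHllType": "HLL_4"
    }
    ```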

    Post Aggregators

    Estimate

    Estimate with bounds

    Returns a distinct count estimate and error bounds from an HLL sketch. The result is an array of three double values: estimate, lower bound, and upper bound. The bounds are computed at a given number of standard deviations, set by the optional numStdDev parameter (default 1). The value must be an integer 1, 2, or 3, corresponding to approximately 68.3%, 95.4%, and 99.7% confidence intervals, respectively.

    {
      "type" : "HLLSketchEstimateWithBounds",
      "name": <output name>,
      "field" : <post aggregator that returns an HLL Sketch>,
      "numStdDev" : <number of standard deviations: 1 (default), 2 or 3>
    }
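
    A filled-in version of this post aggregator could look like the sketch below. It assumes the query already defines an HLL sketch aggregator named unique_users (a hypothetical name) and that Druid's standard fieldAccess post aggregator is used to hand that sketch to "field"; the output name unique_users_bounds is likewise hypothetical.

    ```json
    {
      "type": "HLLSketchEstimateWithBounds",
      "name": "unique_users_bounds",
      "field": {
        "type": "fieldAccess",
        "fieldName": "unique_users"
      },
      "numStdDev": 2
    }
    ```

    With numStdDev set to 2, the lower and upper bounds in the returned array correspond to an approximately 95.4% confidence interval around the estimate.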

    Union

    Sketch to string

    {
      "type" : "HLLSketchToString",
      "name": <output name>,
      "field" : <post aggregator that returns an HLL Sketch>