DataSketches Tuple Sketch module

    To use this aggregator, make sure you include the extension in your config file:

    {
      "type" : "arrayOfDoublesSketch",
      "name" : <output_name>,
      "fieldName" : <metric_name>,
      "nominalEntries": <number>,
      "numberOfValues" : <number>,
      "metricColumns" : <array of strings>
    }
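
    As an illustration only, the following aggregator builds a sketch keyed by a hypothetical user_id column and keeps two values per key, taken from hypothetical numeric columns revenue and clicks. The column names, the output name and the nominalEntries value are assumptions chosen for the example; numberOfValues is set to 2 so that it matches the two metric columns.

    ```json
    {
      "type": "arrayOfDoublesSketch",
      "name": "user_sketch",
      "fieldName": "user_id",
      "nominalEntries": 16384,
      "numberOfValues": 2,
      "metricColumns": ["revenue", "clicks"]
    }
    ```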

    Post Aggregators

    Estimate of the number of distinct keys

    Returns a distinct count estimate from a given ArrayOfDoublesSketch.

    {
      "type" : "arrayOfDoublesSketchToEstimate",
      "name": <output name>,
      "field" : <post aggregator that refers to an ArrayOfDoublesSketch (fieldAccess or another post aggregator)>
    }
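
    For example, assuming the same query defines an aggregator named user_sketch (a hypothetical name), the estimate might be requested by referring to that aggregator through a fieldAccess post aggregator:

    ```json
    {
      "type": "arrayOfDoublesSketchToEstimate",
      "name": "distinct_users",
      "field": {
        "type": "fieldAccess",
        "fieldName": "user_sketch"
      }
    }
    ```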

    Estimate of the number of distinct keys with error bounds

    Number of retained entries

    Returns the number of retained entries from a given ArrayOfDoublesSketch.

    {
      "type" : "arrayOfDoublesSketchToNumEntries",
      "name": <output name>,
      "field" : <post aggregator that refers to an ArrayOfDoublesSketch (fieldAccess or another post aggregator)>
    }

    Mean values for each column

    Returns a list of mean values from a given ArrayOfDoublesSketch. The result will be N double values, where N is the number of double values kept in the sketch per key.

    {
      "type" : "arrayOfDoublesSketchToMeans",
      "name": <output name>,
      "field" : <post aggregator that refers to an ArrayOfDoublesSketch (fieldAccess or another post aggregator)>
    }

    Variance values for each column

    Quantiles sketch from a column

    Returns a quantiles DoublesSketch constructed from a given column of values from a given ArrayOfDoublesSketch using optional parameter k that determines the accuracy and size of the quantiles sketch. See

    • The column number is 1-based and is optional (the default is 1).
    • The parameter k is optional (the default is defined in the sketch library).
    • The result is a quantiles sketch.

    {
      "type" : "arrayOfDoublesSketchToQuantilesSketch",
      "name": <output name>,
      "field" : <post aggregator that refers to an ArrayOfDoublesSketch (fieldAccess or another post aggregator)>,
      "column" : <number>,
      "k" : <parameter that determines the accuracy and size of the quantiles sketch>
    }
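
    As a sketch of how the optional parameters fit together, the following post aggregator extracts a quantiles sketch from the second value column of a hypothetical user_sketch aggregator. The names and the value 128 for k are illustrative assumptions; omitting "column" and "k" would fall back to the defaults described above.

    ```json
    {
      "type": "arrayOfDoublesSketchToQuantilesSketch",
      "name": "clicks_quantiles",
      "field": {
        "type": "fieldAccess",
        "fieldName": "user_sketch"
      },
      "column": 2,
      "k": 128
    }
    ```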

    Set Operations

    Returns a result of a specified set operation on the given array of sketches. Supported operations are: union, intersection and set difference (UNION, INTERSECT, NOT).

    {
      "type" : "arrayOfDoublesSketchSetOp",
      "name": <output name>,
      "operation": <"UNION"|"INTERSECT"|"NOT">,
      "fields" : <array of post aggregators to access sketch aggregators or post aggregators to allow arbitrary combination of set operations>,
      "nominalEntries" : <parameter that determines the accuracy and size of the sketch>,
      "numberOfValues" : <number of values associated with each distinct key>
    }
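
    Because the result of a set operation is itself a sketch, it can be passed to another post aggregator. The example below is a minimal sketch of that composition: it intersects two hypothetical aggregators, sketch_a and sketch_b, and estimates the number of distinct keys present in both. All names and numeric values are assumptions, and numberOfValues is assumed to match the input sketches.

    ```json
    {
      "type": "arrayOfDoublesSketchToEstimate",
      "name": "keys_in_both",
      "field": {
        "type": "arrayOfDoublesSketchSetOp",
        "name": "both_sketch",
        "operation": "INTERSECT",
        "fields": [
          { "type": "fieldAccess", "fieldName": "sketch_a" },
          { "type": "fieldAccess", "fieldName": "sketch_b" }
        ],
        "nominalEntries": 16384,
        "numberOfValues": 2
      }
    }
    ```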

    Student’s t-test

    Sketch summary

    Returns a human-readable summary of a given ArrayOfDoublesSketch. This is the string returned by the sketch's toString() method, which can be useful for debugging.

    {
      "type" : "arrayOfDoublesSketchToString",
      "name": <output name>,
      "field" : <post aggregator that refers to an ArrayOfDoublesSketch (fieldAccess or another post aggregator)>
    }