What is @datadog/sketches-js?
@datadog/sketches-js is a JavaScript library that implements DDSketch, a data sketch: a probabilistic data structure for efficiently summarizing and querying large datasets. This package is particularly useful for applications that require real-time analytics and monitoring, such as those in observability and telemetry.
What are @datadog/sketches-js's main functionalities?
DDSketch
DDSketch is a data structure for tracking approximate quantiles of a data stream within a bounded relative error. The code sample demonstrates how to create a DDSketch, add data points to it, and query the median (50th percentile).
const { DDSketch } = require('@datadog/sketches-js');
const sketch = new DDSketch();
sketch.accept(10);
sketch.accept(20);
sketch.accept(30);
console.log(sketch.getValueAtQuantile(0.5)); // approximate median
Merge Sketches
This feature allows you to merge two DDSketch instances. The code sample shows how to create two sketches, add data points to them, and then merge them into one sketch.
const { DDSketch } = require('@datadog/sketches-js');
const sketch1 = new DDSketch();
sketch1.accept(10);
const sketch2 = new DDSketch();
sketch2.accept(20);
sketch1.merge(sketch2);
console.log(sketch1.getValueAtQuantile(0.5)); // approximate median of the merged data
Serialization
Serialization allows you to convert a DDSketch to a format that can be stored or transmitted and then later reconstructed. The code sample demonstrates how to serialize and deserialize a DDSketch.
const { DDSketch } = require('@datadog/sketches-js');
const sketch = new DDSketch();
sketch.accept(10);
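// Encode the sketch as protobuf bytes (a Uint8Array)
const serialized = sketch.toProto();
console.log(serialized);
// Rebuild an equivalent sketch from the encoded bytes
const deserializedSketch = DDSketch.fromProto(serialized);
console.log(deserializedSketch.getValueAtQuantile(0.5));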
Other packages similar to @datadog/sketches-js
tdigest
The 'tdigest' package provides a data structure for accurate online accumulation of rank-based statistics such as quantiles and cumulative distribution. It is similar to DDSketch in that it is used for summarizing large datasets, but it uses a different algorithm (t-digest) which may have different performance characteristics and accuracy trade-offs.
approximate
The 'approximate' package offers various probabilistic data structures for approximate counting, membership, and quantile estimation. It includes implementations of HyperLogLog, Count-Min Sketch, and Quantile Sketches. This package provides a broader range of data structures compared to @datadog/sketches-js, which focuses specifically on DDSketch.
sketches-js
This repo contains the TypeScript implementation of the distributed quantile sketch algorithm DDSketch. DDSketch is mergeable, meaning that multiple sketches from distributed systems can be combined in a central node.
Installation
The package is published as @datadog/sketches-js and can be installed with npm or Yarn:
npm install @datadog/sketches-js
yarn add @datadog/sketches-js
Usage
Initialize a sketch
To initialize a sketch with the default parameters:
import { DDSketch } from '@datadog/sketches-js';
const sketch = new DDSketch();
Modify the relativeAccuracy
If you want more granular control over how accurate the sketch's results will be, you can pass a relativeAccuracy parameter when initializing a sketch.
Whereas other histograms use rank error guarantees (e.g. retrieving the p99 of the histogram will give you a value between p98.9 and p99.1), DDSketch uses a relative error guarantee (if the actual value at p99 is 100, the returned value will be between 99 and 101 for a relativeAccuracy of 0.01).
This property makes DDSketch especially useful for long-tailed distributions of data, like measurements of latency.
import { DDSketch } from '@datadog/sketches-js';
const sketch = new DDSketch({
relativeAccuracy: 0.01,
});
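To make the relative error guarantee concrete, here is a minimal sketch of the logarithmic bucketing scheme described in the DDSketch paper. It is illustrative only: the helper names (bucketIndex, bucketValue, relativeError) are hypothetical and not part of @datadog/sketches-js, whose internal mapping may differ in detail. It covers positive values only.

```javascript
// Illustrative only: logarithmic bucketing as described in the DDSketch paper.
// None of these helper names come from @datadog/sketches-js.
const relativeAccuracy = 0.01;
const gamma = (1 + relativeAccuracy) / (1 - relativeAccuracy);

// Bucket index for a positive value: bucket i covers (gamma^(i-1), gamma^i].
function bucketIndex(value) {
  return Math.ceil(Math.log(value) / Math.log(gamma));
}

// Representative value reported for every member of bucket i.
function bucketValue(index) {
  return (2 * Math.pow(gamma, index)) / (gamma + 1);
}

// Relative error incurred by replacing a value with its bucket's representative.
function relativeError(value) {
  const estimate = bucketValue(bucketIndex(value));
  return Math.abs(estimate - value) / value;
}

// Small and very large values all stay within the same relative bound.
const samples = [0.5, 1, 100, 2500, 1607374726];
const worstError = Math.max(...samples.map(relativeError));
console.log(worstError <= relativeAccuracy + 1e-9); // prints true
```

Because bucket widths grow geometrically, the number of buckets grows with the logarithm of the value range, which is why this approach suits long-tailed data such as latencies.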
Add values to a sketch
To add a number to a sketch, call sketch.accept(value). Both positive and negative numbers are supported.
const measurementOne = 1607374726;
const measurementTwo = 0;
const measurementThree = -3.1415;
sketch.accept(measurementOne);
sketch.accept(measurementTwo);
sketch.accept(measurementThree);
Retrieve measurements from the sketch
To retrieve measurements from a sketch, use sketch.getValueAtQuantile(quantile). Any number between 0 and 1 (inclusive) can be used as a quantile.
Additionally, common summary statistics are available such as sketch.min, sketch.max, sketch.sum, and sketch.count:
const measurementOne = 1607374726;
const measurementTwo = 0;
const measurementThree = -3.1415;
sketch.accept(measurementOne);
sketch.accept(measurementTwo);
sketch.accept(measurementThree);
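// Quantile estimates, each within the sketch's relative accuracy
sketch.getValueAtQuantile(0);    // estimate of the smallest value
sketch.getValueAtQuantile(0.5);  // estimate of the median
sketch.getValueAtQuantile(0.99); // estimate of the 99th percentile
sketch.getValueAtQuantile(1);    // estimate of the largest value
// Summary statistics
sketch.min;
sketch.max;
sketch.count;
sketch.sum;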
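As a rough illustration of how a quantile can be read back from bucket counts, the sketch below walks the buckets in ascending order until the cumulative count passes the requested rank. This is an assumption-laden simplification (positive values only, a plain Map as storage, hypothetical names add and quantile), not the library's actual implementation.

```javascript
// Illustrative only: reading a quantile back from bucket counts (positive values only).
// The names below are hypothetical and not part of @datadog/sketches-js.
const alpha = 0.01;
const g = (1 + alpha) / (1 - alpha);
const counts = new Map(); // bucket index -> number of values seen
let total = 0;

// Record a positive value by incrementing the count of its logarithmic bucket.
function add(value) {
  const index = Math.ceil(Math.log(value) / Math.log(g));
  counts.set(index, (counts.get(index) || 0) + 1);
  total += 1;
}

// Walk buckets in ascending order until the cumulative count exceeds the rank.
function quantile(q) {
  const rank = q * (total - 1);
  const indexes = [...counts.keys()].sort((a, b) => a - b);
  let seen = 0;
  for (const index of indexes) {
    seen += counts.get(index);
    if (seen > rank) {
      return (2 * Math.pow(g, index)) / (g + 1);
    }
  }
  return NaN; // nothing has been added yet
}

// Latencies 1..1000: the true p99 under this rank definition is 990.
for (let latency = 1; latency <= 1000; latency++) add(latency);
const p99 = quantile(0.99);
console.log(Math.abs(p99 - 990) / 990 <= alpha + 1e-9); // prints true
```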
Merge multiple sketches
Independent sketches can be merged together, provided that they were initialized with the same relativeAccuracy. This allows collecting and transmitting measurements in a distributed manner, and merging their results together while preserving the relativeAccuracy guarantee.
import { DDSketch } from '@datadog/sketches-js';
const sketch1 = new DDSketch();
const sketch2 = new DDSketch();
[1,2,3,4,5].forEach(value => sketch1.accept(value));
[6,7,8,9,10].forEach(value => sketch2.accept(value));
sketch1.merge(sketch2);
sketch1.getValueAtQuantile(1); // approximately 10, the largest value across both sketches
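The requirement for a shared relativeAccuracy can be motivated with a simplified model: if each sketch is a table of counts per logarithmic bucket, merging amounts to adding counts bucket by bucket, which only makes sense when both tables use the same bucket boundaries. The code below is a hedged sketch of that idea using hypothetical helpers (buildCounts, mergeCounts), not the library's internals.

```javascript
// Illustrative only: merging modeled as adding per-bucket counts.
// The helper names are hypothetical and not part of @datadog/sketches-js.
const accuracy = 0.01;
const base = (1 + accuracy) / (1 - accuracy);

// Build a bucket index -> count table for a list of positive values.
function buildCounts(values) {
  const table = new Map();
  for (const value of values) {
    const index = Math.ceil(Math.log(value) / Math.log(base));
    table.set(index, (table.get(index) || 0) + 1);
  }
  return table;
}

// Add every count from source into target; valid only when both tables
// were built with the same base, so that bucket indexes line up.
function mergeCounts(target, source) {
  for (const [index, count] of source) {
    target.set(index, (target.get(index) || 0) + count);
  }
  return target;
}

const merged = mergeCounts(buildCounts([1, 2, 3, 4, 5]), buildCounts([6, 7, 8, 9, 10]));
const single = buildCounts([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);

// Merging two tables gives the same result as feeding all values into one.
const identical =
  merged.size === single.size &&
  [...single].every(([index, count]) => merged.get(index) === count);
console.log(identical); // prints true
```

Since the merged table is identical to the one built from all values at once, any quantile read from it carries the same error bound as an unmerged sketch.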
References