Research
Security News
Quasar RAT Disguised as an npm Package for Detecting Vulnerabilities in Ethereum Smart Contracts
Socket researchers uncover a malicious npm package posing as a tool for detecting vulnerabilities in Etherium smart contracts.
javascript implementation of Dunning's T-Digest for streaming quantile approximation
The tdigest npm package is a JavaScript implementation of Ted Dunning's t-digest algorithm, which is used for accurate online accumulation of rank-based statistics such as quantiles and cumulative distribution functions (CDFs). It is particularly useful for large-scale data processing where memory efficiency and speed are critical.
Creating a t-digest
This feature allows you to create a new t-digest instance, which can then be used to accumulate data and compute statistics.
const TDigest = require('tdigest').TDigest;
const t = new TDigest();
Adding data to the t-digest
This feature allows you to add data points to the t-digest. The t-digest will then use these data points to compute rank-based statistics.
t.push(1.0);
t.push(2.0);
t.push(3.0);
Computing quantiles
This feature allows you to compute quantiles from the data accumulated in the t-digest. For example, you can compute the median (50th percentile) or the 90th percentile.
const q50 = t.percentile(0.5);
const q90 = t.percentile(0.9);
Computing cumulative distribution function (CDF)
This feature allows you to compute the cumulative distribution function (CDF) for a given value. The CDF represents the probability that a random variable takes on a value less than or equal to the given value.
const cdf = t.cdf(2.0);
The simple-statistics package provides a wide range of statistical functions, including quantile calculations. However, it does not implement the t-digest algorithm and may not be as memory efficient for large-scale data processing.
The summary-statistics package offers basic statistical summaries such as mean, median, and standard deviation. It does not provide the advanced rank-based statistics or memory efficiency of the t-digest algorithm.
The descriptive-statistics package provides functions for descriptive statistics, including quantiles. While it offers similar functionality, it does not use the t-digest algorithm and may not be as efficient for large datasets.
Javascript implementation of Dunning's T-Digest for streaming quantile approximation
The T-Digest is a data structure and algorithm for constructing an approximate distribution for a collection of real numbers presented as a stream. The algorithm makes no guarantees, but behaves well enough in practice that implementations have been included in Apache Mahout and ElasticSearch for computing summaries and approximate order statistics over a stream.
For an overview of T-Digest's behavior, see Davidson-Pilon's blog post regarding a python implementation. For more details, there are the tdigest paper and reference implementation (Java). This javascript implementation is based on a reading of the paper, with some boundary and performance tweaks.
changes in 0.1.2:
Updated the bintree dependency to 1.0.2 to pick up its licencing declaration
changes in 0.1.1:
percentile on an empty digest returns undefined or array of undefined instead of NaN
upgraded bintrees to get bugfix.
bugfix for discrete percentile and p_rank, make boundary conditions conform to standard definition.
changes in 0.1.0:
Discrete mode: when a TDigest is created with delta=false, the sample distribution is treated as discrete. TDigest behavior is disabled, differing samples are never merged (they needn't even be numeric), and percentiles are reported as nearest exact data values rather than interpolated.
Digest: distribution digest structure. Starts in exact histogram (discrete) mode, remains in exact mode for reasonable numbers of distinct values as sample size inreases, and automatically switches to TDigest mode for large samples that appear to be from a continuous distribution.
Renamed quantile() -> p_rank(), Percentile Rank.
percentile() and p_rank() now accept arrays or singleton arguments.
changes in 0.0.7:
A grunt dist
task has been added to create a UMD-wrapped version of tdigest
and dependencies for importing as a standalone module in client-side javascript.
bugfixes and speed improvements.
changes in 0.0.5:
API Overhaul:
bugfixes and speed improvements.
npm install tdigest
var TDigest = require('tdigest').TDigest;
var x=[], N = 100000;
for (var i = 0 ; i < N ; i += 1) {
x.push(Math.random() * 10 - 5);
};
td = new TDigest();
td.push(x);
td.compress();
console.log(td.summary());
console.log("median ~ "+td.percentile(0.5));
See also example.js in this package.
The grunt dist
task has been configured to generate
a self-contained UMD-wrapped version of tdigest in dist/tdigest.js.
Embed it in HTML like this:
<script src="dist/tdigest.js"></script>
<script>
var td = new this.tdigest.TDigest();
for (var i=0; i < 1000000; i++) {
td.push(Math.random());
}
td.compress();
document.write(td.summary())
</script>
See also example.html in this package.
bintrees
: https://www.npmjs.com/package/bintrees
FAQs
javascript implementation of Dunning's T-Digest for streaming quantile approximation
The npm package tdigest receives a total of 860,776 weekly downloads. As such, tdigest popularity was classified as popular.
We found that tdigest demonstrated a not healthy version release cadence and project activity because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Research
Security News
Socket researchers uncover a malicious npm package posing as a tool for detecting vulnerabilities in Etherium smart contracts.
Security News
Research
A supply chain attack on Rspack's npm packages injected cryptomining malware, potentially impacting thousands of developers.
Research
Security News
Socket researchers discovered a malware campaign on npm delivering the Skuld infostealer via typosquatted packages, exposing sensitive data.