cvm-lib

Estimate the number of distinct values in a set using the simple and space-efficient CVM algorithm

0.1.2
latest
CVM Library

Estimate the number of distinct values in a set using the simple and space-efficient CVM algorithm.


Getting Started

Install

NPM:

npm install cvm-lib

Yarn:

yarn add cvm-lib

JSR:

jsr add @rojas/cvm

Examples

See the examples/ directory for all examples.

Hamlet

Estimate unique words in Shakespeare's Hamlet:

node ./examples/hamlet/index.js
  • Total words: 31991
  • CVM capacity: 2161
  • Expected uniques: 4762 ± 10.00%
  • Estimated uniques: 4728 (-0.71%)

1M

Estimate unique integers among 1 million random integers.

node ./examples/hamlet/index.js
  • Total values: 1000000
  • CVM capacity: 10631
  • Expected uniques: 994384 ± 5.00%
  • Estimated uniques: 996480 (0.21%)
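The percentage shown next to each estimate is the signed relative error of the estimate against the expected count. A minimal sketch of that arithmetic follows; relativeErrorPercent is a hypothetical helper written for illustration and is not part of cvm-lib.

```typescript
// Hypothetical helper (not part of cvm-lib): signed relative error, in percent,
// of an estimate compared with the expected number of distinct values.
function relativeErrorPercent(estimated: number, expected: number): number {
  return ((estimated - expected) / expected) * 100;
}

// Figures taken from the two example runs above.
console.log(relativeErrorPercent(4728, 4762).toFixed(2)); // "-0.71"
console.log(relativeErrorPercent(996480, 994384).toFixed(2)); // "0.21"
```

Both results fall well inside the ± 10.00% and ± 5.00% bounds reported for the respective runs.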

API

Functions

calculateCapacity(n, epsilon?, delta?)

Calculates the space required to estimate the number of distinct values in a set with a given accuracy and confidence.

  • n: The total number of values in the set, or an estimate if unknown. Must be a positive number.
  • epsilon (optional): An estimate's relative error. Controls accuracy. Must be between 0 and 1. Defaults to 0.05.
  • delta (optional): The probability an estimate is not accurate. Controls confidence. Must be between 0 and 1. Defaults to 0.01.
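
To give a feel for how n, epsilon and delta interact, the sketch below uses the formula ceil(log2(n / delta) / epsilon^2). This formula is an assumption: it was inferred because it reproduces the capacities printed in the examples above (2161 and 10631, taking delta as 0.01 in both runs), and it is not confirmed to be the library's implementation. The name sketchCapacity is hypothetical.

```typescript
// Hypothetical sketch, NOT the cvm-lib source. The formula is an assumption
// inferred from the capacities reported in the Hamlet and 1M examples.
function sketchCapacity(n: number, epsilon = 0.05, delta = 0.01): number {
  if (!(n > 0)) {
    throw new RangeError("n must be a positive number");
  }
  if (!(epsilon > 0 && epsilon < 1)) {
    throw new RangeError("epsilon must be between 0 and 1");
  }
  if (!(delta > 0 && delta < 1)) {
    throw new RangeError("delta must be between 0 and 1");
  }
  // Smaller epsilon (tighter accuracy) grows the capacity quadratically;
  // smaller delta (higher confidence) grows it only logarithmically.
  return Math.ceil(Math.log2(n / delta) / epsilon ** 2);
}

console.log(sketchCapacity(31991, 0.1)); // 2161, as in the Hamlet example
console.log(sketchCapacity(1_000_000)); // 10631, as in the 1M example
```

Under this assumed formula, halving epsilon roughly quadruples the required capacity, while tightening delta by a factor of ten adds only a few percent.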

Classes

Estimator<T>

Estimates the number of distinct values in a set using the CVM algorithm.

  • Constructors

    • new (capacity): Create an instance with a given capacity. Must be a positive integer.
    • new (config): Create an instance using a config object.
  • Properties

    • capacity: Gets the maximum number of samples in memory.
    • randomFn: Gets or sets the random number generator function (e.g. Math.random).
    • sampleRate: Gets the base sample rate (e.g. 0.5).
    • size: Gets the number of samples in memory.
  • Methods

    • add(value): Adds a value.
    • clear(): Clears/resets the instance.
    • estimate(): Gets the estimated number of distinct values.
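
For readers unfamiliar with the CVM algorithm, the following is a minimal, self-contained sketch of the general technique. It only mirrors the add / clear / estimate / size surface listed above; the class name SketchEstimator and all internals are hypothetical, and the code is not the cvm-lib implementation.

```typescript
// Hypothetical sketch of the CVM technique; NOT the cvm-lib source.
class SketchEstimator<T> {
  readonly capacity: number;
  private readonly randomFn: () => number;
  private readonly samples = new Set<T>();
  private rate = 1; // current probability that a value is kept as a sample

  constructor(capacity: number, randomFn: () => number = Math.random) {
    if (!Number.isInteger(capacity) || capacity < 1) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
    this.randomFn = randomFn;
  }

  get size(): number {
    return this.samples.size;
  }

  add(value: T): this {
    // Forget any earlier decision for this value, then re-sample it
    // at the current rate.
    this.samples.delete(value);
    if (this.randomFn() < this.rate) {
      this.samples.add(value);
    }
    // When the buffer is full, drop each sample with probability 1/2
    // and halve the rate so future values are kept less often.
    while (this.samples.size >= this.capacity) {
      for (const sample of this.samples) {
        if (this.randomFn() < 0.5) {
          this.samples.delete(sample);
        }
      }
      this.rate /= 2;
    }
    return this;
  }

  clear(): void {
    this.samples.clear();
    this.rate = 1;
  }

  estimate(): number {
    // Each surviving sample stands in for 1 / rate distinct values.
    return this.samples.size / this.rate;
  }
}

const sketch = new SketchEstimator<number>(1000);
for (let i = 0; i < 5000; i++) {
  sketch.add(i % 100); // only 100 distinct values
}
console.log(sketch.estimate()); // 100: the capacity was never reached, so the count is exact
```

While the number of distinct values stays below the capacity, the rate remains 1 and the estimate is exact; randomness only comes into play once the buffer fills.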

Interfaces

EstimatorConfig<T>

A configuration object used to create Estimator instances.

  • capacity: The maximum number of samples in memory. Must be a positive integer.
  • randomFn (optional): The random number generator function. Should return random or pseudorandom values between 0 and 1.
  • sampleRate (optional): The sampling rate for managing samples. Must be between 0 and 1.
    • Note: Custom values may negatively affect accuracy. In general, the further from 0.5, the more it's affected. If capacity was calculated via calculateCapacity, expected accuracy / confidence may be invalidated.
  • storage (optional): An object that implements SampleSet for storing samples.
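
Because randomFn only needs to return values between 0 and 1, a seeded generator can stand in for Math.random when reproducible estimates are wanted (for example in tests). The sketch below uses mulberry32, a small, widely circulated 32-bit PRNG; it is not part of cvm-lib and is not suitable for cryptographic use.

```typescript
// mulberry32: a small seeded PRNG returning values in [0, 1).
// Not part of cvm-lib; shown only as one possible randomFn.
function mulberry32(seed: number): () => number {
  let state = seed | 0;
  return () => {
    state = (state + 0x6d2b79f5) | 0;
    let t = Math.imul(state ^ (state >>> 15), 1 | state);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Two generators built from the same seed yield the same sequence.
const first = mulberry32(42);
const second = mulberry32(42);
console.log(first() === second()); // true
```

A function like this could be supplied as the randomFn entry of the config object, so that repeated runs over the same input produce the same estimate.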

SampleSet<T>

Represents a generic set for storing samples.

  • size: The number of values in the set.
  • add(value): Adds a value to the set.
  • clear(): Clears all values from the set.
  • delete(value): Removes a specified value from the set.
  • [Symbol.iterator](): Iterates through the set's values.
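
As an illustration of what a custom storage object might look like, the sketch below declares a local interface with the five members listed above and wraps a native Set to record the peak number of samples held. The names SampleSetLike and PeakTrackingSet are hypothetical, and the return types are assumptions; the library's actual SampleSet declaration may differ.

```typescript
// Local stand-in for the documented members; return types are assumptions.
interface SampleSetLike<T> {
  readonly size: number;
  add(value: T): unknown;
  clear(): void;
  delete(value: T): unknown;
  [Symbol.iterator](): Iterator<T>;
}

// Hypothetical storage that delegates to a native Set and records the
// largest size it ever reached.
class PeakTrackingSet<T> implements SampleSetLike<T> {
  private readonly inner = new Set<T>();
  peak = 0;

  get size(): number {
    return this.inner.size;
  }

  add(value: T): this {
    this.inner.add(value);
    this.peak = Math.max(this.peak, this.inner.size);
    return this;
  }

  clear(): void {
    this.inner.clear();
  }

  delete(value: T): boolean {
    return this.inner.delete(value);
  }

  [Symbol.iterator](): Iterator<T> {
    return this.inner[Symbol.iterator]();
  }
}

// A native Set already has all five members.
const builtin: SampleSetLike<string> = new Set<string>();
builtin.add("a");

const tracked = new PeakTrackingSet<string>();
tracked.add("a").add("b").add("c");
tracked.delete("a");
console.log(tracked.size, tracked.peak); // 2 3
```

The same wrapping approach would suit other storage concerns, such as logging evictions or backing the samples with a different data structure.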

Community and Support

Contributions are welcome!

  • Questions / Discussions: Please contact us via GitHub discussions.

  • Bug Reports: Please use the GitHub issue tracker to report any bugs. Include a detailed description and any relevant code snippets or logs.

  • Feature Requests: Please submit feature requests as issues, clearly describing the feature and its potential benefits.

  • Pull Requests: Please ensure your code adheres to the existing style of the project and include any necessary tests and documentation.

For more information, check out the contributor's guide.

Build

  • Clone the project from GitHub
git clone git@github.com:havelessbemore/cvm-lib.git
cd cvm-lib
  • Install dependencies
npm install
  • Build the project
npm run build

This will output ECMAScript (.mjs) and CommonJS (.cjs) modules in the dist/ directory.

Format

To run the code linter:

npm run lint

To automatically fix linting issues, run:

npm run format

Test

To run tests:

npm test

To run tests with a coverage report:

npm run test:coverage

A coverage report is generated at ./coverage/index.html.

Package last updated on 04 Aug 2025
