object-hash-set
This is a hash set for JavaScript objects. Its object-encoding algorithm is quite space-efficient, especially when the input objects have largely the same keys and most or all keys have low cardinality (that is, few distinct values of that key in the data set). Data with these characteristics often arises in time series analysis applications, which is where this originated.
Creates an instance of the Object Hash Set. options, if specified, is an object. The only supported option is ignore. The value of ignore is an array of keys that the set will not pay attention to during storage or lookup. The set will consider two objects identical if their values for all non-ignored keys are the same.
Adds the given object to the set if an equivalent object is not already in the set. Returns true if a new object was added or false if the object already existed in the set.
Returns true if an object equivalent to object has already been added.
Removes the object from the set. A future call to add or contains with an object identical to the given object will return false. Note that this will not reclaim the storage space used by the keys in the given object.
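The semantics described above can be illustrated with a small pure-JavaScript model. This is only a sketch of the documented behavior, not the package's implementation (the real set is a native C++ addon). The class name ModelObjectHashSet, the helper keyFor, and the method name remove are assumptions made for illustration; the text above names only add, contains, and the ignore option. The model also sorts keys so that property order does not matter, which is a further assumption.

```javascript
// Hypothetical model of the documented semantics; NOT the addon's code.
class ModelObjectHashSet {
  constructor(options = {}) {
    // Keys listed in `ignore` play no part in storage or lookup.
    this.ignored = new Set(options.ignore || []);
    this.entries = new Set();
  }

  // Build a canonical string from the non-ignored keys. Sorting the keys
  // makes the result independent of property order (an assumption here).
  keyFor(object) {
    const pairs = Object.keys(object)
      .filter((key) => !this.ignored.has(key))
      .sort()
      .map((key) => [key, object[key]]);
    return JSON.stringify(pairs);
  }

  // Returns true if a new object was added, false if an equivalent one existed.
  add(object) {
    const key = this.keyFor(object);
    if (this.entries.has(key)) return false;
    this.entries.add(key);
    return true;
  }

  // Returns true if an equivalent object has already been added.
  contains(object) {
    return this.entries.has(this.keyFor(object));
  }

  // Method name assumed; removes the equivalent object if present.
  remove(object) {
    this.entries.delete(this.keyFor(object));
  }
}

const set = new ModelObjectHashSet({ ignore: ['time'] });
console.log(set.add({ host: 'a', metric: 'cpu', time: 1 })); // true
console.log(set.add({ host: 'a', metric: 'cpu', time: 2 })); // false: differs only in an ignored key
console.log(set.contains({ metric: 'cpu', host: 'a' }));     // true
set.remove({ host: 'a', metric: 'cpu' });
console.log(set.contains({ host: 'a', metric: 'cpu' }));     // false
```

Unlike this model, which keeps a full serialized string per object on the JavaScript heap, the addon stores its data outside the heap.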
Object Hash Set works its magic by storing each distinct value of each key once and compactly encoding combinations of keys with references to these stored values. You can use the provided scripts/perf.js to give it a test. perf.js takes two parameters: num_keys and values_per_key. It generates a data set of (values_per_key ^ num_keys) distinct points, adds them all to an Object Hash Set, and periodically logs memory stats. Here's an example:
node --expose-gc ./scripts/perf.js --num_keys 7 --values_per_key 10
stored 0 points so far in 0.016 sec, memory usage: { rss: 20054016, heapTotal: 7523616, heapUsed: 4344872 }
stored 100000 points so far in 1.119 sec, memory usage: { rss: 30150656, heapTotal: 10619424, heapUsed: 6892416 }
...
stored 9900000 points so far in 111.106 sec, memory usage: { rss: 490704896, heapTotal: 10619424, heapUsed: 6225264 }
Finished! Stored 10000000 points, final memory usage: { rss: 494194688, heapTotal: 10619424, heapUsed: 4654384 }
That comes out to 10,000,000 objects stored in 494,194,688 bytes of RSS, or about 49 bytes per object. (Because Object Hash Set is a native C++ addon, the objects it stores take up no space in the JavaScript heap.) If you naively hash these objects with JSON.stringify and store them as keys in a plain old JavaScript object, heap usage climbs to 1.5 GB and the program crashes at around 6.5 million points. Per object stored, that makes Object Hash Set almost 5 times more space-efficient. Nice!
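To make the encoding idea concrete, here is a minimal sketch of storing each distinct value of each key once and representing an object as a short list of references to those stored values. It is illustrative only: the class ValueDictionary, its fields, and the "column:valueId" string format are all hypothetical and say nothing about the addon's actual data layout. The naive JSON.stringify key is computed alongside for comparison.

```javascript
// Illustrative dictionary encoder; NOT the addon's actual data layout.
class ValueDictionary {
  constructor() {
    this.keys = [];            // key names, in first-seen order
    this.keyIndex = new Map(); // key name -> column position
    this.values = [];          // per column: array of distinct values
    this.valueIndex = [];      // per column: Map of serialized value -> small integer id
  }

  // Look up the column for a key, creating it on first sight.
  column(key) {
    let col = this.keyIndex.get(key);
    if (col === undefined) {
      col = this.keys.length;
      this.keys.push(key);
      this.keyIndex.set(key, col);
      this.values.push([]);
      this.valueIndex.push(new Map());
    }
    return col;
  }

  // Encode an object as "column:valueId" references, sorted by column so
  // that property order in the input does not change the result.
  encode(object) {
    const refs = [];
    for (const key of Object.keys(object)) {
      const col = this.column(key);
      const serialized = JSON.stringify(object[key]);
      let id = this.valueIndex[col].get(serialized);
      if (id === undefined) {
        id = this.values[col].length;
        this.values[col].push(object[key]); // each distinct value is stored once
        this.valueIndex[col].set(serialized, id);
      }
      refs.push([col, id]);
    }
    refs.sort((a, b) => a[0] - b[0]);
    return refs.map(([col, id]) => `${col}:${id}`).join(',');
  }
}

const dict = new ValueDictionary();
const seen = new Set();
for (const host of ['a', 'b', 'c']) {
  for (const metric of ['cpu', 'mem']) {
    seen.add(dict.encode({ host, metric }));
  }
}
console.log(seen.size);   // 6 distinct combinations
console.log(dict.values); // [ [ 'a', 'b', 'c' ], [ 'cpu', 'mem' ] ]

// The naive key repeats every key name and value in full for every object.
const point = { host: 'c', metric: 'mem' };
console.log(dict.encode(point), 'vs', JSON.stringify(point));
```

With low-cardinality keys the per-key value tables stay tiny, so the cost of each additional object is dominated by its short reference list rather than by repeated key names and values.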
Want to contribute? Awesome! Don’t hesitate to file an issue or open a pull request. See the common contributing guidelines for project Juttle.