What is unique-stream?
The unique-stream npm package filters duplicate objects out of a stream. Objects are compared by a specified property, by the result of a custom function, or, when no criterion is given, by their JSON.stringify output. Only the first object seen for each key passes through, which is useful when processing or merging streams of data from which duplicates must be removed.
What are unique-stream's main functionalities?
Filter unique objects by property
This feature allows you to filter out duplicate objects in a stream based on a specified property. In this example, objects with the same 'id' property will be considered duplicates, and only unique objects will be passed through the stream.
const unique = require('unique-stream');
const through = require('through2');
const stream = through.obj();
// Objects are deduplicated by their 'id' property before being logged.
stream.pipe(unique('id')).pipe(through.obj((data, enc, callback) => {
  console.log(data);
  callback();
}));
stream.write({ id: 1, name: 'Alice' });
stream.write({ id: 2, name: 'Bob' });
stream.write({ id: 1, name: 'Alice' }); // same id as the first object, so it is dropped
stream.end();
Filter unique objects by custom function
This feature allows you to filter out duplicate objects in a stream based on a custom function. In this example, objects with the same 'name' property will be considered duplicates, and only unique objects will be passed through the stream.
const unique = require('unique-stream');
const through = require('through2');
const stream = through.obj();
stream.pipe(unique((data) => data.name)).pipe(through.obj((data, enc, callback) => {
  console.log(data);
  callback();
}));
stream.write({ id: 1, name: 'Alice' });
stream.write({ id: 2, name: 'Bob' });
stream.write({ id: 3, name: 'Alice' });
stream.end();
Other packages similar to unique-stream
through2-filter
The through2-filter package provides a way to filter objects in a stream based on a custom function. While it does not specifically focus on removing duplicates, it can be used to achieve similar functionality by implementing a custom filter function to track and remove duplicates.
stream-filter
The stream-filter package allows you to filter objects in a stream using a custom function. Similar to through2-filter, it does not specifically target duplicate removal but can be used to filter out duplicates by implementing a custom filter function.
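The "custom filter function that tracks duplicates" idea mentioned for these packages can be sketched with Node's built-in stream module alone. This is a minimal illustration, not the API of through2-filter or stream-filter; the names dedupeBy and collect are hypothetical helpers introduced here.

```javascript
const { Readable, Transform } = require('stream');

// Hypothetical helper (not part of any package named above): builds an
// object-mode Transform that drops items whose key has already been seen.
function dedupeBy(keyFn) {
  const seen = new Set();
  return new Transform({
    objectMode: true,
    transform(item, _enc, callback) {
      const key = keyFn(item);
      if (seen.has(key)) {
        callback(); // duplicate: emit nothing
      } else {
        seen.add(key);
        callback(null, item); // first occurrence: pass it through
      }
    }
  });
}

// Hypothetical helper: gathers everything a stream emits into an array.
async function collect(stream) {
  const items = [];
  for await (const item of stream) items.push(item);
  return items;
}

const source = Readable.from([
  { id: 1, name: 'Alice' },
  { id: 2, name: 'Bob' },
  { id: 3, name: 'Alice' }
]);

// Deduplicate by name: the object with id 3 repeats 'Alice' and is dropped.
const result = collect(source.pipe(dedupeBy((item) => item.name)));
result.then((items) => console.log(items));
```

The Set grows with every distinct key, so memory use is proportional to the number of unique items seen.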
lodash.uniq
The lodash.uniq package exports Lodash's uniq method as a standalone module; it removes duplicate values from an array. It is not designed for streams, but it can produce similar results if the stream's data is first collected into an array and then deduplicated. Note that uniq compares values with SameValueZero, so distinct objects with identical contents are not treated as duplicates; deduplicating objects by a property requires a method such as Lodash's uniqBy.
unique-stream
A Node.js through stream that emits a unique stream of objects based on configurable criteria.
Installation
Install via npm:
$ npm install unique-stream
Examples
Dedupe a ReadStream based on JSON.stringify:
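var unique = require('unique-stream')
  , Stream = require('stream');

// Returns a legacy readable stream that emits the same object three times.
function makeStreamOfObjects() {
  var s = new Stream();
  s.readable = true;
  var count = 3;
  for (var i = 0; i < 3; i++) {
    setImmediate(function () {
      s.emit('data', { name: 'Bob', number: 123 });
      // Emit 'end' once the last object has been sent.
      --count || end();
    });
  }
  function end() {
    s.emit('end');
  }
  return s;
}

// With no arguments, objects are compared by their JSON.stringify output.
makeStreamOfObjects()
  .pipe(unique())
  .on('data', console.log);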
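One consequence of keying on JSON.stringify is worth illustrating: the serialized string depends on property order, so two objects with the same contents but different key order produce different keys. The sketch below shows this with plain JavaScript. The stableKey helper is hypothetical (it is not part of unique-stream) and assumes flat objects with JSON-serializable values.

```javascript
const a = { name: 'Bob', number: 123 };
const b = { number: 123, name: 'Bob' };

// Same contents, different property order: the serialized strings differ.
const sameDefaultKey = JSON.stringify(a) === JSON.stringify(b);
console.log(sameDefaultKey); // false

// Hypothetical helper: an order-insensitive key for flat objects,
// built by sorting property names before serializing.
function stableKey(obj) {
  return JSON.stringify(
    Object.keys(obj).sort().map(function (k) { return [k, obj[k]]; })
  );
}

console.log(stableKey(a) === stableKey(b)); // true
```

A function like stableKey could be supplied as the custom function when property order is not guaranteed to be consistent across sources.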
Dedupe a ReadStream based on an object property:
makeStreamOfObjects()
  .pipe(unique('name'))
  .on('data', console.log);
Dedupe a ReadStream based on a custom function:
makeStreamOfObjects()
  .pipe(unique(function (data) {
    return data.number;
  }))
  .on('data', console.log);
Dedupe multiple streams
The reason I wrote this was to dedupe multiple object streams:
var aggregator = unique();

makeStreamOfObjects()
  .pipe(aggregator);
makeStreamOfObjects()
  .pipe(aggregator);
makeStreamOfObjects()
  .pipe(aggregator);

aggregator.on('data', console.log);
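When several sources feed one destination, Node's core streams end the destination as soon as the first source ends unless pipe() is called with { end: false }. The sketch below shows one way to keep a shared deduplicating stream open until every source has finished. It uses a hypothetical dedupeBy Transform as a stand-in for unique(); whether the stream returned by unique-stream needs the same treatment is not stated in this README, so treat that as an assumption to verify.

```javascript
const { Readable, Transform } = require('stream');

// Hypothetical stand-in for unique(keyFn): drops items whose key was seen.
function dedupeBy(keyFn) {
  const seen = new Set();
  return new Transform({
    objectMode: true,
    transform(item, _enc, callback) {
      const key = keyFn(item);
      if (seen.has(key)) return callback();
      seen.add(key);
      callback(null, item);
    }
  });
}

const aggregator = dedupeBy((item) => item.id);
const sources = [
  Readable.from([{ id: 1 }, { id: 2 }]),
  Readable.from([{ id: 2 }, { id: 3 }]),
  Readable.from([{ id: 3 }, { id: 4 }])
];

// Keep the aggregator open until the last source has ended.
let open = sources.length;
for (const source of sources) {
  source.pipe(aggregator, { end: false });
  source.once('end', () => {
    if (--open === 0) aggregator.end();
  });
}

// Collect the deduplicated ids; arrival order across sources is not
// guaranteed, so sort before printing.
const merged = (async () => {
  const ids = [];
  for await (const item of aggregator) ids.push(item.id);
  return ids.sort((x, y) => x - y);
})();
merged.then((ids) => console.log(ids));
```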
Use a custom store to record keys that have been encountered
By default a set is used to store the keys encountered so far, in order to check new ones for
uniqueness. You can supply your own store instead, provided it supports the add(key) and
has(key) methods. This could allow you to use a persistent store so that already encountered
objects are not re-streamed when Node is restarted.
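var keyStore = {
  store: {},
  add: function(key) {
    this.store[key] = true;
  },
  has: function(key) {
    // Check own properties only, so that keys such as 'toString' are not
    // matched by properties inherited from Object.prototype.
    return Object.prototype.hasOwnProperty.call(this.store, key);
  }
};

makeStreamOfObjects()
  .pipe(unique('name', keyStore))
  .on('data', console.log);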
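The persistent-store idea mentioned above could look roughly like the following. This is a sketch under stated assumptions, not something shipped with unique-stream: createFileKeyStore is a hypothetical helper, keys are assumed to be JSON-serializable primitives such as strings or numbers, and the file location is chosen only for the demonstration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical store: keeps keys in a Set and mirrors them to a JSON file,
// exposing the add(key) and has(key) methods a custom store must support.
function createFileKeyStore(filePath) {
  const keys = new Set(
    fs.existsSync(filePath) ? JSON.parse(fs.readFileSync(filePath, 'utf8')) : []
  );
  return {
    add(key) {
      keys.add(key);
      // Rewrites the whole file on every add: simple, but slow for large sets.
      fs.writeFileSync(filePath, JSON.stringify([...keys]));
    },
    has(key) {
      return keys.has(key);
    }
  };
}

// Demonstration file in the OS temp directory (an arbitrary choice).
const storePath = path.join(os.tmpdir(), `unique-keys-${process.pid}.json`);

const firstRun = createFileKeyStore(storePath);
firstRun.add('Bob');

// A second store built from the same file simulates a process restart.
const secondRun = createFileKeyStore(storePath);
console.log(secondRun.has('Bob'));   // true
console.log(secondRun.has('Alice')); // false

fs.unlinkSync(storePath); // clean up the demonstration file
```

A store created this way could be passed as the second argument in place of keyStore. Synchronous file writes block the event loop, so a real deployment would more likely batch writes or use a database.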
Contributing
unique-stream is an OPEN Open Source Project. This means that:
Individuals making significant and valuable contributions are given commit-access to the project to contribute as they see fit. This project is more like an open wiki than a standard guarded open source project.
See the CONTRIBUTING.md file for more details.
Contributors
unique-stream is only possible due to the excellent work of the following contributors: