What is streamsearch?
The streamsearch npm package searches for a specific sequence of bytes in streaming data. It is particularly useful for finding delimiters or separators while parsing a stream, such as the boundaries in multipart/form-data bodies. Because it scans the data chunk by chunk without buffering the entire stream, it is well suited to large inputs that arrive incrementally.
What are streamsearch's main functionalities?
Byte sequence searching in streams
This feature allows the user to search for a specific byte sequence within a stream. The example code creates a StreamSearch instance with a needle ('example') and a callback, then pushes a buffer to it. The callback receives any non-matching data together with a flag indicating whether the needle was found.
const StreamSearch = require('streamsearch');
const needle = Buffer.from('example');
const s = new StreamSearch(needle, (isMatch, data, start, end) => {
  // data is only set when there is non-matching data to report
  if (data)
    console.log('Data:', data.toString('latin1', start, end));
  console.log('Match found:', isMatch);
});
// Simulate streaming data
s.push(Buffer.from('some example data containing the needle'));
Other packages similar to streamsearch
buffer-indexof
Similar to streamsearch, buffer-indexof provides functionality to find the index of a buffer within another buffer. However, it does not support streaming data inherently and is used for buffer objects in memory, which might not be as efficient for large or streaming datasets compared to streamsearch.
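To illustrate why a purely in-memory search is a poor fit for streams, the sketch below uses Node's built-in Buffer.prototype.indexOf as a stand-in for an in-memory search (it is not the buffer-indexof package itself). The chunk contents are made up for the illustration. A needle that straddles two chunks is missed when each chunk is searched on its own, and is only found once everything has been concatenated in memory.

```javascript
// Needle and a stream that happens to split it across two chunks.
const needle = Buffer.from('\r\n');
const chunks = [Buffer.from('foo\r'), Buffer.from('\nbar')];

// Searching each chunk on its own misses the match at the chunk boundary.
const perChunk = chunks.map((chunk) => chunk.indexOf(needle));

// Searching the concatenation finds it, but requires holding all data in memory.
const whole = Buffer.concat(chunks).indexOf(needle);

console.log(perChunk); // [ -1, -1 ]
console.log(whole);    // 3
```

A streaming searcher avoids this trade-off by remembering a possible partial match at the end of one chunk and continuing the comparison in the next.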
Description
streamsearch is a module for node.js that allows searching a stream using the Boyer-Moore-Horspool algorithm.
This module is based heavily on the Streaming Boyer-Moore-Horspool C++ implementation by Hongli Lai.
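For readers unfamiliar with the algorithm, here is a minimal in-memory sketch of Boyer-Moore-Horspool. It is not the module's actual code and it does not handle needles that span chunk boundaries, which is what the streaming variant adds. The function name horspoolIndexOf is hypothetical.

```javascript
// Minimal in-memory Boyer-Moore-Horspool search (illustrative only).
function horspoolIndexOf(haystack, needle) {
  const n = needle.length;
  if (n === 0) return 0;

  // Shift table: how far to slide the window for each possible byte value.
  // Bytes that do not occur in the needle allow a full-length shift.
  const shift = new Array(256).fill(n);
  for (let i = 0; i < n - 1; i++)
    shift[needle[i]] = n - 1 - i;

  let pos = 0;
  while (pos <= haystack.length - n) {
    // Compare the window against the needle from right to left.
    let i = n - 1;
    while (i >= 0 && haystack[pos + i] === needle[i]) i--;
    if (i < 0) return pos; // full match
    // Slide by the shift for the byte aligned with the needle's last position.
    pos += shift[haystack[pos + n - 1]];
  }
  return -1;
}

const haystack = Buffer.from('baz, hello\r\n world.');
console.log(horspoolIndexOf(haystack, Buffer.from('\r\n'))); // 10
```

The shift table is what lets the search skip ahead by more than one byte at a time: the less often the window's last byte appears in the needle, the larger the jumps.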
Requirements
Installation
npm install streamsearch
Example
const { inspect } = require('util');
const StreamSearch = require('streamsearch');
const needle = Buffer.from('\r\n');
const ss = new StreamSearch(needle, (isMatch, data, start, end) => {
  if (data)
    console.log('data: ' + inspect(data.toString('latin1', start, end)));
  if (isMatch)
    console.log('match!');
});
const chunks = [
  'foo',
  ' bar',
  '\r',
  '\n',
  'baz, hello\r',
  '\n world.',
  '\r\n Node.JS rules!!\r\n\r\n',
];
for (const chunk of chunks)
  ss.push(Buffer.from(chunk));
API
Properties
Functions
- (constructor)(< mixed >needle, < function >callback) - Creates and returns a new instance for searching for a Buffer or string needle. callback is called any time there is non-matching data and/or there is a needle match. callback will be called with the following arguments:
  - isMatch - boolean - Indicates whether a match has been found.
  - data - mixed - If set, this contains data that did not match the needle.
  - start - integer - The index in data where the non-matching data begins (inclusive).
  - end - integer - The index in data where the non-matching data ends (exclusive).
  - isSafeData - boolean - Indicates if it is safe to store a reference to data (e.g. as-is or via data.slice()) or not, as in some cases data may point to a Buffer whose contents change over time.
- destroy() - (void) - Emits any last remaining unmatched data that may still be buffered and then resets internal state.
- push(< Buffer >chunk) - integer - Processes chunk, searching for a match. The return value is the last processed index in chunk + 1.
- reset() - (void) - Resets internal state. Useful for when you wish to start searching a new/different stream for example.
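As a rough sketch of how the callback arguments listed above might be consumed, the helper below collects the text between needle matches. makeSegmentCollector, segments and rest are hypothetical names introduced for this illustration and are not part of streamsearch. The sketch assumes the five-argument callback signature described above, and it copies the data whenever isSafeData is false.

```javascript
// Hypothetical helper (not part of streamsearch): builds a callback with the
// documented (isMatch, data, start, end, isSafeData) signature and collects
// the text found between needle matches.
function makeSegmentCollector() {
  const segments = []; // completed segments, one per needle match
  let pending = [];    // Buffers making up the segment still in progress

  function onData(isMatch, data, start, end, isSafeData) {
    if (data) {
      // Only data[start..end) is non-matching data. When isSafeData is false
      // the underlying Buffer may change later, so take a real copy.
      const view = data.subarray(start, end);
      pending.push(isSafeData ? view : Buffer.from(view));
    }
    if (isMatch) {
      // A needle was found: everything gathered so far forms one segment.
      segments.push(Buffer.concat(pending).toString('latin1'));
      pending = [];
    }
  }

  // Text seen after the last match (an incomplete segment), if any.
  const rest = () => Buffer.concat(pending).toString('latin1');

  return { onData, segments, rest };
}

// Intended wiring, assuming the constructor signature documented above:
//   const collector = makeSegmentCollector();
//   const ss = new StreamSearch('\r\n', collector.onData);
//   for (const chunk of chunks) ss.push(chunk);
//   ss.destroy(); // flushes any unmatched data that is still buffered
```

Calling destroy() at the end of the stream matters here: without it, trailing bytes that looked like the start of a needle could stay buffered and never reach rest().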