Security News
Node.js EOL Versions CVE Dubbed the "Worst CVE of the Year" by Security Experts
Critics call the Node.js EOL CVE a misuse of the system, sparking debate over CVE standards and the growing noise in vulnerability databases.
The split2 npm package is a Node.js module that allows you to split a stream of text into lines, handling backpressure correctly. It's particularly useful for reading and processing large text files or logs line by line without loading the entire file into memory. This can significantly improve performance and efficiency when dealing with large datasets or streams.
Splitting a stream into lines
This feature allows you to read a file stream and split it into lines. Each line is then processed individually, which is useful for parsing logs or any line-based data.
"use strict";
const split2 = require('split2');
const fs = require('fs');

fs.createReadStream('file.txt')
  .pipe(split2())
  .on('data', function (line) {
    console.log(line);
  });
Using a custom matcher
This demonstrates how to use a custom regular expression as a matcher to split the stream. This is useful when lines are separated not only by a newline character but also by carriage returns or other variations.
"use strict";
const split2 = require('split2');

process.stdin
  .pipe(split2(/(?:\r\n|\r|\n)/g))
  .on('data', function (line) {
    console.log('Line:', line);
  });
The byline package offers similar functionality to split2 by providing a simple way to read lines from a stream. However, it focuses more on simplicity and ease of use, potentially at the cost of some of the more advanced features and customizations offered by split2.
While not an npm package but a core Node.js module, readline provides functionality to read data from a readable stream, such as the process.stdin, one line at a time. It's more complex and versatile than split2, offering more control over the input and output streams, but it might be overkill for simple line-splitting tasks.
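For comparison, here is a minimal sketch of line-by-line reading with the core readline module. The helper name `collectLines` is hypothetical and introduced only for illustration; `readline.createInterface`, its `crlfDelay` option, and async iteration over the interface are part of Node.js core.

```javascript
'use strict';
const readline = require('readline');
const { Readable } = require('stream');

// Hypothetical helper: gathers every line of a readable stream into an array.
async function collectLines(input) {
  // crlfDelay: Infinity treats "\r\n" as a single line break.
  const rl = readline.createInterface({ input, crlfDelay: Infinity });
  const lines = [];
  for await (const line of rl) {
    lines.push(line);
  }
  return lines;
}

// Usage: any readable stream works, e.g. process.stdin or a file stream.
collectLines(Readable.from(['alpha\nbeta\r\n', 'gamma']))
  .then((lines) => console.log(lines));
```

Unlike split2, this collects lines through an interface object rather than producing a stream you can keep piping, which is part of why it can feel heavier for simple pipelines.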
Though not exclusively for splitting streams into lines, through2 is a tiny wrapper around Node streams.Transform that makes it easier to create transform streams. It can be used in combination with other methods to achieve similar functionality to split2, offering a more flexible but potentially more complex solution.
Break up a stream and reassemble it so that each line is a chunk.
split2 is inspired by @dominictarr's split module, and it is totally API compatible with it. However, it is based on Node.js core Transform.
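To make the idea of "each line is a chunk" concrete, here is a simplified sketch of a line splitter built on the core Transform class. It is an illustration of the general technique, not split2's actual source; the function name `lineSplitter` is hypothetical.

```javascript
'use strict';
const { Transform } = require('stream');
const { StringDecoder } = require('string_decoder');

// Hypothetical illustration: re-chunks incoming data so each push is one line.
function lineSplitter(matcher = /\r?\n/) {
  const decoder = new StringDecoder('utf8');
  let buffered = ''; // text received so far that does not yet end in a line break

  return new Transform({
    readableObjectMode: true, // emit strings, including empty lines
    transform(chunk, encoding, callback) {
      buffered += decoder.write(chunk);
      const parts = buffered.split(matcher);
      buffered = parts.pop(); // the last piece may be an incomplete line
      for (const part of parts) this.push(part);
      callback();
    },
    flush(callback) {
      // Emit whatever is left when the input ends without a final line break.
      buffered += decoder.end();
      if (buffered.length > 0) this.push(buffered);
      callback();
    }
  });
}

const splitter = lineSplitter();
splitter.on('data', (line) => console.log('line:', line));
splitter.write('one\ntw');
splitter.write('o\r\nthree');
splitter.end();
```

Keeping the trailing fragment in a buffer is what lets a line that straddles two incoming chunks come out as a single chunk.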
matcher may be a String or a RegExp. Example: read every line in a file ...
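const fs = require('fs')
const split2 = require('split2')

// `file` is the path of the file to read
fs.createReadStream(file)
  .pipe(split2())
  .on('data', function (line) {
    // each chunk is now a separate line!
  })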
split takes the same arguments as string.split, except that it defaults to '/\r?\n/' and the optional limit parameter is ignored. See String#split.
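Because the matcher follows String#split semantics, its effect can be previewed on a plain string. The sample text below is made up for illustration; it compares the default matcher with the broader regular expression used in the custom matcher example.

```javascript
'use strict';

// Made-up sample mixing three line-ending styles: \r\n, \n and a lone \r.
const text = 'first\r\nsecond\nthird\rfourth';

// The default matcher only recognises \n, optionally preceded by \r.
const byDefault = text.split(/\r?\n/);

// A broader matcher also treats a lone \r as a line break.
const byCustom = text.split(/(?:\r\n|\r|\n)/);

console.log(byDefault); // [ 'first', 'second', 'third\rfourth' ]
console.log(byCustom);  // [ 'first', 'second', 'third', 'fourth' ]
```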
split takes an optional options object as its third argument, which is directly passed as a Transform option.
Additionally, the .maxLength and .skipOverflow options are implemented, which set limits on the internal buffer size and the stream's behavior when the limit is exceeded. There is no limit unless maxLength is set. When the internal buffer size exceeds maxLength, the stream emits an error by default. You may also set skipOverflow to true to suppress the error and instead skip past any lines that cause the internal buffer to exceed maxLength.
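The behaviour described for maxLength and skipOverflow can be sketched with a small Transform of our own. This is an assumption-laden illustration of the described semantics, not split2's code: the function name `boundedLineSplitter` and the error message are hypothetical.

```javascript
'use strict';
const { Transform } = require('stream');
const { StringDecoder } = require('string_decoder');

// Hypothetical illustration of a bounded line buffer.
function boundedLineSplitter({ maxLength, skipOverflow = false } = {}) {
  const decoder = new StringDecoder('utf8');
  const matcher = /\r?\n/;
  let buffered = '';
  let overflow = false; // true while discarding the rest of an oversized line

  return new Transform({
    readableObjectMode: true,
    transform(chunk, encoding, callback) {
      let parts;
      if (overflow) {
        // Skip ahead to the start of the next line.
        parts = decoder.write(chunk).split(matcher);
        if (parts.length === 1) return callback(); // no line ending yet
        parts.shift(); // drop the tail of the oversized line
        overflow = false;
      } else {
        buffered += decoder.write(chunk);
        parts = buffered.split(matcher);
      }
      buffered = parts.pop();
      for (const part of parts) this.push(part);
      // No limit applies unless maxLength is set.
      overflow = maxLength !== undefined && buffered.length > maxLength;
      if (overflow && !skipOverflow) {
        return callback(new Error('maximum buffer reached')); // illustrative message
      }
      callback();
    },
    flush(callback) {
      buffered += decoder.end();
      if (!overflow && buffered.length > 0) this.push(buffered);
      callback();
    }
  });
}

const bounded = boundedLineSplitter({ maxLength: 5, skipOverflow: true });
bounded.on('data', (line) => console.log('kept:', line));
bounded.write('ok\n');
bounded.write('waytoolong'); // exceeds the limit before any line break arrives
bounded.write('more\nfine\n');
bounded.end();
```

In this sketch the limit applies to the pending, incomplete line, so a cap mainly protects against input that never contains a line break.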
Calling .destroy will make the stream emit close. Use this to perform cleanup logic:
var fs = require('fs')
var split2 = require('split2')

var splitFile = function(filename) {
  var file = fs.createReadStream(filename)

  return file
    .pipe(split2())
    .on('close', function() {
      // destroy the file stream in case the split stream was destroyed
      file.destroy()
    })
}

var stream = splitFile('my-file.txt')

stream.destroy() // will destroy the input file stream
split2 accepts a function which transforms each line.
fs.createReadStream(file)
  .pipe(split2(JSON.parse))
  .on('data', function (obj) {
    // each chunk is now a JS object
  })
  .on('error', function (error) {
    // handle parsing errors here
  })
However, in @dominictarr's split the mapper is wrapped in a try-catch, while here it is not: if your parsing logic can throw, wrap it yourself. Alternatively, you can rely on the stream's error handling when the mapper function throws.
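One way to wrap a mapper yourself is sketched below. The helper `safeMapper` is hypothetical and not part of split2. It returns undefined for lines that fail to parse; this assumes split2 drops undefined mapper results rather than emitting them, which is worth verifying against the version you use.

```javascript
'use strict';

// Hypothetical helper: wraps a mapper so a thrown error does not reach the stream.
function safeMapper(mapper) {
  return function (line) {
    try {
      return mapper(line);
    } catch (err) {
      // Assumption: an undefined result means "skip this line".
      return undefined;
    }
  };
}

const safeParse = safeMapper(JSON.parse);

console.log(safeParse('{"id": 1}')); // { id: 1 }
console.log(safeParse('not json'));  // undefined

// Intended use (requires split2 to be installed):
// fs.createReadStream(file).pipe(split2(safeParse))
```

Swallowing errors silently hides malformed input, so logging inside the catch block may be preferable when the data source is not trusted.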
Copyright (c) 2014-2021, Matteo Collina <hello@matteocollina.com>
Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
FAQs
split a Text Stream into a Line Stream, using Stream 3
We found that split2 demonstrated an unhealthy version release cadence and project activity because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.