tar-stream
tar-stream is a streaming tar parser and generator and nothing else. It operates purely using streams which means you can easily extract/parse tarballs without ever hitting the file system.
The tar-stream npm package is a streaming tar parser and generator, which allows users to read and write tar archives in a streaming fashion. This means that you can process tar files without having to load the entire file into memory, which is useful for handling large files or for streaming applications.
Extracting a tar archive
This feature allows you to extract files from a tar archive. The 'entry' event is emitted for each file in the archive, providing the file header and a stream for the file content.
const extract = require('tar-stream').extract;
const fs = require('fs');
let extractor = extract();
extractor.on('entry', (header, stream, next) => {
// header is the tar header
// stream is the content body (might be an empty stream)
// call next when you are done with this entry
stream.on('end', () => next());
stream.resume(); // just auto drain the stream
});
fs.createReadStream('archive.tar').pipe(extractor);
Creating a tar archive
This feature allows you to create a tar archive. You can add entries to the archive with the 'entry' method, and then finalize the archive when you are done.
const pack = require('tar-stream').pack;
const fs = require('fs');
let packer = pack();
// add a file called my-test.txt with the content 'Hello World!'
packer.entry({ name: 'my-test.txt' }, 'Hello World!', (err) => {
if (err) throw err;
packer.finalize(); // finalize the archive when you are done
});
// pipe the pack stream somewhere, like to a file
packer.pipe(fs.createWriteStream('my-tarball.tar'));
Archiver is a high-level streaming archive library that supports creating TAR and ZIP archives. It provides more abstraction than tar-stream and includes additional features like appending files from streams, buffers, or directories, and setting global archive options.
tar-fs is a Node.js module that provides filesystem bindings for tar-stream. It allows you to pack directories into tarballs and extract tarballs into directories using a file system interface, making it a bit more convenient for certain use cases compared to the lower-level tar-stream.
The 'tar' package is a full-featured Tar for Node.js, which includes utilities for creating, manipulating, and extracting tar archives. It's a higher-level package compared to tar-stream and includes features like gzip compression and decompression.
tar-stream is a streaming tar parser and generator and nothing else. It operates purely using streams which means you can easily extract/parse tarballs without ever hitting the file system.
Note that you still need to gunzip your data if you have a .tar.gz. We recommend using gunzip-maybe in conjunction with this.
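One way to decide whether data needs gunzipping is to look at its first two bytes: gzip data starts with the magic bytes 0x1f 0x8b. The sketch below illustrates that check with Node's built-in zlib module on a whole Buffer rather than on a stream. The helper name maybeGunzip is hypothetical and is not part of tar-stream or gunzip-maybe.

```javascript
const zlib = require('zlib')

// Hypothetical helper: returns the uncompressed bytes whether or not
// the input buffer was gzipped.
function maybeGunzip (buffer) {
  // gzip streams always begin with the two magic bytes 0x1f 0x8b
  const isGzip = buffer.length >= 2 && buffer[0] === 0x1f && buffer[1] === 0x8b
  return isGzip ? zlib.gunzipSync(buffer) : buffer
}

const plain = Buffer.from('pretend this is tar data')
const gzipped = zlib.gzipSync(plain)

console.log(maybeGunzip(gzipped).equals(plain)) // true
console.log(maybeGunzip(plain).equals(plain)) // true
```

For real archives a streaming approach is preferable, since this version holds the whole input in memory.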
npm install tar-stream
tar-stream exposes two streams, pack which creates tarballs and extract which extracts tarballs. To modify an existing tarball use both.
It implements USTAR with additional support for pax extended headers. It should be compatible with all popular tar distributions out there (gnutar, bsdtar etc).
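To make the USTAR format more concrete, here is a minimal sketch that encodes and decodes a single 512-byte USTAR header block using only Node's Buffer. It is an illustration of the on-disk layout, not tar-stream's implementation: the function names are hypothetical, only a few fields are handled, and long names (the prefix field) and pax extended headers are ignored.

```javascript
// Offsets and lengths of the USTAR header fields used in this sketch.
const BLOCK_SIZE = 512
const FIELDS = {
  name: [0, 100], mode: [100, 8], uid: [108, 8], gid: [116, 8],
  size: [124, 12], mtime: [136, 12], chksum: [148, 8],
  typeflag: [156, 1], magic: [257, 6], version: [263, 2]
}

function writeField (block, field, text) {
  const [offset, length] = FIELDS[field]
  block.write(text, offset, length, 'utf8')
}

// Reads a NUL-terminated (or full-width) string field.
function readField (block, field) {
  const [offset, length] = FIELDS[field]
  const raw = block.subarray(offset, offset + length)
  const end = raw.indexOf(0)
  return raw.toString('utf8', 0, end === -1 ? length : end)
}

// Numeric fields are stored as zero-padded octal text.
const octal = (n, digits) => n.toString(8).padStart(digits, '0')

// Sum of all header bytes, with the checksum field itself counted as spaces.
function checksum (block) {
  let sum = 0
  for (let i = 0; i < BLOCK_SIZE; i++) {
    const inChecksumField = i >= 148 && i < 156
    sum += inChecksumField ? 0x20 : block[i]
  }
  return sum
}

function encodeHeader ({ name, size, mode = 0o644 }) {
  const block = Buffer.alloc(BLOCK_SIZE) // zero-filled
  writeField(block, 'name', name)
  writeField(block, 'mode', octal(mode, 7))
  writeField(block, 'uid', octal(0, 7))
  writeField(block, 'gid', octal(0, 7))
  writeField(block, 'size', octal(size, 11))
  writeField(block, 'mtime', octal(0, 11))
  writeField(block, 'typeflag', '0') // '0' marks a regular file
  writeField(block, 'magic', 'ustar')
  writeField(block, 'version', '00')
  // Checksum is written last: six octal digits, a NUL, then a space.
  writeField(block, 'chksum', octal(checksum(block), 6) + '\0 ')
  return block
}

function decodeHeader (block) {
  if (readField(block, 'magic') !== 'ustar') throw new Error('not a USTAR header')
  const stored = parseInt(readField(block, 'chksum'), 8)
  if (stored !== checksum(block)) throw new Error('checksum mismatch')
  return {
    name: readField(block, 'name'),
    mode: parseInt(readField(block, 'mode'), 8),
    size: parseInt(readField(block, 'size'), 8),
    type: readField(block, 'typeflag') === '0' ? 'file' : 'other'
  }
}

const block = encodeHeader({ name: 'my-test.txt', size: 12 })
console.log(decodeHeader(block))
```

In an archive, each such header block is followed by the entry's content, which is why an entry's size has to be known before its body is written.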
If you want to pack/unpack directories on the file system check out tar-fs which provides file system bindings to this module.
To create a pack stream use tar.pack() and call pack.entry(header, [callback]) to add tar entries.
const tar = require('tar-stream')
const pack = tar.pack() // pack is a stream
// add a file called my-test.txt with the content "Hello World!"
pack.entry({ name: 'my-test.txt' }, 'Hello World!')
// add a file called my-stream-test.txt from a stream
const entry = pack.entry({ name: 'my-stream-test.txt', size: 11 }, function(err) {
// the stream was added
// no more entries
pack.finalize()
})
entry.write('hello')
entry.write(' ')
entry.write('world')
entry.end()
// pipe the pack stream somewhere
pack.pipe(process.stdout)
To extract a stream use tar.extract() and listen for extract.on('entry', (header, stream, next) => {}).
const extract = tar.extract()
extract.on('entry', function (header, stream, next) {
// header is the tar header
// stream is the content body (might be an empty stream)
// call next when you are done with this entry
stream.on('end', function () {
next() // ready for next entry
})
stream.resume() // just auto drain the stream
})
extract.on('finish', function () {
// all entries read
})
pack.pipe(extract)
The tar archive is streamed sequentially, meaning you must drain each entry's stream as you get them or else the main extract stream will receive backpressure and stop reading.
In addition to being a writable stream, the extraction stream is also an async iterator:
const extract = tar.extract()
someStream.pipe(extract)
for await (const entry of extract) {
entry.header // the tar header
entry.resume() // the entry is the stream also
}
The header object used in entry should contain the following properties.
Most of these values can be found by stat'ing a file.
{
name: 'path/to/this/entry.txt',
size: 1314, // entry size. defaults to 0
mode: 0o644, // entry mode. defaults to 0o755 for dirs and 0o644 otherwise
mtime: new Date(), // last modified date for entry. defaults to now.
type: 'file', // type of entry. defaults to file. can be:
// file | link | symlink | directory | block-device
// character-device | fifo | contiguous-file
linkname: 'path', // linked file name
uid: 0, // uid of entry owner. defaults to 0
gid: 0, // gid of entry owner. defaults to 0
uname: 'maf', // uname of entry owner. defaults to null
gname: 'staff', // gname of entry owner. defaults to null
devmajor: 0, // device major version. defaults to 0
devminor: 0 // device minor version. defaults to 0
}
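Since most of these values can be found by stat'ing a file, a header can be derived from an fs.Stats object. The following is a minimal sketch using only Node's standard library. The helper name headerFromStat is hypothetical and not part of tar-stream, it distinguishes only files, directories, and symlinks, and it assumes non-file entries should carry a size of 0.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical helper: maps an fs.Stats object to the header fields listed above.
function headerFromStat (name, stats) {
  let type = 'file'
  if (stats.isDirectory()) type = 'directory'
  else if (stats.isSymbolicLink()) type = 'symlink'

  return {
    name,
    size: type === 'file' ? stats.size : 0, // assumption: only files carry a body
    mode: stats.mode & 0o7777, // keep the permission bits, drop the file-type bits
    mtime: stats.mtime,
    type,
    uid: stats.uid,
    gid: stats.gid
  }
}

// Try it on a temporary file and on the directory that contains it.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'tar-header-'))
const file = path.join(dir, 'my-test.txt')
fs.writeFileSync(file, 'Hello World!')

const header = headerFromStat('my-test.txt', fs.lstatSync(file))
const dirHeader = headerFromStat('my-dir', fs.lstatSync(dir))
fs.rmSync(dir, { recursive: true })

console.log(header.type, header.size) // file 12
console.log(dirHeader.type, dirHeader.size) // directory 0
```

lstatSync is used rather than statSync so that a symbolic link is reported as a link instead of being followed to its target.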
With tar-stream it is easy to rewrite paths, change modes, and so on in an existing tarball.
const extract = tar.extract()
const pack = tar.pack()
const path = require('path')
extract.on('entry', function (header, stream, callback) {
// let's prefix all names with 'tmp'
header.name = path.join('tmp', header.name)
// write the new entry to the pack stream
stream.pipe(pack.entry(header, callback))
})
extract.on('finish', function () {
// all entries done - let's finalize it
pack.finalize()
})
// pipe the old tarball to the extractor
oldTarballStream.pipe(extract)
// pipe the new tarball to another stream
pack.pipe(newTarballStream)
The following example writes a tarball to the file system and prints its file info once the write stream has closed.
const fs = require('fs')
const tar = require('tar-stream')
const pack = tar.pack() // pack is a stream
const path = 'YourTarBall.tar'
const yourTarball = fs.createWriteStream(path)
// add a file called YourFile.txt with the content "Hello World!"
pack.entry({ name: 'YourFile.txt' }, 'Hello World!', function (err) {
if (err) throw err
pack.finalize()
})
// pipe the pack stream to your file
pack.pipe(yourTarball)
yourTarball.on('close', function () {
console.log(path + ' has been written')
fs.stat(path, function(err, stats) {
if (err) throw err
console.log(stats)
console.log('Got file info successfully!')
})
})
See tar-fs for a performance comparison with node-tar
MIT
FAQs
The npm package tar-stream receives a total of 27,231,279 weekly downloads. As such, tar-stream is classified as popular.
We found that tar-stream demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 2 open source maintainers collaborating on the project.