tar-stream
tar-stream is a streaming tar parser and generator and nothing else. It is built on streams2 and operates purely on streams, which means you can extract or parse tarballs without ever touching the file system.
Note that you still need to gunzip your data if you have a .tar.gz. We recommend using gunzip-maybe in conjunction with this.
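If you would rather not add a dependency, the same idea can be sketched with Node's built-in zlib module. The helper names isGzip and toTarBytes are hypothetical (they are not part of tar-stream or gunzip-maybe), and the sketch assumes the whole archive is already in memory as a Buffer.

```javascript
var zlib = require('zlib')

// gzip data starts with the two magic bytes 0x1f 0x8b
function isGzip(buf) {
  return buf.length >= 2 && buf[0] === 0x1f && buf[1] === 0x8b
}

// Hypothetical helper: return raw tar bytes whether or not the input was gzipped
function toTarBytes(buf) {
  return isGzip(buf) ? zlib.gunzipSync(buf) : buf
}
```

For streamed input, zlib.createGunzip() can be placed between the source stream and tar.extract() in the same spirit.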
npm install tar-stream
Usage
tar-stream exposes two streams: pack, which creates tarballs, and extract, which extracts tarballs. To modify an existing tarball, use both.
It implements USTAR with additional support for pax extended headers. It should be compatible with all popular tar distributions (gnutar, bsdtar, etc.).
Related
If you want to pack/unpack directories on the file system, check out tar-fs, which provides file system bindings to this module.
Packing
To create a pack stream, use tar.pack() and call pack.entry(header, [callback]) to add tar entries.
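var tar = require('tar-stream')
var pack = tar.pack() // pack is a stream

// add a file called my-test.txt with the content "Hello World!"
pack.entry({ name: 'my-test.txt' }, 'Hello World!')

// add a file called my-stream-test.txt from a stream (size is given up front)
var entry = pack.entry({ name: 'my-stream-test.txt', size: 11 }, function(err) {
  // the streamed entry has been added; no more entries follow
  pack.finalize()
})

entry.write('hello')
entry.write(' ')
entry.write('world')
entry.end()

// pipe the pack stream somewhere
pack.pipe(process.stdout)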
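The streamed entry above declares size: 11 because 'hello world' is 11 bytes long. Here is a small sketch of computing that value rather than hard-coding it. totalByteLength is a hypothetical helper, and the sketch assumes all chunks are known before the entry is created.

```javascript
// Hypothetical helper: sum the encoded (UTF-8) byte length of each chunk.
// Bytes, not characters, are what the size field has to count.
function totalByteLength(chunks) {
  return chunks.reduce(function (sum, chunk) {
    return sum + Buffer.byteLength(chunk)
  }, 0)
}

var chunks = ['hello', ' ', 'world']
var header = { name: 'my-stream-test.txt', size: totalByteLength(chunks) }

console.log(header.size) // 11
```

The resulting header can then be passed to pack.entry(header, callback) and the chunks written to the returned stream one by one.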
Extracting
To extract a stream, use tar.extract() and listen for extract.on('entry', function(header, stream, next) {...}).
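var extract = tar.extract()

extract.on('entry', function(header, stream, next) {
  // header is the tar header
  // stream is the content body (it may be empty)
  // call next when you are done with this entry
  stream.on('end', function() {
    next() // ready for the next entry
  })
  stream.resume() // auto-drain the stream
})

extract.on('finish', function() {
  // all entries have been read
})

pack.pipe(extract)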
The tar archive is streamed sequentially, meaning you must drain each entry's stream as you receive it; otherwise the main extract stream will receive backpressure and stop reading.
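When you need an entry's content rather than just skipping it, one way to drain the stream is to collect its chunks. This is a minimal sketch: readEntry is a hypothetical helper, and a PassThrough stream stands in for a real entry stream so the example runs without a tarball.

```javascript
var PassThrough = require('stream').PassThrough

// Hypothetical helper: read one entry's stream to the end and hand back its bytes
function readEntry(stream, done) {
  var chunks = []
  stream.on('data', function (chunk) { chunks.push(chunk) })
  stream.on('error', done)
  stream.on('end', function () { done(null, Buffer.concat(chunks)) })
}

// Stand-in for an entry stream
var fakeEntry = new PassThrough()
readEntry(fakeEntry, function (err, body) {
  if (err) throw err
  console.log(body.toString())
})
fakeEntry.end('hello world')
```

Inside an entry handler you would call readEntry(stream, ...) and invoke next() from its callback, so the extract stream only moves on once the entry is fully consumed.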
The header object used in entry should contain the following properties. Most of these values can be found by stat'ing a file.
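{
  name: 'path/to/this/entry.txt',
  size: 1314,        // entry size in bytes
  mode: 0o644,       // entry mode (permission bits)
  mtime: new Date(), // last modified date for the entry
  type: 'file',      // type of entry
  linkname: 'path',  // linked file name
  uid: 0,            // uid of the entry owner
  gid: 0,            // gid of the entry owner
  uname: 'maf',      // user name of the entry owner
  gname: 'staff',    // group name of the entry owner
  devmajor: 0,       // device major version
  devminor: 0        // device minor version
}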
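Since most of these values come from stat'ing a file, a header can be derived from an fs.Stats object. This is a rough sketch, not how tar-fs does it. headerFromStat is a hypothetical helper, masking the mode with 0o7777 is a choice made here, and it assumes 'directory' is an accepted type value.

```javascript
var fs = require('fs')

// Hypothetical helper: build a header object from an fs.Stats-like object
function headerFromStat(name, stat) {
  var isDir = stat.isDirectory()
  return {
    name: name,
    size: isDir ? 0 : stat.size,  // directories carry no content
    mode: stat.mode & 0o7777,     // keep permission bits, drop file-type bits
    mtime: stat.mtime,
    type: isDir ? 'directory' : 'file',
    uid: stat.uid,
    gid: stat.gid
  }
}

// Example: describe the current working directory
var header = headerFromStat('my-dir', fs.statSync('.'))
console.log(header)
```

Fields such as uname, gname and linkname are not available from stat alone and would have to be filled in separately if you need them.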
Modifying existing tarballs
Using tar-stream, it is easy to rewrite paths, change modes, etc. in an existing tarball.
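var tar = require('tar-stream')
var path = require('path')

var extract = tar.extract()
var pack = tar.pack()

extract.on('entry', function(header, stream, callback) {
  // prefix all names with 'tmp'
  header.name = path.join('tmp', header.name)
  // write the modified entry to the pack stream
  stream.pipe(pack.entry(header, callback))
})

extract.on('finish', function() {
  // all entries are done, so finalize the new tarball
  pack.finalize()
})

// oldTarballStream and newTarballStream are your own source and destination streams
oldTarballStream.pipe(extract)
pack.pipe(newTarballStream)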
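The header rewrite itself can be kept in a small pure function, which makes it easy to check without any streams involved. rewriteHeader is a hypothetical helper. It uses path.posix.join on the assumption that entry names should keep forward slashes on every platform, whereas path.join would produce backslashes on Windows.

```javascript
var path = require('path')

// Hypothetical helper: return a copy of a header with a prefixed name and,
// optionally, a new mode. The original header object is left untouched.
function rewriteHeader(header, prefix, mode) {
  var copy = Object.assign({}, header)
  copy.name = path.posix.join(prefix, header.name)
  if (mode !== undefined) copy.mode = mode
  return copy
}

var out = rewriteHeader({ name: 'docs/readme.txt', mode: 0o600 }, 'tmp', 0o644)
console.log(out.name) // tmp/docs/readme.txt
```

In the entry handler, the result would be passed to pack.entry() in place of the mutated header.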
Saving tarball to fs
var fs = require('fs')
var tar = require('tar-stream')

var pack = tar.pack() // pack is a stream
var path = 'YourTarBall.tar'
var yourTarball = fs.createWriteStream(path)

// add a file called YourFile.txt with the content "Hello World!"
pack.entry({name: 'YourFile.txt'}, 'Hello World!', function (err) {
  if (err) throw err
  pack.finalize()
})

// pipe the pack stream to your file
pack.pipe(yourTarball)

yourTarball.on('close', function () {
  console.log(path + ' has been written')
  fs.stat(path, function(err, stats) {
    if (err) throw err
    console.log(stats)
    console.log('Got file info successfully!')
  })
})
Performance
See tar-fs for a performance comparison with node-tar.
License
MIT