What is unzipper?
The unzipper npm package is a module that provides streaming APIs for unzipping .zip files. It allows for extracting the contents of zip files, parsing zip file structures, and more, all while being memory efficient and fast.
What are unzipper's main functionalities?
Extracting zip files
This feature allows you to extract the contents of a zip file to a specified directory. The code sample demonstrates how to read a zip file as a stream and pipe it to the unzipper's Extract method, which will extract the files to the given path.
```js
const unzipper = require('unzipper');
const fs = require('fs');

fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Extract({ path: 'output/path' }));
```
Parsing zip file entries
This feature allows you to parse the contents of a zip file and work with each entry individually. The code sample shows how to read entries from a zip file and handle them based on their type, either extracting files or draining directories.
```js
const unzipper = require('unzipper');
const fs = require('fs');

fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .on('entry', function (entry) {
    const fileName = entry.path;
    const type = entry.type; // 'Directory' or 'File'
    if (type === 'File') {
      entry.pipe(fs.createWriteStream('output/path/' + fileName));
    } else {
      entry.autodrain();
    }
  });
```
Buffer-based extraction
This feature allows you to extract files from a zip file that is already loaded into a buffer. The code sample demonstrates how to open a zip file from a buffer and then extract the contents of the first file into another buffer.
```js
const unzipper = require('unzipper');
const fs = require('fs');

// `buffer` holds the entire zip file in memory, e.g. loaded from disk
const buffer = fs.readFileSync('path/to/archive.zip');

unzipper.Open.buffer(buffer)
  .then(function (directory) {
    return directory.files[0].buffer();
  })
  .then(function (contentBuffer) {
    // Use the contentBuffer
  });
```
Other packages similar to unzipper
adm-zip
adm-zip is a JavaScript implementation for zip data compression for NodeJS. It provides functionalities similar to unzipper, such as reading and extracting zip files. Unlike unzipper, which is stream-based, adm-zip works with in-memory buffers, which can be less efficient for large files.
jszip
jszip is a library for creating, reading, and editing .zip files with JavaScript, with a focus on client-side use. It can be used in NodeJS as well. It offers a more comprehensive API for handling zip files compared to unzipper, including the ability to generate zip files, but it might not be as optimized for streaming large zip files.
yauzl
yauzl is another NodeJS library for reading zip files. It focuses on low-level zip file parsing and decompression, providing a minimal API. It's designed to be more memory efficient than adm-zip by using lazy parsing, but it doesn't provide the high-level convenience methods that unzipper does.
unzipper
This is a fork of node-unzip, which has not been maintained in a while. This fork addresses the following issues:
- finish/close events are not always triggered, particularly when the input stream is slower than the receivers
- All files are buffered into memory before being passed on to the `entry` event

The structure of this fork is similar to the original, but it uses Promises and inherits the guarantees provided by node streams to ensure a low memory footprint and to guarantee finish/close events at the end of processing. The new `Parser` will push any parsed entries downstream if you pipe from it, while still supporting the legacy `entry` event as well.

Breaking changes: the new `Parser` will not automatically drain entries if there are no listeners or pipes in place.
Unzipper provides simple APIs similar to node-tar for parsing and extracting zip files.
There are no added compiled dependencies - inflation is handled by node.js's built in zlib support.
Installation
```
$ npm install unzipper
```
Quick Examples
```js
const fs = require('fs');
const unzipper = require('unzipper');

fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Extract({ path: 'output/path' }));
```
Extract emits the 'close' event once the zip's contents have been fully extracted to disk.
Parse zip file contents
Process each zip file entry or pipe entries to another stream.
Important: If you do not intend to consume an entry stream's raw data, call autodrain() to dispose of the entry's contents. Otherwise the stream will halt.
```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .on('entry', function (entry) {
    var fileName = entry.path;
    var type = entry.type; // 'Directory' or 'File'
    var size = entry.size;
    if (fileName === "this IS the file I'm looking for") {
      entry.pipe(fs.createWriteStream('output/path'));
    } else {
      entry.autodrain();
    }
  });
```
Parse zip by piping entries downstream
If you pipe from unzipper, the downstream components will receive each `entry` for further processing. This allows for clean pipelines transforming zipfiles into unzipped data.

Example using `stream.Transform`:
```js
const stream = require('stream');

fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .pipe(stream.Transform({
    objectMode: true,
    transform: function (entry, e, cb) {
      var fileName = entry.path;
      var type = entry.type;
      var size = entry.size;
      if (fileName === "this IS the file I'm looking for") {
        entry.pipe(fs.createWriteStream('output/path'))
          .on('finish', cb);
      } else {
        entry.autodrain();
        cb();
      }
    }
  }));
```
Example using etl:
```js
const etl = require('etl');

fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .pipe(etl.map(entry => {
    if (entry.path === "this IS the file I'm looking for")
      return entry
        .pipe(etl.toFile('output/path'))
        .promise();
    else
      entry.autodrain();
  }));
```
Parse a single file and pipe contents
`unzipper.ParseOne([regex])` is a convenience method that unzips only one file from the archive and pipes its contents downstream (not the entry itself). If no search criteria is specified, the first file in the archive will be unzipped. Otherwise, each filename will be compared to the criteria and the first one to match will be unzipped and piped down. If no file matches, the stream will end without any content.
Example:
```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.ParseOne())
  .pipe(fs.createWriteStream('firstFile.txt'));
```
Buffering the content of an entry into memory
While the recommended strategy for consuming the unzipped contents is using streams, it is sometimes convenient to get the full buffered contents of each file. Each `entry` provides a `.buffer` function that consumes the entry by buffering the contents into memory and returning a promise for the complete buffer.
```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .pipe(etl.map(entry => {
    if (entry.path === "this IS the file I'm looking for")
      return entry
        .buffer()
        .then(content => fs.promises.writeFile('output/path', content));
    else
      entry.autodrain();
  }));
```
Parse.promise() syntax sugar
The parser emits `finish` and `error` events like any other stream. The parser additionally provides a promise wrapper around those two events to allow easy folding into existing Promise-based structures.
Example:
```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .on('entry', entry => entry.autodrain())
  .promise()
  .then(() => console.log('done'), e => console.log('error', e));
```
Licenses
See LICENCE