yauzl
yet another unzip library for node.
Design principles:
- Follow the spec.
Don't scan for local file headers.
Read the central directory for file metadata.
- Don't block the JavaScript thread.
Use and provide async APIs.
- Keep memory usage under control.
Don't attempt to buffer entire files in RAM at once.
- Never crash (if used properly).
Don't let malformed zip files bring down client applications that are trying to catch errors.
- Catch unsafe filename entries.
An entry results in an error if its file name starts with "/" or /[A-Za-z]:\//, or if it contains ".." path segments or "\\" (per the spec).
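The filename rules above can be sketched as a small standalone check. This is only an illustration of the stated rules, not yauzl's own code; the helper name `describeUnsafeFileName` is hypothetical.

```javascript
// Hypothetical helper (not part of yauzl's API): returns a reason string
// when a file name breaks one of the rules listed above, otherwise null.
function describeUnsafeFileName(fileName) {
  if (fileName.indexOf("\\") !== -1) {
    return "contains a backslash";
  }
  if (/^[A-Za-z]:\//.test(fileName)) {
    return "starts with a drive letter";
  }
  if (fileName.charAt(0) === "/") {
    return "is an absolute path";
  }
  // Only whole path segments count, so "a..b" is fine but "a/../b" is not.
  if (fileName.split("/").indexOf("..") !== -1) {
    return "contains a \"..\" path segment";
  }
  return null;
}
```

A name such as `docs/readme.txt` passes, while `/etc/passwd`, `C:/boot.ini`, `a/../b`, and `a\b` are each reported with a reason.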
Usage
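var yauzl = require("yauzl");
var fs = require("fs");

// Open the archive; zipfile emits one "entry" event per central directory record.
yauzl.open("path/to/file.zip", function(err, zipfile) {
  if (err) throw err;
  zipfile.on("entry", function(entry) {
    if (/\/$/.test(entry.fileName)) {
      // Directory file names end with "/"; there is nothing to extract.
      return;
    }
    zipfile.openReadStream(entry, function(err, readStream) {
      if (err) throw err;
      // Assumes the destination directory already exists.
      readStream.pipe(fs.createWriteStream(entry.fileName));
    });
  });
});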
API
The default for every callback parameter is:
function defaultCallback(err) {
  if (err) throw err;
}
open(path, [options], [callback])
Calls fs.open(path, "r") and gives the fd, options, and callback to fromFd below.
options may be omitted or null and defaults to {autoClose: true}.
fromFd(fd, [options], [callback])
Reads from the fd, which is presumed to be an open .zip file.
Note that random access is required by the zip file specification,
so the fd cannot be an open socket or any other fd that does not support random access.
The callback is given the arguments (err, zipfile).
An err is provided if the End of Central Directory Record Signature cannot be found in the file,
which indicates that the fd is not a zip file.
zipfile is an instance of ZipFile.
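To make the signature check concrete, here is a minimal sketch of locating the End of Central Directory Record in an in-memory Buffer by scanning backwards from the end. It is not yauzl's implementation (which reads from an fd); the function name and the returned field names are assumptions for illustration, while the signature value and field offsets come from the zip file specification.

```javascript
var EOCD_SIGNATURE = 0x06054b50; // bytes "PK\x05\x06", read as little-endian
var EOCD_MIN_SIZE = 22;          // fixed part of the record, without the comment
var MAX_COMMENT_SIZE = 0xffff;   // comment length is a 16-bit field

// Hypothetical helper: returns selected record fields, or null when no
// signature is found (that is, the data is not a zip file).
function findEndOfCentralDirectory(buffer) {
  var lowest = Math.max(0, buffer.length - EOCD_MIN_SIZE - MAX_COMMENT_SIZE);
  for (var i = buffer.length - EOCD_MIN_SIZE; i >= lowest; i--) {
    if (buffer.readUInt32LE(i) !== EOCD_SIGNATURE) continue;
    return {
      offset: i,
      diskNumber: buffer.readUInt16LE(i + 4),   // must be 0: no multi-disk support
      entryCount: buffer.readUInt16LE(i + 10),  // total central directory records
      centralDirectoryOffset: buffer.readUInt32LE(i + 16),
      commentLength: buffer.readUInt16LE(i + 20)
    };
  }
  return null;
}
```

A stricter version could also verify that `offset + 22 + commentLength` equals the buffer length, to reject a signature that merely happens to appear inside the comment.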
options may be omitted or null and defaults to {autoClose: false}.
autoClose is effectively equivalent to:
zipfile.once("end", function() {
  zipfile.close();
});
fromBuffer(buffer, [callback])
Like fromFd, but reads from a RAM buffer instead of an open file.
buffer is a Buffer.
callback is effectively passed directly to fromFd.
If a ZipFile is acquired from this method, it will never emit the close event,
and calling close() is not necessary.
dosDateTimeToDate(date, time)
Converts MS-DOS date and time data into a JavaScript Date object.
Each parameter is a Number treated as an unsigned 16-bit integer.
Note that this format does not support timezones,
so the returned object will use the local timezone.
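For readers unfamiliar with the MS-DOS format, the following sketch shows how the two 16-bit values are typically unpacked. It illustrates the bit layout only and is not necessarily how yauzl implements dosDateTimeToDate; the name `decodeDosDateTime` is hypothetical.

```javascript
// Hypothetical stand-in for illustration; yauzl's own function is
// dosDateTimeToDate(date, time).
function decodeDosDateTime(date, time) {
  var day = date & 0x1f;                  // bits 0-4: day of month (1-31)
  var month = (date >> 5) & 0xf;          // bits 5-8: month (1-12)
  var year = ((date >> 9) & 0x7f) + 1980; // bits 9-15: years since 1980

  var second = (time & 0x1f) * 2;         // bits 0-4: seconds divided by 2
  var minute = (time >> 5) & 0x3f;        // bits 5-10: minutes
  var hour = (time >> 11) & 0x1f;         // bits 11-15: hours

  // The Date constructor interprets these components in the local timezone
  // and expects a zero-based month.
  return new Date(year, month - 1, day, hour, minute, second);
}
```

Because seconds are stored divided by two, timestamps in this format have a resolution of two seconds.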
Class: ZipFile
The constructor for the class is not part of the public API.
Use open, fromFd, or fromBuffer instead.
Event: "entry"
Callback gets (entry), which is an Entry.
Event: "end"
Emitted after the last entry event has been emitted.
Event: "close"
Emitted after the fd is actually closed.
This is after calling close (or after the end event when autoClose is true),
and after all streams created from openReadStream have emitted their end events.
This event is never emitted if this ZipFile was acquired from fromBuffer().
openReadStream(entry, [callback])
entry must be an Entry object from this ZipFile.
callback gets (err, readStream), where readStream is a Readable Stream.
If the entry is compressed (with a supported compression method),
the read stream provides the decompressed data.
If this zipfile is already closed (see close), the callback will receive an err.
close([callback])
Causes all future calls to openReadStream to fail,
and calls fs.close(fd, callback) after all streams created by openReadStream have emitted their end events.
If this object's end event has not been emitted yet, calling this function causes undefined behavior.
If autoClose is true in the original open or fromFd call,
this function is effectively called automatically in response to this object's end event.
isOpen
Boolean. true until close is called; then it's false.
entryCount
Number. Total number of central directory records.
comment
String. Always decoded with CP437 per the spec.
Class: Entry
Objects of this class represent Central Directory Records.
Refer to the zipfile specification for more details about these fields.
These fields are of type Number:
- versionMadeBy
- versionNeededToExtract
- generalPurposeBitFlag
- compressionMethod
- lastModFileTime (MS-DOS format, see getLastModDate)
- lastModFileDate (MS-DOS format, see getLastModDate)
- crc32
- compressedSize
- uncompressedSize
- fileNameLength (bytes)
- extraFieldLength (bytes)
- fileCommentLength (bytes)
- internalFileAttributes
- externalFileAttributes
- relativeOffsetOfLocalHeader
fileName
String.
Following the spec, the bytes for the file name are decoded with UTF-8 if generalPurposeBitFlag & 0x800, otherwise with CP437.
If fileName is unsafe, for example an absolute path or one containing ".." path segments,
yauzl emits an error instead of an entry.
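The decoding rule for file names can be sketched as follows. This is an illustration under simplifying assumptions, not yauzl's code: the helper name `decodeFileName` is hypothetical, and bytes below 0x80 are treated as plain ASCII (CP437 assigns graphical symbols to some control codes, which this sketch ignores).

```javascript
// Upper half of the CP437 code page (bytes 0x80-0xFF), 16 characters per row.
var CP437_HIGH = [
  "ÇüéâäàåçêëèïîìÄÅ",
  "ÉæÆôöòûùÿÖÜ¢£¥₧ƒ",
  "áíóúñÑªº¿⌐¬½¼¡«»",
  "░▒▓│┤╡╢╖╕╣║╗╝╜╛┐",
  "└┴┬├─┼╞╟╚╔╩╦╠═╬╧",
  "╨╤╥╙╘╒╓╫╪┘┌█▄▌▐▀",
  "αßΓπΣσµτΦΘΩδ∞φε∩",
  "≡±≥≤⌠⌡÷≈°∙·√ⁿ²■\u00a0"
].join("");

// Hypothetical helper: bytes is a Buffer holding the raw file name.
function decodeFileName(bytes, generalPurposeBitFlag) {
  if (generalPurposeBitFlag & 0x800) {
    // Bit 11 set: the name is UTF-8.
    return bytes.toString("utf8");
  }
  // Otherwise decode byte by byte as CP437.
  var result = "";
  for (var i = 0; i < bytes.length; i++) {
    var b = bytes[i];
    result += b < 0x80 ? String.fromCharCode(b) : CP437_HIGH.charAt(b - 0x80);
  }
  return result;
}
```

The same byte sequence can therefore yield different names depending on the flag: byte 0x82 is "é" in CP437 but is not a valid standalone UTF-8 sequence.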
extraFields
Array with each entry in the form {id: id, data: data},
where id is a Number and data is a Buffer.
None of the extra fields are considered significant by this library.
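As an illustration of where the {id, data} pairs come from: in the zip file specification, the extra field area is a sequence of records, each with a 2-byte little-endian header ID, a 2-byte data size, and then that many bytes of data. The sketch below parses such a Buffer; the function name `parseExtraFields` is hypothetical and this is not necessarily how yauzl does it.

```javascript
// Hypothetical helper: splits a raw extra field Buffer into {id, data} records.
function parseExtraFields(extraFieldBuffer) {
  var fields = [];
  var i = 0;
  while (i < extraFieldBuffer.length) {
    if (i + 4 > extraFieldBuffer.length) {
      throw new Error("truncated extra field header");
    }
    var id = extraFieldBuffer.readUInt16LE(i);
    var size = extraFieldBuffer.readUInt16LE(i + 2);
    var start = i + 4;
    var end = start + size;
    if (end > extraFieldBuffer.length) {
      throw new Error("extra field data overruns the buffer");
    }
    // subarray shares memory with the source buffer instead of copying it.
    fields.push({id: id, data: extraFieldBuffer.subarray(start, end)});
    i = end;
  }
  return fields;
}
```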
fileComment
String decoded with the same charset as used for fileName.
getLastModDate()
Effectively implemented as:
return dosDateTimeToDate(this.lastModFileDate, this.lastModFileTime);
How to Avoid Crashing
When a malformed zipfile is encountered, the default behavior is to crash (throw an exception).
If you want to handle errors more gracefully than this,
be sure to do the following:
- Provide callback parameters where they are allowed, and check the err parameter.
- Attach a listener for the error event on any ZipFile object you get from open, fromFd, or fromBuffer.
- Attach a listener for the error event on any stream you get from openReadStream.
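The three steps above can be combined into one wrapper. The sketch below is one possible arrangement, not a prescribed pattern: the function `openWithErrorHandling` and its `onStream` and `onError` parameters are hypothetical, and the yauzl module is passed in as an argument purely so the wiring can be exercised without a real archive. Only `open`, `openReadStream`, and the `entry` and `error` events described in this document are used.

```javascript
// Hypothetical wrapper: routes every error source to a single onError handler.
function openWithErrorHandling(yauzl, zipPath, onStream, onError) {
  yauzl.open(zipPath, function(err, zipfile) {
    // 1. Check the err parameter of every callback.
    if (err) return onError(err);

    // 2. Listen for errors on the ZipFile itself (for example malformed entries).
    zipfile.on("error", onError);

    zipfile.on("entry", function(entry) {
      if (/\/$/.test(entry.fileName)) return; // skip directories

      zipfile.openReadStream(entry, function(err, readStream) {
        if (err) return onError(err);

        // 3. Listen for errors on each read stream.
        readStream.on("error", onError);
        onStream(entry, readStream);
      });
    });
  });
}

// Typical call, assuming yauzl is installed:
// openWithErrorHandling(require("yauzl"), "path/to/file.zip",
//   function(entry, readStream) { readStream.resume(); },
//   function(err) { console.error("zip error:", err.message); });
```

Note that onError may be called more than once for a single archive, so a caller that needs exactly-once semantics would have to add its own guard.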
Limitations
No Multi-Disk Archive Support
This library does not support multi-disk zip files.
The multi-disk fields in the zipfile spec were intended for a zip file to span multiple floppy disks,
which probably never happens now.
If the "number of this disk" field in the End of Central Directory Record is not 0,
the open, fromFd, or fromBuffer callback will receive an err.
By extension the following zip file fields are ignored by this library and not provided to clients:
- Disk where central directory starts
- Number of central directory records on this disk
- Disk number where file starts
No Encryption Support
Currently, the presence of encryption is not even checked,
and encrypted zip files will cause undefined behavior.
Local File Headers Are Ignored
Many unzip libraries mistakenly read the Local File Header data in zip files.
This data is officially defined to be redundant with the Central Directory information,
and is not to be trusted.
There may be conflicts between the Central Directory information and the Local File Header,
but the Local File Header is always ignored.
No CRC-32 Checking
This library provides the crc32 field of Entry objects read from the Central Directory.
However, this field is not used for anything in this library.
versionNeededToExtract Is Ignored
The field versionNeededToExtract is ignored,
because this library doesn't support the complete zip file spec at any version.
No Support For Obscure Compression Methods
Regarding the compressionMethod field of Entry objects,
only method 0 (stored with no compression) and method 8 (deflated) are supported.
Any of the other 15 official methods will cause the openReadStream callback to receive an err.
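To illustrate what the two supported methods mean in practice, here is a small sketch that dispatches on compressionMethod. It is not yauzl's implementation: it works on whole in-memory Buffers with Node's synchronous `zlib.inflateRawSync` for brevity, whereas the library provides a stream precisely to avoid buffering entire files. The function name `decompressEntryData` is hypothetical.

```javascript
var zlib = require("zlib");

// Hypothetical helper: calls callback(err) or callback(null, buffer).
function decompressEntryData(compressionMethod, data, callback) {
  if (compressionMethod === 0) {
    // Method 0: stored, the bytes are already the file contents.
    return callback(null, data);
  }
  if (compressionMethod !== 8) {
    return callback(new Error("unsupported compression method: " + compressionMethod));
  }
  // Method 8: zip stores a raw deflate stream with no zlib or gzip header.
  var inflated;
  try {
    inflated = zlib.inflateRawSync(data);
  } catch (err) {
    return callback(err);
  }
  callback(null, inflated);
}
```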
No ZIP64 Support
A ZIP64 file will probably cause undefined behavior.
Data Descriptors Are Ignored
There may or may not be Data Descriptor sections in a zip file.
This library provides no support for finding or interpreting them.
Archive Extra Data Record Is Ignored
There may or may not be an Archive Extra Data Record section in a zip file.
This library provides no support for finding or interpreting it.
No Language Encoding Flag Support
Zip files officially support charset encodings other than CP437 and UTF-8,
but the zip file spec does not specify how it works.
This library makes no attempt to interpret them; file names are decoded only as UTF-8 or CP437, as described under fileName.