exiftool-vendored
Blazing-fast, cross-platform node access to ExifTool.
Features
- High performance via -stay_open and multithreading. 7-300x faster than competing packages
- Support for Mac, Linux, and Windows.
- Proper extraction of
- Robust type definitions of the top 99.5% of tags used by over 3,000 different camera makes and models
- Auditable ExifTool source code (the "vendored" code is verifiable)
- Automated updates to ExifTool (as new versions come out monthly)
- Robust test suite, performed with node v6+ on Linux, Mac, & Windows.
Installation
npm install --save exiftool-vendored
The vendored version of ExifTool relevant for your platform will be installed via platform-dependent-modules. You shouldn't add either exiftool-vendored.exe or exiftool-vendored.pl as a direct dependency of your project.
Usage
import { exiftool, Tags } from "exiftool-vendored"
// read() resolves with the tags parsed from the file
exiftool.read("path/to/file.jpg").then((tags: Tags) => {
  console.log(`Make: ${tags.Make}, Model: ${tags.Model}`)
})
Performance
With the npm run mktags target, more than 3,000 sample images, and maxProcs set to 4, reading tags on my laptop takes ~6 ms per image:
Read 2236 unique tags from 3011 files.
Parsing took 16191ms (5.4ms / file)
Parsing took 27141ms (9.0ms / file)
Parsing took 12545ms (4.2ms / file)
For reference, using the exiftool npm package (which doesn't work on Windows) took 85 seconds (almost 7x the time):
Reading 3011 files...
Parsing took 85654ms (28.4ms / file)
This package is so much faster because it reuses ExifTool child processes and can spread requests across more than one child process.
stay_open
Starting the Perl version of ExifTool is expensive, and starting the Windows version is even more so.
On Windows, a distribution of Perl and the ~1000 files that make up ExifTool are extracted into a temporary directory for every invocation. Windows virus scanners that wedge reads on these files until they've been determined to be safe make this approach even more costly.
Using -stay_open we can reuse a single instance of ExifTool across all requests, which drops response latency dramatically.
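As a rough illustration of the protocol behind -stay_open: ExifTool is started once with `-stay_open True -@ -`, reads one argument per line from stdin, runs the batch when it sees `-execute`, and prints `{ready}` once the batch is done. The sketch below only builds the stdin text and splits the output. The helper names `buildCommand`, `shutdownCommand`, and `splitResponses` are hypothetical, and this is not necessarily how this package implements it.

```typescript
// Illustrative only: helper names are hypothetical, not this package's API.

// Build the text written to ExifTool's stdin for one request.
// ExifTool reads one argument per line and runs the batch on "-execute".
function buildCommand(args: string[]): string {
  return args.concat("-execute").join("\n") + "\n"
}

// Text that asks a long-lived ExifTool process to exit.
const shutdownCommand = "-stay_open\nFalse\n"

// Split accumulated stdout into finished responses plus any trailing
// partial output. ExifTool prints "{ready}" after each executed batch.
function splitResponses(stdout: string): { complete: string[]; pending: string } {
  const parts = stdout.split(/\{ready\}\r?\n/)
  const pending = parts.pop() ?? ""
  return { complete: parts.map((p) => p.trim()), pending }
}

// The process itself would be started once, for example with
//   spawn("exiftool", ["-stay_open", "True", "-@", "-"])
// and every later request would reuse its stdin and stdout.
console.log(JSON.stringify(buildCommand(["-json", "path/to/file.jpg"])))
console.log(JSON.stringify(shutdownCommand))
```

Because the child process stays alive between requests, the startup cost described above is paid once rather than per file.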
Parallelism
The exiftool singleton is configured with a maxProcs of 1; no more than 1 child process of ExifTool will be spawned, even if there are many read requests outstanding.
If you want higher throughput, instantiate your own ExifTool instance with a higher maxProcs. Note that exceeding your CPU count won't increase throughput, and that each child process consumes between 10 and 50 MB of RAM.
You may want to call .end() on your singleton reference when your script terminates. This gracefully shuts down all child processes.
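To make the maxProcs semantics concrete, here is a minimal sketch of a bounded pool: at most maxProcs tasks run at once, and the rest wait in a queue. `BoundedPool` is a hypothetical name used for illustration. It stands in for the idea, not for this package's internals, and the tasks here are timers rather than ExifTool child processes.

```typescript
// Illustrative only: `BoundedPool` is hypothetical, not this package's API.
class BoundedPool {
  private running = 0
  private readonly queue: Array<() => void> = []

  constructor(readonly maxProcs: number) {}

  // Schedule a task; it starts as soon as one of the maxProcs slots is free.
  run<T>(task: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const start = () => {
        this.running++
        Promise.resolve()
          .then(task)
          .then(resolve, reject)
          .then(() => {
            // Free the slot and hand it to the next queued task, if any.
            this.running--
            const next = this.queue.shift()
            if (next !== undefined) next()
          })
      }
      if (this.running < this.maxProcs) start()
      else this.queue.push(start)
    })
  }
}

// Usage: five fake "reads" with at most two in flight at any time.
const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms))
const pool = new BoundedPool(2)
let active = 0
let peak = 0
const jobs = [1, 2, 3, 4, 5].map((n) =>
  pool.run(async () => {
    active++
    peak = Math.max(peak, active)
    await sleep(10)
    active--
    return n * 2
  })
)
Promise.all(jobs).then((results) => console.log(results, "peak concurrency:", peak))
```

With this package, the equivalent is presumably passing maxProcs to the ExifTool constructor (the changelog mentions a maxProcs ctor param); check the typings for the exact signature.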
Dates
Generally, EXIF tags encode dates and times with no timezone offset. Presumably the time is captured in local time, but this means parsing the same file in different parts of the world results in a different absolute timestamp for the same file.
Rather than returning a Date which always includes a timezone, this library returns classes that encode the date, the time of day, or both, with an optional tzoffset. It's up to you, then, to do what's right.
In many cases, though, a tzoffset can be determined, either by the composite TimeZone tag, or by looking at the difference between the local DateTimeOriginal and GPSDateTime tags. GPSDateTime is present in most smartphone images.
If a tzoffset can be determined, it is encoded in all related ExifDateTime tags for those files.
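The DateTimeOriginal versus GPSDateTime comparison can be sketched as follows. This is a simplified illustration, assuming GPSDateTime is in UTC and that real offsets fall on 15-minute boundaries; the `WallClock` type and both function names are hypothetical and are not this package's API.

```typescript
// Illustrative only: names are hypothetical, not this package's API.

// A wall-clock timestamp with no timezone attached.
interface WallClock {
  year: number
  month: number // 1-12
  day: number
  hour: number
  minute: number
  second: number
}

// Treat the fields as if they were UTC so two wall clocks can be subtracted.
function asUtcMillis(t: WallClock): number {
  return Date.UTC(t.year, t.month - 1, t.day, t.hour, t.minute, t.second)
}

// Offset in minutes = local capture time minus GPS (UTC) time, rounded to
// the nearest 15 minutes to absorb small differences between the two clocks.
// Returns undefined when the result is not a plausible timezone offset.
function inferTzOffsetMinutes(local: WallClock, gpsUtc: WallClock): number | undefined {
  const diffMinutes = (asUtcMillis(local) - asUtcMillis(gpsUtc)) / 60000
  const rounded = Math.round(diffMinutes / 15) * 15
  return Math.abs(rounded) <= 14 * 60 ? rounded : undefined
}

// Example: a photo taken at 09:05:25 local time with a GPS fix at 17:05:27 UTC.
const localTime: WallClock = { year: 2016, month: 12, day: 13, hour: 9, minute: 5, second: 25 }
const gpsTime: WallClock = { year: 2016, month: 12, day: 13, hour: 17, minute: 5, second: 27 }
console.log(inferTzOffsetMinutes(localTime, gpsTime))
```

Rounding matters because the camera clock and the GPS fix rarely agree to the second; the plausibility check guards against files whose camera clock was simply set wrong.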
Tags
Official EXIF tag names are PascalCased, like AFPointSelected and ISO. ("Fixing" the field names to be camelCase would result in ungainly aFPointSelected and iSO atrocities.)
The tags.ts file is autogenerated by parsing through images of more than 3,000 different camera makes and models taken from the ExifTool site. It groups tags, their type, frequency, and example values such that your IDE can autocomplete.
Note that tag existence and type are not guaranteed. If parsing fails (for example, of a datetime string), the raw string will be returned. Consuming code should verify both existence and type as reasonable for safety.
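One way to do that verification is with small type guards, sketched below. `LooseTags` is a hypothetical stand-in for the generated Tags interface, with every field typed as unknown to reflect that neither existence nor type is guaranteed at runtime; the guard and helper names are also made up for this example.

```typescript
// Illustrative only: `LooseTags` and the helpers are hypothetical; the real
// `Tags` interface is generated in tags.ts.
interface LooseTags {
  Make?: unknown
  Model?: unknown
  ISO?: unknown
}

// True only for strings that contain something other than whitespace.
function isNonBlankString(v: unknown): v is string {
  return typeof v === "string" && v.trim().length > 0
}

// True only for real, finite numbers (not NaN, not numeric strings).
function isFiniteNumber(v: unknown): v is number {
  return typeof v === "number" && isFinite(v)
}

// Build a label, falling back whenever a tag is missing or has the wrong type.
function describeCamera(tags: LooseTags): string {
  const make = isNonBlankString(tags.Make) ? tags.Make : "unknown make"
  const model = isNonBlankString(tags.Model) ? tags.Model : "unknown model"
  const iso = isFiniteNumber(tags.ISO) ? `ISO ${tags.ISO}` : "ISO n/a"
  return `${make} ${model}, ${iso}`
}

console.log(describeCamera({ Make: "Canon", Model: "EOS 5D", ISO: 100 }))
console.log(describeCamera({ Make: 42, ISO: "100" }))
```

The same pattern applies to date tags: a value that is still a plain string is one whose parsing failed, so check for that before treating it as a date/time object.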
Versioning
I wanted to include ExifTool's version number explicitly in this package's version number, but npm requires strict compliance with SemVer. Given that ExifTool sometimes includes patch releases, there aren't always enough spots to encode both an API version and the ExifTool version.
Given those constraints, version numbers use the following scheme:
$API.$UPDATE.$PATCH
- Breaking API changes to this package will increment API.
- Any bugfix or new release of ExifTool will increment UPDATE.
- Metadata changes or trivial bugfixes will increment PATCH.
Note that the platform dependent modules use the ExifTool version with an optional patch release.
v1.1.0
- Added toString() for all date/time types
v1.0.0
- Added typings reference in the package.json
- Upgraded vendored exiftool to 10.33
v0.4.0
- Fixed packaging (maintain jsdocs in .d.ts)
- Using np for packaging
v0.3.0
- Multithreading support with the maxProcs ctor param
- Added tests for reading images with truncated or missing EXIF headers
- Added tests for timezone offset extraction and rendering
- Subsecond resolution from the Google Pixel has 8 significant digits(!!), added support for that.
v0.2.0
- More rigorous TimeZone extraction from assets, and added the ExifTimeZoneOffset to handle the TimeZone composite tag
- Added support for millisecond timestamps
v0.1.1
Initial Release. Packages ExifTool v10.31.