Streaming Shapefile Parser
In Node:
var shapefile = require("shapefile");

shapefile.open("example.shp")
  .then(source => source.read()
    .then(function log(result) {
      if (result.done) return;
      console.log(result.value);
      return source.read().then(log);
    }))
  .catch(error => console.error(error.stack));
In a browser:
<!DOCTYPE html>
<script src="https://unpkg.com/shapefile@0.5"></script>
<script>

shapefile.open("https://cdn.rawgit.com/mbostock/shapefile/master/test/points.shp")
  .then(source => source.read()
    .then(function log(result) {
      if (result.done) return;
      console.log(result.value);
      return source.read().then(log);
    }))
  .catch(error => console.error(error.stack));

</script>
In a terminal:
shp2json example.shp
For a live example, see bl.ocks.org/2dd741099154a4da55a7db31fd96a892. See also ndjson-cli for examples of manipulating GeoJSON using newline-delimited JSON streams.
This parser implementation is based on the ESRI Shapefile Technical Description and dBASE Table File Format. Caveat emptor: this is a work in progress and does not currently support all shapefile geometry types. It only supports dBASE III and has little error checking. Please contribute if you want to help!
In-browser parsing of dBASE table files requires TextDecoder, part of the Encoding living standard, which is not supported in IE or Safari as of September, 2016. See text-encoding for a browser polyfill.
TypeScript definitions are available in DefinitelyTyped: typings install dt~shapefile.
API Reference
# shapefile.open(shp[, dbf[, options]])
Returns a promise that yields an open shapefile source.
If typeof shp is “string”, opens the shapefile at the specified shp path. If shp does not have a “.shp” extension, it is implicitly added. If shp instanceof ArrayBuffer or shp instanceof Uint8Array, reads the specified in-memory shapefile. Otherwise, shp must be a Node readable stream in Node or a WhatWG standard readable stream in browsers.
If typeof dbf is “string”, opens the dBASE file at the specified dbf path. If dbf does not have a “.dbf” extension, it is implicitly added. If dbf instanceof ArrayBuffer or dbf instanceof Uint8Array, reads the specified in-memory dBASE file. If dbf is undefined and shp is a string, then dbf defaults to shp with the “.shp” extension replaced with “.dbf”; in this case, no error is thrown if there is no dBASE file at the resulting implied dbf. If dbf is undefined and shp is not a string, or if dbf is null, then no dBASE file is read, and the resulting GeoJSON features will have empty properties. Otherwise, dbf must be a Node readable stream in Node or a WhatWG standard readable stream in browsers.
If typeof shp or dbf is “string”, in Node, the files are read from the file system; in browsers, the files are read using streaming fetch, if available, and falling back to XMLHttpRequest. See path-source for more.
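The defaulting rules for string arguments can be summarized in a few lines. The following is a minimal sketch of the behavior described above, not the library's own code; resolvePaths is a hypothetical helper name, and case-sensitive extension matching is an assumption.

```javascript
// Hypothetical helper (not part of the shapefile API): applies the documented
// extension and defaulting rules when shp is a string.
function resolvePaths(shp, dbf) {
  // Append ".shp" if the extension is missing (case-sensitive: an assumption).
  if (!/\.shp$/.test(shp)) shp += ".shp";

  if (dbf === undefined) {
    // No dbf given: default to the shp path with ".shp" replaced by ".dbf".
    dbf = shp.replace(/\.shp$/, ".dbf");
  } else if (typeof dbf === "string" && !/\.dbf$/.test(dbf)) {
    // Explicit dbf path without an extension: append ".dbf".
    dbf += ".dbf";
  }
  // dbf === null is passed through unchanged: no dBASE file is read.
  return {shp, dbf};
}
```

For instance, resolvePaths("example") gives the paths example.shp and example.dbf, while resolvePaths("example.shp", null) keeps dbf as null.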
The following options are supported:
encoding - the dBASE character encoding; defaults to “windows-1252”
highWaterMark - in Node, the size of the stream’s internal buffer; defaults to 65536
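To illustrate what the encoding option affects, here is a rough sketch of decoding one fixed-width dBASE character field with TextDecoder. The helper name decodeField is hypothetical, and trimming the space padding is an assumption about how such fields are usually handled, not a description of the library's internals.

```javascript
// Hypothetical helper: decode the raw bytes of one dBASE character field.
// The default mirrors the documented default encoding, "windows-1252".
function decodeField(bytes, encoding = "windows-1252") {
  const text = new TextDecoder(encoding).decode(bytes);
  // Character fields are fixed width and padded with spaces; drop the padding.
  return text.trim();
}

// The bytes for "Café" followed by two padding spaces (0xE9 is "é" in windows-1252).
const field = new Uint8Array([0x43, 0x61, 0x66, 0xe9, 0x20, 0x20]);
console.log(decodeField(field));
```

If accented characters come out wrong in the resulting properties, the dBASE file was probably written in a different encoding than the one passed to the parser.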
# shapefile.read(shp[, dbf[, options]])
Returns a promise that yields a GeoJSON feature collection for the specified shapefile shp and dBASE table file dbf. The meaning of the arguments is the same as for shapefile.open. This is a convenience API for reading an entire shapefile in one go; use this method if you don’t mind putting the whole shapefile in memory. The yielded collection has a bbox property representing the bounding box of all records in this shapefile. The bounding box is specified as [xmin, ymin, xmax, ymax], where x and y represent longitude and latitude in spherical coordinates.
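To make the relationship with the streaming API concrete, the sketch below builds an equivalent collection from an open source. It is an illustration under the contract documented here (read() and bbox), not necessarily how the library implements shapefile.read; toFeatureCollection is a hypothetical name.

```javascript
// Hypothetical helper: drain an open source into a GeoJSON feature collection.
// Assumes source.read() resolves to {done, value} and source.bbox is an array.
async function toFeatureCollection(source) {
  const features = [];
  let result;
  // Keep reading until the source reports that the stream has ended.
  while (!(result = await source.read()).done) {
    features.push(result.value);
  }
  return {type: "FeatureCollection", features, bbox: source.bbox};
}
```

With the real library this would be called as shapefile.open("example.shp").then(toFeatureCollection). Every feature is held in memory at once, which is the trade-off noted above.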
The coordinate reference system of the feature collection is not specified. This library does not support parsing coordinate reference system specifications (.prj). Proj4js can parse most well-known text (WKT) specifications, but I’m not aware of any pure-JavaScript libraries that can convert these to OGC CRS URNs. (Please let me know if one exists!)
# source.bbox
The shapefile’s bounding box [xmin, ymin, xmax, ymax], where x and y represent longitude and latitude in spherical coordinates.
# source.read()
Returns a Promise for the next record from the underlying stream. The yielded result is an object with the following properties:
value - a GeoJSON feature, or undefined if the stream ended
done - a boolean which is true if the stream ended
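Because read() follows a {done, value} contract, it can be wrapped for use with for await. The following is a small sketch assuming only that contract; records is a hypothetical helper, not part of the library.

```javascript
// Hypothetical helper: expose a source as an async iterator of features.
// Works with any object whose read() resolves to {done, value}.
async function* records(source) {
  while (true) {
    const result = await source.read();
    if (result.done) return; // stream ended; value is undefined here
    yield result.value;
  }
}
```

Inside an async function, a source obtained from shapefile.open could then be consumed with: for await (const feature of records(source)) console.log(feature);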
# source.cancel()
Returns a Promise which is resolved when the underlying stream has been destroyed.
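A typical reason to cancel is that only the first few records are needed. The sketch below reads at most n features and then cancels the source; takeFirst is a hypothetical helper that assumes only the read() and cancel() methods documented here.

```javascript
// Hypothetical helper: read up to n features, then release the stream.
async function takeFirst(source, n) {
  const features = [];
  while (features.length < n) {
    const result = await source.read();
    if (result.done) return features; // stream ended before n features were read
    features.push(result.value);
  }
  // Enough features collected: destroy the underlying stream.
  await source.cancel();
  return features;
}
```

For example, shapefile.open("example.shp").then(source => takeFirst(source, 10)) would yield a promise for the first ten features without reading the rest of the file.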
Command Line Reference
# shp2json [options…] [file]
Converts the specified shapefile file to GeoJSON. If file is not specified, defaults to reading from stdin (with no dBASE file). For example, to convert to a feature collection:
shp2json example.shp
To convert to a geometry collection:
shp2json -g example.shp
To convert to newline-delimited features:
shp2json -n example.shp
To convert to newline-delimited geometries:
shp2json -ng example.shp
When neither --geometry nor --ignore-properties is used, the shapefile is joined to the dBASE table file (.dbf) corresponding to the specified shapefile file, if any.
# shp2json -h
# shp2json --help
Output usage information.
# shp2json -V
# shp2json --version
Output the version number.
# shp2json -o file
# shp2json --out file
Specify the output file name. Defaults to “-” for stdout.
# shp2json -n
# shp2json --newline-delimited
Output newline-delimited JSON, with one feature or geometry per line.
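Newline-delimited output is easy to consume because each line is a complete JSON document. As a minimal sketch, the hypothetical helper below parses such text once it has been read into a string; the sample input is made up for illustration and is not real shp2json output.

```javascript
// Hypothetical helper: parse newline-delimited JSON text into an array,
// skipping blank lines such as a trailing newline.
function parseNdjson(text) {
  return text
    .split("\n")
    .filter(line => line.trim() !== "")
    .map(line => JSON.parse(line));
}

// Illustrative input: two point features, one per line.
const sample =
  '{"type":"Feature","properties":{},"geometry":{"type":"Point","coordinates":[1,2]}}\n' +
  '{"type":"Feature","properties":{},"geometry":{"type":"Point","coordinates":[3,4]}}\n';

console.log(parseNdjson(sample).length); // 2
```

For large files, processing one line at a time instead of splitting the whole string would keep memory use low.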
# shp2json -g
# shp2json --geometry
Output a geometry collection instead of a feature collection or, in conjunction with --newline-delimited, geometry objects instead of feature objects. Implies --ignore-properties.
# shp2json --ignore-properties
Ignore the corresponding dBASE table file (.dbf), if any. Output features will have an empty properties object.
# shp2json --encoding encoding
Specify the dBASE table file character encoding. Defaults to “windows-1252”.
# shp2json --crs-name name
Specify the coordinate reference system name. This only applies when generating a feature collection; it is ignored when -n or -g is used. Per the GeoJSON specification, the name should be an OGC CRS URN such as urn:ogc:def:crs:OGC:1.3:CRS84. However, legacy identifiers such as EPSG:4326 may also be used.
This does not convert between coordinate reference systems! It merely outputs coordinate reference system metadata. This library does not support parsing coordinate reference system specifications (.prj).
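As an illustration of what such metadata looks like, the sketch below attaches a named CRS member in the form defined by the 2008 GeoJSON specification. That shp2json emits exactly this shape is an assumption here, and withCrsName is a hypothetical helper; coordinates are left untouched.

```javascript
// Hypothetical helper: return a copy of a feature collection with a named
// CRS member (2008 GeoJSON specification form). No coordinates are converted.
function withCrsName(collection, name) {
  return {
    ...collection,
    crs: {type: "name", properties: {name}}
  };
}

const tagged = withCrsName(
  {type: "FeatureCollection", features: []},
  "urn:ogc:def:crs:OGC:1.3:CRS84"
);
console.log(JSON.stringify(tagged.crs));
```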