Streaming Shapefile Parser
In Node:
var shapefile = require("shapefile");

shapefile.open("example.shp")
  .then(source => source.read()
    .then(function log(result) {
      if (result.done) return;
      console.log(result.value);
      return source.read().then(log);
    }))
  .catch(error => console.error(error.stack));
In a browser:
<!DOCTYPE html>
<script src="https://unpkg.com/shapefile@0.6"></script>
<script>
shapefile.open("https://cdn.rawgit.com/mbostock/shapefile/master/test/points.shp")
  .then(source => source.read()
    .then(function log(result) {
      if (result.done) return;
      console.log(result.value);
      return source.read().then(log);
    }))
  .catch(error => console.error(error.stack));
</script>
In a terminal:
shp2json example.shp
For a live example, see bl.ocks.org/2dd741099154a4da55a7db31fd96a892. See also ndjson-cli for examples of manipulating GeoJSON using newline-delimited JSON streams. See Command-Line Cartography for a longer introduction.
This parser implementation is based on the ESRI Shapefile Technical Description, dBASE Table for ESRI Shapefile (DBF) and Data File Header Structure for the dBASE Version 7 Table File. Caveat emptor: this is a work in progress and does not currently support all shapefile geometry types. It only supports dBASE III and has little error checking. Please contribute if you want to help!
In-browser parsing of dBASE table files requires TextDecoder, part of the Encoding living standard, which is not supported in IE or Safari as of September, 2016. See text-encoding for a browser polyfill.
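To make the TextDecoder dependency concrete, here is a minimal sketch of decoding dBASE-style bytes. It assumes the encoding name is used as a TextDecoder label; decodeBytes is a hypothetical helper for illustration, not part of this library.

```javascript
// Hypothetical helper: decode raw bytes with a named character encoding.
// Assumes a global TextDecoder (modern browsers, recent Node with full ICU).
function decodeBytes(bytes, encoding = "windows-1252") {
  if (typeof TextDecoder === "undefined") {
    throw new Error("TextDecoder is not available; a polyfill such as text-encoding is needed");
  }
  return new TextDecoder(encoding).decode(bytes);
}

// The bytes 0x63 0x61 0x66 0xE9 spell "café" in windows-1252.
const sample = decodeBytes(new Uint8Array([0x63, 0x61, 0x66, 0xe9]));
```

If the check throws in your target browser, loading a polyfill before the parser is one way to proceed.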
TypeScript definitions are available in DefinitelyTyped: typings install dt~shapefile.
API Reference
# shapefile.read(shp[, dbf[, options]]) <>
Returns a promise that yields a GeoJSON feature collection for the specified shapefile shp and dBASE table file dbf. The arguments have the same meaning as in shapefile.open. This is a convenience API for reading an entire shapefile in one go; use this method if you don’t mind holding the whole shapefile in memory. The yielded collection has a bbox property representing the bounding box of all records in the shapefile. The bounding box is specified as [xmin, ymin, xmax, ymax], where x and y represent longitude and latitude in spherical coordinates.
The coordinate reference system of the feature collection is not specified. This library does not support parsing coordinate reference system specifications (.prj); see Proj4js for parsing well-known text (WKT) specifications.
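As a rough sketch of what reading "in one go" amounts to, the following drains a source into a feature collection. collectFeatures is a hypothetical helper, not part of the library's API; it relies only on the source.read and source.bbox behavior described under Sources.

```javascript
// Hypothetical helper: drain a feature source into a GeoJSON feature collection.
// `source` is any object exposing read() and bbox as described under Sources.
async function collectFeatures(source) {
  const features = [];
  for (;;) {
    const result = await source.read();
    if (result.done) break; // the stream ended; no more records
    features.push(result.value);
  }
  return {type: "FeatureCollection", features, bbox: source.bbox};
}
```

With the library loaded, this could be used as shapefile.open("example.shp").then(collectFeatures). Every feature is held in memory at once, so the streaming API is preferable for very large files.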
# shapefile.open(shp[, dbf[, options]]) <>
Returns a promise that yields a GeoJSON Feature source.
If typeof shp is “string”, opens the shapefile at the specified shp path. If shp does not have a “.shp” extension, it is implicitly added. If shp instanceof ArrayBuffer or shp instanceof Uint8Array, reads the specified in-memory shapefile. Otherwise, shp must be a Node readable stream in Node or a WhatWG standard readable stream in browsers.
If typeof dbf is “string”, opens the dBASE file at the specified dbf path. If dbf does not have a “.dbf” extension, it is implicitly added. If dbf instanceof ArrayBuffer or dbf instanceof Uint8Array, reads the specified in-memory dBASE file. If dbf is undefined and shp is a string, then dbf defaults to shp with the “.shp” extension replaced with “.dbf”; in this case, no error is thrown if there is no dBASE file at the resulting implied dbf. If dbf is undefined and shp is not a string, or if dbf is null, then no dBASE file is read, and the resulting GeoJSON features will have empty properties. Otherwise, dbf must be a Node readable stream in Node or a WhatWG standard readable stream in browsers.
If typeof shp or dbf is “string”, in Node, the files are read from the file system; in browsers, the files are read using streaming fetch, if available, and falling back to XMLHttpRequest. See path-source for more.
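The path defaulting rules above can be illustrated with a small function. This is only a sketch of the documented behavior for string arguments, not the library's internals; impliedPaths is a hypothetical name.

```javascript
// Illustration only: mirrors the documented defaulting rule for string paths.
function impliedPaths(shp) {
  // A missing ".shp" extension is added implicitly.
  const shpPath = /\.shp$/.test(shp) ? shp : shp + ".shp";
  // When dbf is undefined, the implied dBASE path swaps ".shp" for ".dbf".
  const dbfPath = shpPath.slice(0, -4) + ".dbf";
  return {shp: shpPath, dbf: dbfPath};
}
```

So opening "example" or "example.shp" with no dbf argument would look for both example.shp and example.dbf. Passing null for dbf skips the dBASE file entirely.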
The following options are supported:
encoding - the dBASE character encoding; defaults to “windows-1252”
highWaterMark - in Node, the size of the stream’s internal buffer; defaults to 65536
# shapefile.openShp(shp[, options]) <>
Returns a promise that yields a GeoJSON geometry source. Unlike shapefile.open, this only reads the shapefile, and never the associated dBASE file. Subsequent calls to source.read will yield GeoJSON geometries.
If typeof shp is “string”, opens the shapefile at the specified shp path. If shp does not have a “.shp” extension, it is implicitly added. In Node, the files are read from the file system; in browsers, the files are read using streaming fetch, if available, and falling back to XMLHttpRequest. (See path-source for more.) If shp instanceof ArrayBuffer or shp instanceof Uint8Array, reads the specified in-memory shapefile. Otherwise, shp must be a Node readable stream in Node or a WhatWG standard readable stream in browsers.
The following options are supported:
highWaterMark - in Node, the size of the stream’s internal buffer; defaults to 65536
# shapefile.openDbf(dbf[, options]) <>
Returns a promise that yields a GeoJSON properties object source. Unlike shapefile.open, this only reads the dBASE file, and never the associated shapefile. Subsequent calls to source.read will yield GeoJSON properties objects.
If typeof dbf is “string”, opens the dBASE file at the specified dbf path. If dbf does not have a “.dbf” extension, it is implicitly added. In Node, the files are read from the file system; in browsers, the files are read using streaming fetch, if available, and falling back to XMLHttpRequest. (See path-source for more.) If dbf instanceof ArrayBuffer or dbf instanceof Uint8Array, reads the specified in-memory dBASE file. Otherwise, dbf must be a Node readable stream in Node or a WhatWG standard readable stream in browsers.
The following options are supported:
encoding - the dBASE character encoding; defaults to “windows-1252”
highWaterMark - in Node, the size of the stream’s internal buffer; defaults to 65536
Sources
Calling shapefile.open yields a source; you can then call source.read to read individual GeoJSON features. Similarly, shapefile.openShp yields a source of GeoJSON geometries, and shapefile.openDbf yields a source of GeoJSON properties objects.
# source.bbox
The shapefile’s bounding box [xmin, ymin, xmax, ymax], where x and y represent longitude and latitude in spherical coordinates. This field is only defined on sources returned by shapefile.open and shapefile.openShp, not shapefile.openDbf.
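As a small illustration of the [xmin, ymin, xmax, ymax] layout, the sketch below tests whether a point falls within a bounding box. bboxContains is a hypothetical helper, not something the library provides.

```javascript
// Hypothetical helper: test whether [x, y] lies within [xmin, ymin, xmax, ymax].
// Edges are treated as inside the box.
function bboxContains(bbox, point) {
  const [xmin, ymin, xmax, ymax] = bbox;
  const [x, y] = point;
  return xmin <= x && x <= xmax && ymin <= y && y <= ymax;
}
```

Given a source from shapefile.open, this could be called as bboxContains(source.bbox, [longitude, latitude]).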
# source.read() <>
Returns a Promise for the next record from the underlying stream. The yielded result is an object with the following properties:
value - a JSON object, or undefined if the stream ended
done - a boolean which is true if the stream ended
The type of JSON object depends on the type of source: it may be either a GeoJSON feature, a GeoJSON geometry, or a GeoJSON properties object (any JSON object).
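If you prefer iteration over recursive promise chains, the read protocol above can be wrapped in an async generator. This is a sketch under the assumption that your runtime supports async iteration; records is a hypothetical adapter, not part of the library.

```javascript
// Hypothetical adapter: expose any source with read() as an async iterable.
async function* records(source) {
  for (;;) {
    const result = await source.read();
    if (result.done) return; // the stream ended
    yield result.value;      // a feature, geometry, or properties object
  }
}

// Usage, inside an async function:
//   for await (const record of records(source)) console.log(record);
```

The adapter works for feature, geometry, and properties sources alike, since it depends only on the shape of the read result.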
# source.cancel() <>
Returns a Promise which is resolved when the underlying stream has been destroyed.
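One situation where source.cancel is useful is stopping early, for example when previewing the first few records of a large file. The following is a minimal sketch; take is a hypothetical helper, and it assumes only the read and cancel methods described above.

```javascript
// Hypothetical helper: read at most n records, then release the underlying stream.
async function take(source, n) {
  const values = [];
  let ended = false;
  try {
    while (values.length < n) {
      const result = await source.read();
      if (result.done) { ended = true; break; }
      values.push(result.value);
    }
  } finally {
    // Cancel only if we stopped before the stream ended (early exit or error).
    if (!ended) await source.cancel();
  }
  return values;
}
```

With the library loaded, shapefile.open("example.shp").then(source => take(source, 10)) would yield up to ten features without reading the rest of the file.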
Command Line Reference
shp2json
# shp2json [options…] [file] <>
Converts the specified shapefile file to GeoJSON. If file is not specified, defaults to reading from stdin (with no dBASE file). For example, to convert to a feature collection:
shp2json example.shp
To convert to a geometry collection:
shp2json -g example.shp
To convert to newline-delimited features:
shp2json -n example.shp
To convert to newline-delimited geometries:
shp2json -ng example.shp
When neither --geometry nor --ignore-properties is used, the shapefile is joined to the dBASE table file (.dbf) corresponding to the specified shapefile file, if any.
# shp2json -h
# shp2json --help
Output usage information.
# shp2json -V
# shp2json --version
Output the version number.
# shp2json -o file
# shp2json --out file
Specify the output file name. Defaults to “-” for stdout.
# shp2json -n
# shp2json --newline-delimited
Output newline-delimited JSON, with one feature or geometry per line.
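Newline-delimited output can be consumed one line at a time, since each line is a complete JSON value. As a sketch, the hypothetical helper below parses such text once it has been read into a string; for large outputs a line-by-line stream reader would be more appropriate.

```javascript
// Hypothetical helper: parse newline-delimited JSON text into an array of objects.
function parseNewlineDelimited(text) {
  return text
    .split("\n")
    .filter(line => line.trim() !== "") // skip blank lines, such as a trailing newline
    .map(line => JSON.parse(line));
}
```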
# shp2json -g
# shp2json --geometry
Output a geometry collection instead of a feature collection or, in conjunction with --newline-delimited, geometries instead of feature objects. Implies --ignore-properties.
# shp2json --ignore-properties
Ignore the corresponding dBASE table file (.dbf), if any. Output features will have an empty properties object.
# shp2json --encoding encoding
Specify the dBASE table file character encoding. Defaults to “windows-1252”.
# shp2json --crs-name name
Specify the coordinate reference system name. This only applies when generating a feature collection; it is ignored when -n or -g is used. Per the GeoJSON specification, the name should be an OGC CRS URN such as urn:ogc:def:crs:OGC:1.3:CRS84. However, legacy identifiers such as EPSG:4326 may also be used.
This does not convert between coordinate reference systems! It merely outputs coordinate reference system metadata. This library does not support parsing coordinate reference system specifications (.prj).
dbf2json
# dbf2json [options…] [file] <>
Converts the specified dBASE file to JSON. If file is not specified, defaults to reading from stdin. For example:
dbf2json example.dbf
To convert to newline-delimited objects:
dbf2json -n example.dbf
# dbf2json -h
# dbf2json --help
Output usage information.
# dbf2json -V
# dbf2json --version
Output the version number.
# dbf2json -o file
# dbf2json --out file
Specify the output file name. Defaults to “-” for stdout.
# dbf2json -n
# dbf2json --newline-delimited
Output newline-delimited JSON, with one object per line.
# dbf2json --encoding encoding
Specify the input character encoding. Defaults to “windows-1252”.