csvgeocode
Advanced tools
Comparing version
{ | ||
"name": "csvgeocode", | ||
"version": "1.0.0", | ||
"version": "2.0.0", | ||
"description": "Bulk geocode addresses in a CSV.", | ||
@@ -32,5 +32,5 @@ "main": "index.js", | ||
"dependencies": { | ||
"csv-parse": "0.0.6", | ||
"csv-stringify": "0.0.6", | ||
"dsv": "0.0.4", | ||
"extend": "^2.0.0", | ||
"mustache": "^1.1.0", | ||
"queue-async": "^1.0.7", | ||
@@ -37,0 +37,0 @@ "request": "^2.53.0", |
108
README.md
@@ -6,4 +6,6 @@ csvgeocode | ||
The defaults are configured to use [Google's geocoder](https://developers.google.com/maps/documentation/geocoding/) but it can be configured to work with any other similar geocoding service. Make sure that you use this in compliance with the relevant API's terms of service (Google's terms are [here](https://developers.google.com/maps/terms) and require any geocoding to be in conjunction with display on a Google Map). | ||
The defaults are configured for [Google's geocoder](https://developers.google.com/maps/documentation/geocoding/) but it can be configured to work with any other similar geocoding service. There are built-in response handlers for [Google](https://developers.google.com/maps/documentation/geocoding/), [Mapbox](https://www.mapbox.com/developers/api/geocoding/), and [Texas A & M's](http://geoservices.tamu.edu/Services/Geocode/WebService/) geocoders (details below). | ||
Make sure that you use this in compliance with the relevant API's terms of service. | ||
## Basic command line usage | ||
@@ -20,3 +22,3 @@ | ||
``` | ||
$ csvgeocode path/to/input.csv path/to/output.csv | ||
$ csvgeocode path/to/input.csv path/to/output.csv --url "https://maps.googleapis.com/maps/api/geocode/json?address={{MY_ADDRESS_COLUMN_NAME}}&key=MY_API_KEY" | ||
``` | ||
@@ -27,3 +29,3 @@ | ||
``` | ||
$ csvgeocode path/to/input.csv | grep "greppin for somethin" | ||
$ csvgeocode path/to/input.csv [options] | grep "greppin for somethin" | ||
``` | ||
@@ -36,25 +38,32 @@ | ||
``` | ||
$ csvgeocode input.csv output.csv --address MY_ADDRESS_COLUMN_HAS_THIS_WEIRD_NAME --delay 1000 --handler "mapbox" --key MY_API_KEY --verbose | ||
$ csvgeocode input.csv output.csv --url "http://someurl.com/" --lat CALL_MY_LATITUDE_COLUMN_THIS_SPECIAL_NAME --delay 1000 --verbose | ||
``` | ||
None of the options are required. | ||
The only required option is `url`. All others are optional. | ||
#### `--handler [handler]` | ||
#### `--url [url]` (REQUIRED) | ||
What API handler to use. Current built-in handlers are "google" and "mapbox". Contributions of handlers for other geocoders welcome! You can define a custom handler when using this as a Node module (see below). | ||
A URL template with column names as [Mustache tags](http://mustache.github.io/), like: | ||
**Default:** `"google"` | ||
``` | ||
http://api.tiles.mapbox.com/v4/geocode/mapbox.places/{{address}}.json?access_token=MY_API_KEY | ||
#### `--key [key]` | ||
https://maps.googleapis.com/maps/api/geocode/json?address={{address}}&key=MY_API_KEY | ||
The API key to use with requests. | ||
http://geoservices.tamu.edu/Services/Geocode/WebService/GeocoderWebServiceHttpNonParsed_V04_01.aspx?apiKey=MY_API_KEY&version=4.01&streetAddress={{address}}&city={{city}}&state={{state}} | ||
``` | ||
**Default:** none | ||
#### `--handler [handler]` | ||
#### `--address [address column name]` | ||
What handler function to process the API response with. Current built-in handlers are `"google"`, `"mapbox"`, and `"tamu"`. Contributions of handlers for other geocoders are welcome! You can define a custom handler when using this as a Node module (see below). | ||
The name of the column that contains the address to geocode. | ||
Examples: | ||
``` | ||
$ csvgeocode input.csv --url "http://api.tiles.mapbox.com/v4/geocode/mapbox.places/{{MY_ADDRESS_COLUMN_NAME}}.json?access_token=123ABC" --handler mapbox | ||
**Default:** Tries to automatically detect if there's a relevant column name in the input CSV, like `address` or `street_address`. | ||
$ csvgeocode input.csv --url "http://geoservices.tamu.edu/Services/Geocode/WebService/GeocoderWebServiceHttpNonParsed_V04_01.aspx?version=4.01&streetAddress={{ADDR}}&city={{CITY}}&state={{STATE}}&apiKey=123ABC" --handler tamu | ||
``` | ||
**Default:** `"google"` | ||
#### `--lat [latitude column name]` | ||
@@ -64,3 +73,3 @@ | ||
**Default:** Tries to automatically detect if there is a relevant column name in the input CSV, like `lat` or `latitude`. If none is found, it will add the new column `lat` to the output. | ||
**Default:** Tries to automatically detect if there is a relevant existing column name in the input CSV, like `lat` or `latitude`. If none is found, it will use `lat`. | ||
@@ -71,3 +80,3 @@ #### `--lng [longitude column name]` | ||
**Default:** Tries to automatically detect if there is a relevant column name in the input CSV, like `lng` or `longitude`. If none is found, it will add the new column `lng` to the output. | ||
**Default:** Tries to automatically detect if there is a relevant existing column name in the input CSV, like `lng` or `longitude`. If none is found, it will use `lng`. | ||
@@ -89,8 +98,12 @@ #### `--delay [milliseconds]` | ||
``` | ||
$ csvgeocode input.csv --verbose | ||
... | ||
160 Varick St, New York NY: SUCCESS | ||
1600 Pennsylvania Ave, Washington, DC: SUCCESS | ||
123 Fictional St: NO MATCH | ||
$ csvgeocode input.csv --url "MY_API_URL" --verbose | ||
160 Varick St,New York,NY | ||
SUCCESS | ||
1600 Pennsylvania Ave,Washington,DC | ||
SUCCESS | ||
123 Fictional St,Noncity,XY | ||
NO MATCH | ||
Rows geocoded: 2 | ||
@@ -115,6 +128,10 @@ Rows failed: 1 | ||
//stream to stdout | ||
csvgeocode("path/to/input.csv"); | ||
csvgeocode("path/to/input.csv",{ | ||
url: "MY_API_URL" | ||
}); | ||
//write to a file | ||
csvgeocode("path/to/input.csv","path/to/output.csv"); | ||
csvgeocode("path/to/input.csv","path/to/output.csv",{ | ||
url: "MY_API_URL" | ||
}); | ||
``` | ||
@@ -126,3 +143,3 @@ | ||
var options = { | ||
"address": "MY_SPECIAL_ADDRESS_COLUMN_NAME", | ||
"url": "MY_API_URL", | ||
"lat": "MY_SPECIAL_LATITUDE_COLUMN_NAME", | ||
@@ -147,3 +164,3 @@ "lng": "MY_SPECIAL_LONGITUDE_COLUMN_NAME", | ||
```js | ||
csvgeocode("input.csv") | ||
csvgeocode("input.csv",options) | ||
.on("row",function(err,row){ | ||
@@ -170,3 +187,3 @@ if (err) { | ||
```js | ||
csvgeocoder("input.csv") | ||
csvgeocoder("input.csv",options) | ||
.on("complete",function(summary){ | ||
@@ -190,24 +207,25 @@ /* | ||
A handler needs a `url` function and a `process` function, like: | ||
The handler function is passed the body of an API response and should either return a string error message or an object with `lat` and `lng` properties. | ||
```js | ||
var customHandler = { | ||
url: function(address,options) { | ||
return "http://mygeocoder.com/?address="+encodeURIComponent(address)+"&api_key="+options.key; | ||
}, | ||
process: function(body) { | ||
if (body.results) { | ||
return body.results[0]; | ||
} else { | ||
return "NO MATCH"; | ||
} | ||
csvgeocoder("input.csv",{ | ||
url: "MY_API_URL", | ||
handler: customHandler | ||
}); | ||
function customHandler(body) { | ||
//success, return a lat/lng | ||
if (body.result) { | ||
return { | ||
lat: body.result.lat, | ||
lng: body.result.lng | ||
}; | ||
} | ||
//failure, return a string | ||
return "NO MATCH"; | ||
} | ||
``` | ||
The `url` function will get passed the address being geocoded and the current options and should return the URL to request for that address. | ||
The `process` function will be passed the body of the geocoder response. It should return a string error message if there's no lat/lng to use, or it should return an object with `lat` and `lng` properties. | ||
## Some Alternatives | ||
@@ -221,6 +239,4 @@ | ||
* Add the NYC and TAMU geocoders as built-in handlers. | ||
* Support a CSV with no header row where `lat`, `lng`, and `address` are numerical indices instead of column names. | ||
* Allow `address` to be an array of multiple fields that get concatenated (e.g. `["street","city","state","zip"]`) | ||
* Support the `handler` option for CLI too? | ||
* Add the NYC geocoder as a built-in handler. | ||
* Support a CSV with no header row where `lat` and `lng` are numerical indices instead of column names. | ||
* Support both POST and GET requests somehow. | ||
@@ -227,0 +243,0 @@ |
var fs = require("fs"), | ||
parse = require("csv-parse"), | ||
stringify = require("csv-stringify"); | ||
csv = require("dsv")(","); | ||
@@ -13,3 +12,3 @@ module.exports = { | ||
_parse(raw,cb); | ||
cb(csv.parse(raw)); | ||
@@ -19,37 +18,11 @@ }); | ||
write: function(filename,rows,cb) { | ||
_stringify(rows,function(string){ | ||
fs.writeFile(filename,string,function(err){ | ||
if (err) { | ||
throw new Error(err); | ||
}; | ||
cb(); | ||
}); | ||
fs.writeFile(filename,csv.format(rows),function(err){ | ||
if (err) { | ||
throw new Error(err); | ||
}; | ||
cb(); | ||
}); | ||
}, | ||
parse: _parse, | ||
stringify: _stringify | ||
}; | ||
function _parse(raw,cb) { | ||
parse(raw,{columns:true},function(err,parsed){ | ||
if (err) { | ||
throw new Error(err); | ||
} | ||
cb(parsed); | ||
}); | ||
} | ||
function _stringify(rows,cb) { | ||
stringify(rows,{header:true},function(err,string){ | ||
if (err) { | ||
throw new Error(err); | ||
} | ||
cb(string); | ||
}); | ||
} | ||
parse: csv.parse, | ||
stringify: csv.format | ||
}; |
@@ -9,2 +9,3 @@ var misc = require("./misc"), | ||
util = require("util"), | ||
render = require("mustache").render, | ||
csv = require("./csv"), | ||
@@ -42,3 +43,3 @@ EventEmitter = require("events").EventEmitter; | ||
} | ||
} else if (typeof options.handler.process !== "function" || typeof options.handler.url !== "function") { | ||
} else if (typeof options.handler !== "function") { | ||
throw new TypeError("Invalid value for 'handler' option. Must be the name of a built-in handler or a custom handler."); | ||
@@ -51,2 +52,6 @@ } | ||
if (typeof options.url !== "string") { | ||
throw new Error("'url' parameter is required."); | ||
} | ||
var geocoder = new Geocoder(); | ||
@@ -81,10 +86,6 @@ | ||
//try to discover them on the first data row | ||
if (options.lat === null || options.lng === null || options.address === null) { | ||
if (options.lat === null || options.lng === null) { | ||
options = misc.discoverOptions(options,parsed[0]); | ||
if (options.address === null) { | ||
throw new Error("Couldn't auto-detect address column."); | ||
} | ||
} | ||
@@ -102,5 +103,3 @@ | ||
if (row[options.address] === undefined) { | ||
throw new Error("Couldn't find address column '"+options.address+"'"); | ||
} | ||
var url = render(options.url,escape(row)); | ||
@@ -114,6 +113,6 @@ //Doesn't need geocoding | ||
//Address is cached from a previous result | ||
if (cache[row[options.address]]) { | ||
if (cache[url]) { | ||
row[options.lat] = cache[row[options.address]].lat; | ||
row[options.lng] = cache[row[options.address]].lng; | ||
row[options.lat] = cache[url].lat; | ||
row[options.lng] = cache[url].lng; | ||
@@ -125,3 +124,3 @@ _this.emit("row",null,row); | ||
request.get(options.handler.url(row[options.address],options),function(err,response,body) { | ||
request.get(url,function(err,response,body) { | ||
@@ -141,3 +140,3 @@ //Some other error | ||
handleResponse(body,row,cb); | ||
handleResponse(body,row,url,cb); | ||
@@ -151,3 +150,3 @@ } | ||
function handleResponse(body,row,cb) { | ||
function handleResponse(body,row,url,cb) { | ||
@@ -157,3 +156,3 @@ var result; | ||
try { | ||
result = options.handler.process(body); | ||
result = options.handler(body); | ||
} catch (e) { | ||
@@ -178,3 +177,3 @@ _this.emit("row","Parsing error: "+e.toString(),row); | ||
//Cache the result | ||
cache[row[options.address]] = result; | ||
cache[url] = result; | ||
_this.emit("row",null,row); | ||
@@ -244,3 +243,13 @@ | ||
function escape(row) { | ||
var escaped = extend({},row); | ||
for (var key in escaped) { | ||
escaped[key] = escaped[key].replace(/ /g,"+").replace(/[&]/g,"%26") | ||
} | ||
return escaped; | ||
} | ||
}; | ||
module.exports = { | ||
lat: null, | ||
lng: null, | ||
address: null, | ||
key: "", | ||
delay: 250, | ||
@@ -7,0 +5,0 @@ force: false, |
@@ -0,51 +1,62 @@ | ||
var csv = require("dsv")(","); | ||
module.exports = { | ||
google: { | ||
url: function(address,options) { | ||
return "https://maps.googleapis.com/maps/api/geocode/json?address=" | ||
+ address.replace(/ /g,"+").replace(/[&]/g,"%26") | ||
+ (options.key ? "&key=" + options.key : ""); | ||
}, | ||
process: function(body) { | ||
google: function(body) { | ||
var response = JSON.parse(body); | ||
var response = JSON.parse(body); | ||
//Success, return a lat/lng object | ||
if (response.results && response.results.length) { | ||
return response.results[0].geometry.location; | ||
} | ||
//Success, return a lat/lng object | ||
if (response.results && response.results.length) { | ||
return response.results[0].geometry.location; | ||
} | ||
//No match, return a string | ||
if (response.status === "ZERO_RESULTS" || response.status === "OK") { | ||
return "NO MATCH"; | ||
} | ||
//No match, return a string | ||
if (response.status === "ZERO_RESULTS" || response.status === "OK") { | ||
return "NO MATCH"; | ||
} | ||
//Other error, return a string | ||
return response.status; | ||
//Other error, return a string | ||
return response.status; | ||
}, | ||
mapbox: function(body) { | ||
var response = JSON.parse(body); | ||
if (response.features === undefined) { | ||
return response.message; | ||
} else if (!response.features.length) { | ||
return "NO MATCH"; | ||
} | ||
return { | ||
lat: response.features[0].center[1], | ||
lng: response.features[0].center[0] | ||
}; | ||
}, | ||
mapbox: { | ||
url: function(address,options){ | ||
return "http://api.tiles.mapbox.com/v4/geocode/mapbox.places/" | ||
+ address.replace(/ /g,"+").replace(/[&]/g,"%26") | ||
+ ".json?access_token=" | ||
+ (options.key || ""); | ||
}, | ||
process: function(body) { | ||
tamu: function(body) { | ||
var response = JSON.parse(body); | ||
var parsed; | ||
if (response.features === undefined) { | ||
return response.message; | ||
} else if (!response.features.length) { | ||
return "NO MATCH"; | ||
} | ||
try { | ||
parsed = csv.parseRows(body); | ||
} catch(e) { | ||
return "ERROR PARSING RESPONSE: "+body; | ||
} | ||
return { | ||
lat: response.features[0].center[1], | ||
lng: response.features[0].center[0] | ||
}; | ||
if (parsed[0].length < 5) { | ||
return "UNEXPECTED RESPONSE FORMAT FROM TAMU GEOCODER: "+csv.formatRows([parsed[0]]); | ||
} | ||
} | ||
if (!parsed.length || +parsed[0][2] !== 200 || !+parsed[0][3] || !+parsed[0][4]) { | ||
return "NO MATCH"; | ||
} | ||
return { | ||
lat: parsed[0][3], | ||
lng: parsed[0][4] | ||
} | ||
} | ||
}; |
@@ -18,6 +18,2 @@ module.exports = { | ||
} | ||
if (options.address === null && key.trim().match(/^(street[^a-z]*)?addr(ess)?$/i)) { | ||
options.address = key; | ||
continue; | ||
} | ||
} | ||
@@ -24,0 +20,0 @@ |
@@ -7,2 +7,3 @@ var assert = require("assert"), | ||
.defer(basicTest) | ||
.defer(requiredTest) | ||
.defer(cacheTest) | ||
@@ -12,4 +13,5 @@ .defer(columnNamesTest) | ||
.defer(handlerTest) | ||
.defer(keyTest) | ||
.defer(mapboxTest) | ||
.defer(throwTest) | ||
.defer(tamuTest) | ||
.awaitAll(function(){}); | ||
@@ -19,3 +21,4 @@ | ||
geocode("test/basic.csv",{ | ||
test: true | ||
test: true, | ||
url: process.env.TEST_URL | ||
}) | ||
@@ -47,6 +50,28 @@ .on("row",function(err,row){ | ||
function requiredTest(cb) { | ||
assert.throws( | ||
function(){ | ||
geocode("test/basic.csv",{ | ||
test: true | ||
}); | ||
}, | ||
function(err) { | ||
if (err instanceof Error && /url/i.test(err)) { | ||
return true; | ||
} | ||
}, | ||
"Expected required url message" | ||
); | ||
cb(null); | ||
} | ||
function cacheTest(cb) { | ||
geocode("test/basic.csv",{ | ||
force: true, | ||
test: true | ||
test: true, | ||
url: process.env.TEST_URL | ||
}) | ||
@@ -71,3 +96,4 @@ .on("row",function(err,row){ | ||
lng: "LANGITUDE", | ||
test: true | ||
test: true, | ||
url: process.env.TEST_URL | ||
}) | ||
@@ -92,3 +118,4 @@ .on("row",function(err,row){ | ||
geocode("test/column-names.csv",{ | ||
test: true | ||
test: true, | ||
url: process.env.TEST_URL | ||
}) | ||
@@ -98,5 +125,4 @@ .on("row",function(err,row){ | ||
assert.deepEqual(err,"NO MATCH","Expected NO MATCH in add columns test."); | ||
} else { | ||
assert("lat" in row && "lng" in row); | ||
} | ||
assert("lat" in row && "lng" in row); | ||
}) | ||
@@ -111,9 +137,5 @@ .on("complete",function(summary){ | ||
force: true, | ||
handler: { | ||
url: function(address,options){ | ||
return "http://www.yahoo.com/"; | ||
}, | ||
process: function(body) { | ||
url: process.env.TEST_URL, | ||
handler: function(body) { | ||
return "CUSTOM ERROR"; | ||
} | ||
}, | ||
@@ -137,2 +159,3 @@ test: true | ||
test: true, | ||
url: process.env.TEST_URL, | ||
handler: "dumb string" | ||
@@ -153,7 +176,7 @@ }); | ||
function mapboxTest(API_KEY,cb) { | ||
function mapboxTest(cb) { | ||
geocode("test/basic.csv",{ | ||
handler: mapboxHandler, | ||
url: "http://api.tiles.mapbox.com/v4/geocode/mapbox.places/{{a}}.json?access_token=" + API_KEY, | ||
handler: "mapbox", | ||
url: process.env.MAPBOX_TEST_URL, | ||
test: true | ||
@@ -169,2 +192,3 @@ }) | ||
.on("complete",function(summary){ | ||
assert.notDeepEqual(summary.successes,0,"Expected at least one success from Mapbox"); | ||
cb(null); | ||
@@ -175,6 +199,7 @@ }); | ||
function keyTest(cb) { | ||
function tamuTest(cb) { | ||
geocode("test/basic.csv",{ | ||
key: "INVALID_KEY", | ||
geocode("test/parts.csv",{ | ||
handler: "tamu", | ||
url: process.env.TAMU_TEST_URL, | ||
test: true | ||
@@ -184,3 +209,3 @@ }) | ||
if (err) { | ||
assert.deepEqual(err,"REQUEST_DENIED"); | ||
assert.deepEqual(err,"NO MATCH","Expected NO MATCH from tamuHandler."); | ||
} else { | ||
@@ -191,2 +216,4 @@ assert(row.lat && row.lng); | ||
.on("complete",function(summary){ | ||
assert.deepEqual(summary.successes,2,"Expected two successes from TAMU."); | ||
assert.deepEqual(summary.failures,1,"Expected one failure from TAMU."); | ||
cb(null); | ||
@@ -193,0 +220,0 @@ }); |
Sorry, the diff of this file is not supported yet
Sorry, the diff of this file is not supported yet
Environment variable access
Supply chain riskPackage accesses environment variables, which may be a sign of credential stuffing or data theft.
Found 3 instances in 1 package
24944
2.33%14
7.69%442
1.61%241
7.11%10
400%+ Added
+ Added
+ Added
+ Added
- Removed
- Removed
- Removed
- Removed