addfips - pypi Package: comparing version 0.2.2 to 0.3.0
MANIFEST.in
include src/addfips/data/*.csv
Metadata-Version: 2.1
Name: addfips
Version: 0.3.0
Summary: Add county FIPS to tabular data
Home-page: http://github.com/fitnr/addfips
Author: Neil Freeman
Author-email: contact@fakeisthenewreal.org
License: GPL-3.0
Description: # AddFIPS
AddFIPS is a tool for adding state or county FIPS codes to files that contain just the names of those geographies.
FIPS codes are the official ID numbers of places in the US. They're invaluable for matching data from different sources.
Say you have a CSV file like this:
```
state,county,statistic
IL,Cook,123
California,Los Angeles County,321
New York,Kings,137
LA,Orleans,99
Alaska,Kusilvak,12
```
AddFIPS lets you do this:
```
> addfips --state-field=state --county-field=county input.csv
fips,state,county,statistic
17031,IL,Cook,123
06037,California,Los Angeles County,321
36047,New York,Kings,137
22071,LA,Orleans,99
02270,Alaska,Kusilvak,12
```
## Installing
AddFIPS is a Python package compatible with Python 3.
If you've used Python packages before:
```
pip install addfips
# or
pip install --user addfips
```
If you haven't used Python packages before, [get pip](http://pip.readthedocs.org/en/stable/installing/), then come back.
You can also clone the repo and install with `python setup.py install`.
## Features
* Use full names or postal abbreviations for states
* Works with all states, territories, and the District of Columbia
* Slightly fuzzy matching allows for missing diacritic marks and different name formats ("Nye County" or "Nye", "Saint Louis" or "St. Louis", "Prince George's" or "Prince Georges")
* Includes up-to-date 2015 geographies (shout out to Kusilvak Census Area, AK, and Oglala Lakota Co., SD)
Note that some states have counties and county-equivalent independent cities with the same names (e.g. Baltimore city & County, MD, Richmond city & County, VA). AddFIPS may pick the wrong geography if just the name ("Baltimore") is passed.
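The fuzzy matching described above can be approximated with a few string normalizations. The following is a minimal, self-contained sketch and not the package's own code: `normalize_county`, `SUFFIX` and `ABBREVS` are hypothetical names, the suffix and abbreviation tables are shortened, and it strips accents with `unicodedata` rather than with an explicit replacement table.

```python
import re
import unicodedata

# Hypothetical, shortened tables for illustration only.
SUFFIX = re.compile(r" (county|parish|borough|census area|city)$")
ABBREVS = {"st. ": "saint ", "ft. ": "fort "}


def normalize_county(name):
    """Reduce a county name to a bare, unaccented, lowercase key."""
    key = name.lower().replace("'", "")
    # Decompose accented characters, then drop the combining marks.
    key = unicodedata.normalize("NFKD", key)
    key = "".join(ch for ch in key if not unicodedata.combining(ch))
    # Expand a leading abbreviation such as "st. ".
    for short, full in ABBREVS.items():
        if key.startswith(short):
            key = full + key[len(short):]
    # Remove a trailing geography type such as " county".
    return SUFFIX.sub("", key)
```

With a key like this, "Nye County" and "Nye" compare equal, as do "St. Louis" and "Saint Louis". It also shows why "Baltimore city" and "Baltimore County" collide once the geography type is removed.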
## Command line tool
````
usage: addfips [-h] [-V] [-d CHAR] (-s FIELD | -n NAME) [-c FIELD]
               [-v VINTAGE] [--no-header]
               [input]

Add FIPS codes to a CSV with state and/or county names

positional arguments:
  input                 Input file. default: stdin

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  -d CHAR, --delimiter CHAR
                        field delimiter. default: ,
  -s FIELD, --state-field FIELD
                        Read state name or FIPS code from this field
  -n NAME, --state-name NAME
                        Use this state for all rows
  -c FIELD, --county-field FIELD
                        Read county name from this field. If blank, only state
                        FIPS code will be added
  -v VINTAGE, --vintage VINTAGE
                        2000, 2010, or 2015. default: 2015
  --no-header           Input has no header row, interpret fields as integers
  -u, --err-unmatched   Print rows that addfips cannot match to stderr
````
Options and flags:
* `input`: (positional argument) The name of the file. If blank, `addfips` reads from stdin.
* `--delimiter`: Field delimiter, defaults to ','.
* `--state-field`: Name of the field containing state name
* `--state-name`: Name, postal abbreviation or state FIPS code to use for all rows.
* `--county-field`: Name of the field containing county name. If this is blank, the output will contain the two-character state FIPS code.
* `--vintage`: Use earlier county names and FIPS codes. For instance, Clifton Forge city, VA, is not included in 2010 or later vintages.
* `--no-header`: Indicates that the input file has no header. `--state-field` and `--county-field` are parsed as field indices.
* `--err-unmatched`: Rows that `addfips` cannot match will be printed to stderr, rather than stdout
The output is a CSV with a new column, "fips", added as the first column. When `addfips` cannot make a match, the fips column will have an empty value.
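To make the output shape concrete, here is a small sketch that mimics it with the standard `csv` module. It does not call `addfips`: the `LOOKUP` table and the sample rows are invented for illustration, and a real run would use the package's county data instead.

```python
import csv
import io

# Hypothetical lookup table standing in for the real FIPS data.
LOOKUP = {("il", "cook"): "17031", ("la", "orleans"): "22071"}

source = io.StringIO("state,county,statistic\nIL,Cook,123\nIL,Nowhere,5\n")
reader = csv.DictReader(source)

out = io.StringIO()
# The new column goes first, followed by the original columns.
writer = csv.DictWriter(out, ["fips"] + reader.fieldnames, lineterminator="\n")
writer.writeheader()
for row in reader:
    key = (row["state"].lower(), row["county"].lower())
    # A missing match is stored as None, which csv writes as an empty field.
    row["fips"] = LOOKUP.get(key)
    writer.writerow(row)

result = out.getvalue()
print(result)
```

The unmatched row keeps its original fields and simply starts with an empty value, so downstream tools can filter on an empty first column.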
### Examples
Add state FIPS codes:
````
addfips data.csv --state-field fieldName > data_with_fips.csv
````
Add state and county FIPS codes:
````
addfips data.csv --state-field fieldName --county-field countyName > data_with_fips.csv
````
For files with no header row, use a number to refer to the columns with state and/or county names:
```
addfips --no-header --state-field 1 --county-field 2 data_no_header.csv > data_no_header_fips.csv
```
Column numbers are one-indexed.
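In Python terms, a one-indexed column number has to be shifted down by one before it can be used as a list index. A minimal sketch, where `to_index` is a hypothetical helper and the range check is an addition for illustration:

```python
def to_index(field_number):
    """Convert a one-indexed column number, as typed on the command line, to a list index."""
    index = int(field_number) - 1
    if index < 0:
        raise ValueError("column numbers start at 1")
    return index


row = ["New York", "Kings", "137"]
state = row[to_index("1")]   # first column
county = row[to_index("2")]  # second column
```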
Add FIPS codes for counties from a specific state. These are equivalent:
```
addfips ny_data.csv -c county --state-name NY > ny_data_fips.csv
addfips ny_data.csv -c county --state-name 'New York' > ny_data_fips.csv
addfips ny_data.csv -c county --state-name 36 > ny_data_fips.csv
```
Use an alternate delimiter:
```
addfips -d'|' -s state pipe_delimited.dsv > result.csv
addfips -d';' -s state semicolon_delimited.dsv > result.csv
```
Print unmatched rows to another file:
```
addfips --err-unmatched -s state state_data.csv > state_data_fips.csv 2> state_unmatched.csv
addfips -u -s STATE -c COUNTY county_data.csv > county_data_fips.csv 2> county_unmatched.csv
```
Pipe from other programs:
````
curl http://example.com/data.csv | addfips -s stateFieldName -c countyField > data_with_fips.csv
csvcut -c state,county,important huge_file.csv | addfips -s state -c county > small_file.csv
````
Pipe to other programs. In files with extensive text, filtering with the FIPS code is safer than using county names, which may be common words (e.g. cook):
````
addfips culinary_data.csv -s stateFieldName -c countyField | grep -e "^17031" > culinary_data_cook_county.csv
addfips -s StateName -c CountyName data.csv | csvsort -c fips > sorted_by_fips.csv
````
## API
AddFIPS is available for use in your Python scripts:
````python
>>> import addfips
>>> af = addfips.AddFIPS()
>>> af.get_state_fips('Puerto Rico')
'72'
>>> af.get_county_fips('Nye', state='Nevada')
'32023'
>>> row = {'county': 'Cook County', 'state': 'IL'}
>>> af.add_county_fips(row, county_field="county", state_field="state")
{'county': 'Cook County', 'state': 'IL', 'fips': '17031'}
````
The results of `AddFIPS.get_state_fips` and `AddFIPS.get_county_fips` are strings, since FIPS codes may have leading zeros.
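A short illustration of why the codes are kept as strings, using only built-in functions. The padding step is a general repair technique for codes that were already read as numbers, not something the package does for you:

```python
fips = "06037"  # Los Angeles County, California

# Converting to an integer silently drops the leading zero.
as_int = int(fips)
broken = str(as_int)

# Zero-padding restores the code: two digits for a state, five for a county.
restored = str(as_int).zfill(5)
state_part = restored[:2]
```

Spreadsheet programs and data-frame libraries often perform the integer conversion implicitly, so it is worth reading FIPS columns as text from the start.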
### Classes
#### AddFIPS(vintage=None)
The AddFIPS class takes one keyword argument, `vintage`, which may be either `2000`, `2010` or `2015`. Any other value will use the most recent vintage. Other vintages may be added in the future.
__get_state_fips(self, state)__
Returns two-digit FIPS code based on a state name or postal code.
__get_county_fips(self, county, state)__
Returns five-digit FIPS code based on county name and state name/abbreviation/FIPS.
__add_state_fips(self, row, state_field='state')__
Returns the input row with a two-digit state FIPS code added.
Input row may be either a `dict` or a `list`. If a `dict`, the 'fips' key is added. If a `list`, the FIPS code is added at the start of the list.
__add_county_fips(self, row, county_field='county', state_field='state', state=None)__
Returns the input row with a five-digit county FIPS code added.
Input row may be either a `dict` or a `list`. If a `dict`, the 'fips' key is added. If a `list`, the FIPS code is added at the start of the list.
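The two row shapes can be sketched as follows. `attach_fips` is a hypothetical stand-in, written with an explicit `isinstance` check for readability; it only illustrates the documented behavior and does not look anything up.

```python
def attach_fips(row, fips):
    """Add a FIPS code to a dict row (as a key) or a list row (as the first item)."""
    if isinstance(row, dict):
        row["fips"] = fips
    else:
        row.insert(0, fips)
    return row


as_dict = attach_fips({"county": "Cook County", "state": "IL"}, "17031")
as_list = attach_fips(["IL", "Cook County"], "17031")
```

In this sketch the row is modified in place and also returned, so callers that need the original row untouched should pass a copy.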
### License
Distributed under the GNU General Public License, version 3. See LICENSE for more information.
Keywords: csv census usa data
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Natural Language :: English
Classifier: Operating System :: Unix
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6.0
Description-Content-Type: text/markdown
[tool.black]
line-length = 120
target-version = ["py38"]
skip-string-normalization = true
[tool.isort]
line_length = 120
[tool.pylint.master]
fail-under = "9.5"
[tool.pylint.format]
max-line-length = 120
[egg_info]
tag_build =
tag_date = 0
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# This file is part of addfips.
# http://github.com/fitnr/addfips
# Licensed under the GPL-v3.0 license:
# http://opensource.org/licenses/GPL-3.0
# Copyright (c) 2016, Neil Freeman <contact@fakeisthenewreal.org>
from setuptools import setup

# Use the README as the long description; fall back to an empty string if it is missing.
try:
    with open('README.md') as f:
        README = f.read()
except OSError:
    README = ''

setup(
    name='addfips',
    version='0.3.0',
    description='Add county FIPS to tabular data',
    long_description=README,
    long_description_content_type='text/markdown',
    keywords='csv census usa data',
    author='Neil Freeman',
    author_email='contact@fakeisthenewreal.org',
    url='http://github.com/fitnr/addfips',
    license='GPL-3.0',
    classifiers=[
        'Development Status :: 4 - Beta',
        'Intended Audience :: Developers',
        'License :: OSI Approved :: GNU General Public License v3 (GPLv3)',
        'Natural Language :: English',
        'Operating System :: Unix',
        'Programming Language :: Python :: 3.7',
        'Programming Language :: Python :: Implementation :: PyPy',
        'Operating System :: OS Independent',
    ],
    packages=['addfips'],
    package_dir={'': 'src'},
    package_data={'addfips': ['data/*.csv']},
    include_package_data=True,
    entry_points={
        'console_scripts': [
            'addfips=addfips.__main__:main',
        ],
    },
    zip_safe=False,
    test_suite='tests',
    install_requires=[
        "importlib_resources>=2.0.1"
    ],
    python_requires='>=3.6.0',
)
[console_scripts]
addfips = addfips.__main__:main


importlib_resources>=2.0.1
MANIFEST.in
README.md
pyproject.toml
setup.py
src/addfips/__init__.py
src/addfips/__main__.py
src/addfips/addfips.py
src/addfips.egg-info/PKG-INFO
src/addfips.egg-info/SOURCES.txt
src/addfips.egg-info/dependency_links.txt
src/addfips.egg-info/entry_points.txt
src/addfips.egg-info/not-zip-safe
src/addfips.egg-info/requires.txt
src/addfips.egg-info/top_level.txt
src/addfips/data/counties_2000.csv
src/addfips/data/counties_2010.csv
src/addfips/data/counties_2015.csv
src/addfips/data/counties_2020.csv
src/addfips/data/states.csv
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# This file is part of addfips.
# http://github.com/fitnr/addfips
# Licensed under the GPL-v3.0 license:
# http://opensource.org/licenses/GPL-3.0
# Copyright (c) 2016, fitnr <fitnr@fakeisthenewreal>
"""
Add FIPS codes to lists and files that contain the names of US state and counties.
"""
from .addfips import AddFIPS
__version__ = '0.3.0'
__all__ = ['addfips']
#!/usr/bin/env python
# This file is part of addfips.
# http://github.com/fitnr/addfips
# Licensed under the GPL-v3.0 license:
# http://opensource.org/licenses/GPL-3.0
# Copyright (c) 2016, fitnr <fitnr@fakeisthenewreal>
"""Add FIPS codes to a CSV with state and/or county names."""
import argparse
import csv
import sys
from signal import SIG_DFL, SIGPIPE, signal

from . import __version__ as version
from .addfips import AddFIPS


def unmatched(result):
    """Check if fips is defined in a result row."""
    try:
        if result['fips'] is None:
            return True
    except TypeError:
        if result[0] is None:
            return True
    return False


def main():
    """Add FIPS codes to a CSV with state and/or county names."""
    parser = argparse.ArgumentParser(description="Add FIPS codes to a CSV with state and/or county names")
    parser.add_argument('-V', '--version', action='version', version='%(prog)s ' + version)
    parser.add_argument('input', nargs='?', help='Input file. default: stdin')
    parser.add_argument('-d', '--delimiter', metavar='CHAR', type=str, help='field delimiter. default: ,')
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument(
        '-s', '--state-field', metavar='FIELD', type=str, help='Read state name or FIPS code from this field'
    )
    group.add_argument('-n', '--state-name', metavar='NAME', type=str, help='Use this state for all rows')
    parser.add_argument(
        '-c',
        '--county-field',
        metavar='FIELD',
        type=str,
        help='Read county name from this field. If blank, only state FIPS code will be added',
    )
    parser.add_argument('-v', '--vintage', type=int, help='2000, 2010, or 2015. default: 2015')
    parser.add_argument(
        '--no-header', action='store_false', dest='header', help='Input has no header row, interpret fields as integers'
    )
    parser.add_argument(
        '-u', '--err-unmatched', action='store_true', help='Print rows that addfips cannot match to stderr'
    )
    parser.set_defaults(delimiter=',', input='/dev/stdin')
    args = parser.parse_args()

    addfips = AddFIPS(args.vintage)

    kwargs = {
        # This may be None, and that's ... OK.
        "state_field": args.state_field
    }

    # Check if we're decoding counties or states.
    if args.county_field:
        func = addfips.add_county_fips
        kwargs["county_field"] = args.county_field
        if args.state_name:
            kwargs["state"] = args.state_name
    else:
        func = addfips.add_state_fips

    with open(args.input, 'rt') as f:
        # Exit quietly when the downstream end of a pipe closes early.
        signal(SIGPIPE, SIG_DFL)
        if args.header:
            # Read the header, write a header.
            reader = csv.DictReader(f, delimiter=args.delimiter)
            fields = ['fips'] + reader.fieldnames
            writer = csv.DictWriter(sys.stdout, fields)
            writer.writeheader()
            if args.err_unmatched:
                error = csv.DictWriter(sys.stderr, fields)
        else:
            # Don't read a header, don't write a header.
            # Field arguments are one-indexed column numbers; convert them to list indices.
            # state_field is None when --state-name is used, so only convert it when set.
            if kwargs['state_field'] is not None:
                kwargs['state_field'] = int(kwargs['state_field']) - 1
            if 'county_field' in kwargs:
                kwargs['county_field'] = int(kwargs.get('county_field')) - 1
            reader = csv.reader(f, delimiter=args.delimiter)
            writer = csv.writer(sys.stdout)
            if args.err_unmatched:
                error = csv.writer(sys.stderr)

        # Write results, optionally to stderr
        for row in reader:
            result = func(row, **kwargs)
            if args.err_unmatched and unmatched(result):
                error.writerow(row)
            else:
                writer.writerow(result)


if __name__ == '__main__':
    main()
# -*- coding: utf-8 -*-
# This file is part of addfips.
# http://github.com/fitnr/addfips
# Licensed under the GPL-v3.0 license:
# http://opensource.org/licenses/GPL-3.0
# Copyright (c) 2016, fitnr <fitnr@fakeisthenewreal>
'''
Add county FIPS code to a CSV that has state and county names.
'''
import csv
import re

from importlib_resources import files

COUNTY_FILES = {
    2000: 'data/counties_2000.csv',
    2010: 'data/counties_2010.csv',
    2015: 'data/counties_2015.csv',
    # The package ships data/counties_2020.csv, so the 2020 vintage should read it.
    2020: 'data/counties_2020.csv',
}

STATES = 'data/states.csv'

COUNTY_PATTERN = r" (county|city|city and borough|borough|census area|municipio|municipality|district|parish)$"

# Characters to replace when normalizing names (the apostrophe is simply removed).
DIACRETICS = {
    r"ñ": "n",
    r"'": "",
    r"ó": "o",
    r"í": "i",
    r"á": "a",
    r"ü": "u",
    r"é": "e",
    r"î": "i",
    r"è": "e",
    r"à": "a",
    r"ì": "i",
    r"å": "a",
}

ABBREVS = {
    'ft. ': 'fort ',
    'st. ': 'saint ',
    'ste. ': 'sainte ',
}
class AddFIPS:
    """Get state or county FIPS codes"""

    default_county_field = 'county'
    default_state_field = 'state'
    data = files('addfips')

    def __init__(self, vintage=None):
        # Handle de-diacriticizing
        self.diacretic_pattern = '(' + ('|'.join(DIACRETICS)) + ')'
        self.delete_diacretics = lambda x: DIACRETICS[x.group()]

        # Fall back to the most recent vintage for missing or unknown values.
        if vintage is None or vintage not in COUNTY_FILES:
            vintage = max(COUNTY_FILES.keys())

        self._states, self._state_fips = self._load_state_data()
        self._counties = self._load_county_data(vintage)

    def _load_state_data(self):
        with self.data.joinpath(STATES).open('rt') as f:
            reader = csv.DictReader(f)
            states = {}
            state_fips = {}
            for row in reader:
                states[row['postal'].lower()] = row['fips']
                states[row['name'].lower()] = row['fips']
                state_fips[row['fips']] = row['fips']

        state_fips = frozenset(state_fips)
        return states, state_fips

    def _load_county_data(self, vintage):
        with self.data.joinpath(COUNTY_FILES[vintage]).open('rt') as f:
            counties = {}
            for row in csv.DictReader(f):
                if row['statefp'] not in counties:
                    counties[row['statefp']] = {}
                state = counties[row['statefp']]

                # Strip diacritics, remove geography name and add both to dict
                county = self._delete_diacretics(row['name'].lower())
                bare_county = re.sub(COUNTY_PATTERN, '', county)
                state[county] = state[bare_county] = row['countyfp']

                # Add both versions of abbreviated names to the dict.
                for short, full in ABBREVS.items():
                    needle, replace = None, None
                    if county.startswith(short):
                        needle, replace = short, full
                    elif county.startswith(full):
                        needle, replace = full, short

                    if needle is not None:
                        replaced = county.replace(needle, replace, 1)
                        bare_replaced = bare_county.replace(needle, replace, 1)
                        state[replaced] = state[bare_replaced] = row['countyfp']

        return counties

    def _delete_diacretics(self, string):
        return re.sub(self.diacretic_pattern, self.delete_diacretics, string)

    def get_state_fips(self, state):
        '''Get FIPS code from a state name or postal code'''
        if state is None:
            return None

        # Check if we already have a FIPS code
        if state in self._state_fips:
            return state

        return self._states.get(state.lower())

    def get_county_fips(self, county, state):
        """
        Get a county's FIPS code.
        :county str County name
        :state str Name, postal abbreviation or FIPS code for a state
        """
        state_fips = self.get_state_fips(state)
        counties = self._counties.get(state_fips, {})

        try:
            name = self._delete_diacretics(county.lower())
            return state_fips + counties.get(name)
        except TypeError:
            # Either the state or the county could not be matched.
            return None

    def add_state_fips(self, row, state_field=None):
        """
        Add state FIPS to a dictionary.
        :row dict/list A dictionary with state and county names
        :state_field str name of state name field. default: state
        """
        if state_field is None:
            state_field = self.default_state_field

        fips = self.get_state_fips(row[state_field])

        try:
            row['fips'] = fips
        except TypeError:
            # The row is a list: put the code in the first position.
            row.insert(0, fips)

        return row

    def add_county_fips(self, row, county_field=None, state_field=None, state=None):
        """
        Add county FIPS to a dictionary containing a state name, FIPS code, or using a passed state name or FIPS code.
        :row dict/list A dictionary with state and county names
        :county_field str county name field. default: county
        :state_field str state name field. default: state
        :state str State name, postal abbreviation or FIPS code to use
        """
        if state:
            state_fips = self.get_state_fips(state)
        else:
            state_fips = self.get_state_fips(row[state_field or self.default_state_field])

        if county_field is None:
            county_field = self.default_county_field

        fips = self.get_county_fips(row[county_field], state_fips)

        try:
            row['fips'] = fips
        except TypeError:
            # The row is a list: put the code in the first position.
            row.insert(0, fips)

        return row


name,postal,fips
Alabama,AL,01
Alaska,AK,02
Arizona,AZ,04
Arkansas,AR,05
California,CA,06
Colorado,CO,08
Connecticut,CT,09
Delaware,DE,10
Florida,FL,12
Georgia,GA,13
Hawaii,HI,15
Idaho,ID,16
Illinois,IL,17
Indiana,IN,18
Iowa,IA,19
Kansas,KS,20
Kentucky,KY,21
Louisiana,LA,22
Maine,ME,23
Maryland,MD,24
Massachusetts,MA,25
Michigan,MI,26
Minnesota,MN,27
Mississippi,MS,28
Missouri,MO,29
Montana,MT,30
Nebraska,NE,31
Nevada,NV,32
New Hampshire,NH,33
New Jersey,NJ,34
New Mexico,NM,35
New York,NY,36
North Carolina,NC,37
North Dakota,ND,38
Ohio,OH,39
Oklahoma,OK,40
Oregon,OR,41
Pennsylvania,PA,42
Rhode Island,RI,44
Rhode Island and Providence Plantations,RI,44
South Carolina,SC,45
South Dakota,SD,46
Tennessee,TN,47
Texas,TX,48
Utah,UT,49
Vermont,VT,50
Virginia,VA,51
Washington,WA,53
West Virginia,WV,54
Wisconsin,WI,55
Wyoming,WY,56
District of Columbia,DC,11
D.C.,DC,11
American Samoa,AS,60
Guam,GU,66
Northern Mariana Islands,MP,69
Puerto Rico,PR,72
U.S. Virgin Islands,VI,78
Virgin Islands,VI,78
United States Virgin Islands,VI,78
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# This file is part of addfips.
# http://github.com/fitnr/addfips
# Licensed under the GPL-v3.0 license:
# http://opensource.org/licenses/GPL-3.0
# Copyright (c) 2016, fitnr <fitnr@fakeisthenewreal>
from .addfips import AddFIPS
__version__ = '0.2.2'
__all__ = ['addfips']
# -*- coding: utf-8 -*-
# This file is part of addfips.
# http://github.com/fitnr/addfips
# Licensed under the GPL-v3.0 license:
# http://opensource.org/licenses/GPL-3.0
# Copyright (c) 2016, fitnr <fitnr@fakeisthenewreal>
import csv
import re
from pkg_resources import resource_filename
'''
Add county FIPS code to a CSV that has state and county names.
'''
COUNTY_FILES = {
2000: 'data/counties_2000.csv',
2010: 'data/counties_2010.csv',
2015: 'data/counties_2015.csv',
}
STATES = 'data/states.csv'
COUNTY_PATTERN = r' (county|city|city and borough|borough|census area|municipio|municipality|district|parish)$'
class AddFIPS(object):
    """Get state or county FIPS codes"""

    default_county_field = 'county'
    default_state_field = 'state'

    diacretics = {
        r"ñ": "n",
        r"'": "",
        r"ó": "o",
        r"í": "i",
        r"á": "a",
        r"ü": "u",
        r"é": "e",
        r"î": "i",
        r"è": "e",
        r"à": "a",
        r"ì": "i",
        r"å": "a",
    }

    abbrevs = {
        'ft. ': 'fort ',
        'st. ': 'saint ',
        'ste. ': 'sainte ',
    }

    def __init__(self, vintage=None):
        if vintage is None or vintage not in COUNTY_FILES:
            vintage = max(COUNTY_FILES.keys())

        # load state data
        state_csv = resource_filename('addfips', STATES)
        with open(state_csv, 'rt') as f:
            s = list(csv.DictReader(f))
            postals = dict((row['postal'].lower(), row['fips']) for row in s)
            names = dict((row['name'].lower(), row['fips']) for row in s)
            fips = dict((row['fips'], row['fips']) for row in s)
            self._state_fips = frozenset(fips)
            self._states = dict(list(postals.items()) + list(names.items()) + list(fips.items()))

        # Build the pattern and callback used to strip diacritics
        self.diacretic_pattern = '(' + ('|'.join(self.diacretics)) + ')'
        self.delete_diacretics = lambda x: self.diacretics[x.group()]

        # load county data
        county_csv = resource_filename('addfips', COUNTY_FILES[vintage])
        with open(county_csv, 'rt') as f:
            self._counties = dict()
            for row in csv.DictReader(f):
                if row['statefp'] not in self._counties:
                    self._counties[row['statefp']] = {}
                state = self._counties[row['statefp']]

                # Strip diacritics, remove geography name and add both to dict
                county = self._delete_diacretics(row['name'].lower())
                bare_county = re.sub(COUNTY_PATTERN, '', county)
                state[county] = state[bare_county] = row['countyfp']

                # Add both versions of abbreviated names to the dict.
                for short, full in self.abbrevs.items():
                    needle, replace = None, None
                    if county.startswith(short):
                        needle, replace = short, full
                    elif county.startswith(full):
                        needle, replace = full, short
                    if needle is not None:
                        replaced = county.replace(needle, replace, 1)
                        bare_replaced = bare_county.replace(needle, replace, 1)
                        state[replaced] = state[bare_replaced] = row['countyfp']

    def _delete_diacretics(self, string):
        return re.sub(self.diacretic_pattern, self.delete_diacretics, string)

    def get_state_fips(self, state):
        '''Get FIPS code from a state name or postal code'''
        if state is None:
            return None
        # Check if we already have a FIPS code
        if state in self._state_fips:
            return state
        return self._states.get(state.lower())

    def get_county_fips(self, county, state):
        '''
        Get a county's FIPS code.
        :county str County name
        :state str Name, postal abbreviation or FIPS code for a state
        '''
        state_fips = self.get_state_fips(state)
        counties = self._counties.get(state_fips, {})
        try:
            name = self._delete_diacretics(county.lower())
            return state_fips + counties.get(name)
        except TypeError:
            return None

    def add_state_fips(self, row, state_field=None):
        '''
        Add state FIPS to a dictionary or list.
        :row dict/list A dictionary or list with state and county names
        :state_field str name of state name field. default: state
        '''
        if state_field is None:
            state_field = self.default_state_field
        fips = self.get_state_fips(row[state_field])
        try:
            row['fips'] = fips
        except TypeError:
            row.insert(0, fips)
        return row

    def add_county_fips(self, row, county_field=None, state_field=None, state=None):
        '''
        Add county FIPS to a row. The state is read from a field of the row,
        unless a state name, postal abbreviation or FIPS code is passed.
        :row dict/list A dictionary or list with state and county names
        :county_field str county name field. default: county
        :state_field str state name field. default: state
        :state str State name, postal abbreviation or FIPS code to use
        '''
        if state:
            state_fips = self.get_state_fips(state)
        else:
            state_fips = self.get_state_fips(row[state_field or self.default_state_field])
        if county_field is None:
            county_field = self.default_county_field
        fips = self.get_county_fips(row[county_field], state_fips)
        try:
            row['fips'] = fips
        except TypeError:
            row.insert(0, fips)
        return row
# This file is part of addfips.
# http://github.com/fitnr/addfips
# Licensed under the GPL-v3.0 license:
# http://opensource.org/licenses/GPL-3.0
# Copyright (c) 2016, fitnr <fitnr@fakeisthenewreal>
import argparse
import csv
import sys
from signal import signal, SIGPIPE, SIG_DFL
from . import __version__ as version
from .addfips import AddFIPS
def unmatched(result):
    try:
        if result['fips'] is None:
            return True
    except TypeError:
        if result[0] is None:
            return True
    return False


def main():
    parser = argparse.ArgumentParser(description="Add FIPS codes to a CSV with state and/or county names")
    parser.add_argument('-V', '--version', action='version', version='%(prog)s ' + version)
    parser.add_argument('input', nargs='?', help='Input file. default: stdin')
    parser.add_argument('-d', '--delimiter', metavar='CHAR', type=str, help='field delimiter. default: ,')

    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument('-s', '--state-field', metavar='FIELD', type=str,
                       help='Read state name or FIPS code from this field')
    group.add_argument('-n', '--state-name', metavar='NAME', type=str, help='Use this state for all rows')

    parser.add_argument('-c', '--county-field', metavar='FIELD', type=str,
                        help='Read county name from this field. If blank, only state FIPS code will be added')
    parser.add_argument('-v', '--vintage', type=int, help='2000, 2010, or 2015. default: 2015')
    parser.add_argument('--no-header', action='store_false', dest='header',
                        help='Input has no header row, interpret fields as integers')
    parser.add_argument('-u', '--err-unmatched', action='store_true', help='Print rows that addfips cannot match to stderr')
    parser.set_defaults(delimiter=',', input='/dev/stdin')
    args = parser.parse_args()

    af = AddFIPS(args.vintage)

    kwargs = {
        # This may be None, and that's ... OK.
        "state_field": args.state_field
    }

    # Check if we're decoding counties or states.
    if args.county_field:
        func = af.add_county_fips
        kwargs["county_field"] = args.county_field
        if args.state_name:
            kwargs["state"] = args.state_name
    else:
        func = af.add_state_fips

    with open(args.input, 'rt') as f:
        signal(SIGPIPE, SIG_DFL)
        if args.header:
            # Read the header, write a header.
            reader = csv.DictReader(f, delimiter=args.delimiter)
            fields = ['fips'] + reader.fieldnames
            writer = csv.DictWriter(sys.stdout, fields)
            writer.writeheader()
            if args.err_unmatched:
                error = csv.DictWriter(sys.stderr, fields)
        else:
            # Don't read a header, don't write a header.
            # Field arguments are one-indexed column numbers.
            # state_field is None when --state-name is used, so skip it then.
            if kwargs['state_field'] is not None:
                kwargs['state_field'] = int(kwargs['state_field']) - 1
            if 'county_field' in kwargs:
                kwargs['county_field'] = int(kwargs.get('county_field')) - 1
            reader = csv.reader(f, delimiter=args.delimiter)
            writer = csv.writer(sys.stdout)
            if args.err_unmatched:
                error = csv.writer(sys.stderr)

        # Write results, optionally to stderr
        for row in reader:
            result = func(row, **kwargs)
            if args.err_unmatched and unmatched(result):
                error.writerow(row)
            else:
                writer.writerow(result)


if __name__ == '__main__':
    main()


AddFIPS
=======

AddFIPS is a tool for adding state or county FIPS codes to files that
contain just the names of those geographies.

FIPS codes are the official ID numbers of places in the US. They're
invaluable for matching data from different sources.

Say you have a CSV file like this:

::

    state,county,statistic
    IL,Cook,123
    California,Los Angeles County,321
    New York,Kings,137
    LA,Orleans,99
    Alaska,Kusilvak,12

AddFIPS lets you do this:

::

    > addfips --state-field=state --county-field=county input.csv
    fips,state,county,statistic
    17031,IL,Cook,123
    06037,California,Los Angeles County,321
    36047,New York,Kings,137
    22071,LA,Orleans,99
    02270,Alaska,Kusilvak,12
Installing
----------

AddFIPS is a Python package, compatible with Python 2.7, Python 3, and
PyPy. It has no dependencies outside of Python's standard libraries.

If you've used Python packages before:

::

    pip install addfips
    # or
    pip install --user addfips

If you haven't used Python packages before, `get
pip <http://pip.readthedocs.org/en/stable/installing/>`__, then come
back.

You can also clone the repo and install with
``python setup.py install``.
Features
--------

- Use full names or postal abbreviations for states
- Works with all states, territories, and the District of Columbia
- Slightly fuzzy matching allows for missing diacritic marks and
  different name formats ("Nye County" or "Nye", "Saint Louis" or "St.
  Louis", "Prince George's" or "Prince Georges")
- Includes up-to-date 2015 geographies (shout out to Kusilvak Census
  Area, AK, and Oglala Lakota Co., SD)

Note that some states have counties and county-equivalent independent
cities with the same names (e.g. Baltimore city & County, MD, Richmond
city & County, VA). AddFIPS may pick the wrong geography if
just the name ("Baltimore") is passed.
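
The fuzzy matching described above can be approximated with a small normalizer. This is a minimal sketch rather than the package's own code: ``normalize_county`` and ``name_variants`` are hypothetical names, and the suffix list, character map and abbreviations are shortened subsets chosen for illustration.

```python
import re

# Geography suffixes to drop (illustrative subset).
SUFFIX = re.compile(r' (county|city|borough|census area|parish)$')
# Characters to fold; the apostrophe is simply removed.
FOLD = {'ñ': 'n', 'ó': 'o', 'í': 'i', 'á': 'a', 'é': 'e', "'": ''}
ABBREVS = {'ft. ': 'fort ', 'st. ': 'saint ', 'ste. ': 'sainte '}


def normalize_county(name):
    """Lowercase, fold diacritics and apostrophes, strip the geography suffix."""
    lowered = name.lower()
    folded = ''.join(FOLD.get(ch, ch) for ch in lowered)
    return SUFFIX.sub('', folded)


def name_variants(name):
    """Return the set of lookup keys that one official name could match."""
    bare = normalize_county(name)
    variants = {bare}
    for short, full in ABBREVS.items():
        if bare.startswith(short):
            variants.add(full + bare[len(short):])
        elif bare.startswith(full):
            variants.add(short + bare[len(full):])
    return variants


print(normalize_county("Prince George's County"))  # prince georges
print(normalize_county('Doña Ana County'))          # dona ana
print(sorted(name_variants('St. Louis County')))    # ['saint louis', 'st. louis']
```

Storing every variant as a key keeps each lookup to a single dictionary access, at the cost of ambiguity when two geographies share a bare name.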
Command line tool
-----------------

::

    usage: addfips [-h] [-V] [-d CHAR] (-s FIELD | -n NAME) [-c FIELD]
                   [-v VINTAGE] [--no-header] [-u]
                   [input]

    Add FIPS codes to a CSV with state and/or county names

    positional arguments:
      input                 Input file. default: stdin

    optional arguments:
      -h, --help            show this help message and exit
      -V, --version         show program's version number and exit
      -d CHAR, --delimiter CHAR
                            field delimiter. default: ,
      -s FIELD, --state-field FIELD
                            Read state name or FIPS code from this field
      -n NAME, --state-name NAME
                            Use this state for all rows
      -c FIELD, --county-field FIELD
                            Read county name from this field. If blank, only state
                            FIPS code will be added
      -v VINTAGE, --vintage VINTAGE
                            2000, 2010, or 2015. default: 2015
      --no-header           Input has no header row, interpret fields as integers
      -u, --err-unmatched   Print rows that addfips cannot match to stderr
Options and flags:

- ``input``: (positional argument) The name of the file. If blank,
  ``addfips`` reads from stdin.
- ``--delimiter``: Field delimiter, defaults to ','.
- ``--state-field``: Name of the field containing the state name or
  state FIPS code.
- ``--state-name``: Name, postal abbreviation or state FIPS code to use
  for all rows.
- ``--county-field``: Name of the field containing county name. If this
  is blank, the output will contain the two-character state FIPS code.
- ``--vintage``: Use earlier county names and FIPS codes. For instance,
  Clifton Forge city, VA, is not included in 2010 or later vintages.
- ``--no-header``: Indicates that the input file has no header row.
  ``--state-field`` and ``--county-field`` are parsed as field indices.
- ``--err-unmatched``: Rows that ``addfips`` cannot match will be
  printed to stderr, rather than stdout.

The output is a CSV with a new column, "fips", prepended as the first
column. When ``addfips`` cannot make a match, the fips column will have
an empty value.
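
Because unmatched rows keep an empty ``fips`` value, downstream code can separate them after the fact. The following is a minimal sketch using only the standard library; the sample output is made up for illustration, and ``split_matched`` is a hypothetical helper, not part of addfips.

```python
import csv
import io

# Hypothetical addfips output: the second row could not be matched.
SAMPLE = """fips,state,county
17031,IL,Cook
,IL,Nowhere
36047,New York,Kings
"""


def split_matched(text):
    """Partition CSV rows by whether the leading fips column is filled in."""
    matched, unmatched = [], []
    for row in csv.DictReader(io.StringIO(text)):
        (matched if row['fips'] else unmatched).append(row)
    return matched, unmatched


matched, unmatched = split_matched(SAMPLE)
print(len(matched), len(unmatched))  # 2 1
```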
Examples
~~~~~~~~

Add state FIPS codes:

::

    addfips data.csv --state-field fieldName > data_with_fips.csv

Add state and county FIPS codes:

::

    addfips data.csv --state-field fieldName --county-field countyName > data_with_fips.csv

For files with no header row, use a number to refer to the columns with
state and/or county names:

::

    addfips --no-header --state-field 1 --county-field 2 data_no_header.csv > data_no_header_fips.csv

Column numbers are one-indexed.

Add FIPS codes for counties from a specific state. These are equivalent:

::

    addfips ny_data.csv -c county --state-name NY > ny_data_fips.csv
    addfips ny_data.csv -c county --state-name 'New York' > ny_data_fips.csv
    addfips ny_data.csv -c county --state-name 36 > ny_data_fips.csv

Use an alternate delimiter:

::

    addfips -d'|' -s state pipe_delimited.dsv > result.csv
    addfips -d';' -s state semicolon_delimited.dsv > result.csv

Print unmatched rows to another file:

::

    addfips --err-unmatched -s state state_data.csv > state_data_fips.csv 2> state_unmatched.csv
    addfips -u -s STATE -c COUNTY county_data.csv > county_data_fips.csv 2> county_unmatched.csv

Pipe from other programs:

::

    curl http://example.com/data.csv | addfips -s stateFieldName -c countyField > data_with_fips.csv
    csvcut -c state,county,important huge_file.csv | addfips -s state -c county > small_file.csv

Pipe to other programs. In files with extensive text, filtering with the
FIPS code is safer than using county names, which may be common words
(e.g. cook):

::

    addfips culinary_data.csv -s stateFieldName -c countyField | grep -e "^17031" > culinary_data_cook_county.csv
    addfips -s StateName -c CountyName data.csv | csvsort -c fips > sorted_by_fips.csv
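
The same filter can be applied in Python once the output has been written. This sketch assumes output shaped like the examples above; the sample rows and the ``rows_for`` helper are invented for illustration.

```python
import csv
import io

# Made-up output: both rows mention "cook", but only the first is Cook County, IL.
SAMPLE = """fips,state,county,note
17031,IL,Cook,line cook wanted
06037,California,Los Angeles,cook book sale
"""


def rows_for(fips_code, text):
    """Keep rows whose fips column equals the given five-digit code."""
    return [row for row in csv.DictReader(io.StringIO(text)) if row['fips'] == fips_code]


print([row['county'] for row in rows_for('17031', SAMPLE)])  # ['Cook']
```

Comparing the parsed ``fips`` column avoids false hits from free-text columns, which a plain text search for "cook" would include.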
API
---

AddFIPS is available for use in your Python scripts:

.. code:: python

    >>> import addfips
    >>> af = addfips.AddFIPS()
    >>> af.get_state_fips('Puerto Rico')
    '72'
    >>> af.get_county_fips('Nye', state='Nevada')
    '32023'
    >>> row = {'county': 'Cook County', 'state': 'IL'}
    >>> af.add_county_fips(row, county_field="county", state_field="state")
    {'county': 'Cook County', 'state': 'IL', 'fips': '17031'}

The results of ``AddFIPS.get_state_fips`` and
``AddFIPS.get_county_fips`` are strings, since FIPS codes may have
leading zeros.
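
The leading-zero point is easy to demonstrate without the package. The sketch below shows how an integer round trip corrupts a code such as California's ``06`` and how zero-padding repairs it; ``repair_fips`` is a hypothetical helper, not part of addfips.

```python
def repair_fips(value, width):
    """Re-pad a FIPS code that lost its leading zeros, e.g. after a spreadsheet import."""
    return str(value).zfill(width)


state = '06'                      # California, from the state table
as_number = int(state)            # 6: the leading zero is gone
print(repair_fips(as_number, 2))  # 06
print(repair_fips(6037, 5))       # 06037 (Los Angeles County)
```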
Classes
~~~~~~~

AddFIPS(vintage=None)
^^^^^^^^^^^^^^^^^^^^^

The AddFIPS class takes one keyword argument, ``vintage``, which may be
either ``2000``, ``2010`` or ``2015``. Any other value will use the most
recent vintage. Other vintages may be added in the future.

**get\_state\_fips(self, state)** Returns a two-digit FIPS code based on
a state name or postal abbreviation.

**get\_county\_fips(self, county, state)** Returns a five-digit FIPS code
based on county name and state name/abbreviation/FIPS.

**add\_state\_fips(self, row, state\_field='state')** Returns the input
row with a two-digit state FIPS code added. Input row may be either a
``dict`` or a ``list``. If a ``dict``, the 'fips' key is added. If a
``list``, the FIPS code is added at the start of the list.

**add\_county\_fips(self, row, county\_field='county',
state\_field='state', state=None)** Returns the input row with a
five-digit county FIPS code added. Input row may be either a ``dict`` or
a ``list``. If a ``dict``, the 'fips' key is added. If a ``list``, the
FIPS code is added at the start of the list.
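
One way a single method can accept both row types is to assign by key first and fall back to ``list.insert`` when that raises ``TypeError``, which is the pattern visible in the package source. A self-contained sketch of that pattern follows; ``attach_fips`` is a hypothetical stand-in, and the FIPS value is passed in rather than looked up.

```python
def attach_fips(row, fips):
    """Add a FIPS code to a dict row (as a 'fips' key) or a list row (at index 0)."""
    try:
        # Works for dicts; a list rejects a string index with TypeError.
        row['fips'] = fips
    except TypeError:
        row.insert(0, fips)
    return row


print(attach_fips({'county': 'Cook County', 'state': 'IL'}, '17031'))
# {'county': 'Cook County', 'state': 'IL', 'fips': '17031'}
print(attach_fips(['IL', 'Cook County'], '17031'))
# ['17031', 'IL', 'Cook County']
```

Note that the row is modified in place as well as returned, so callers that need the original row should copy it first.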
License
~~~~~~~
Distributed under the GNU General Public License, version 3. See LICENSE
for more information.
[console_scripts]
addfips = addfips.cli:main
Metadata-Version: 2.0
Name: addfips
Version: 0.2.2
Summary: Add county FIPS to tabular data
Home-page: http://github.com/fitnr/addfips
Author: Neil Freeman
Author-email: contact@fakeisthenewreal.org
License: GPL-3.0
Keywords: csv census usa data
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Natural Language :: English
Classifier: Operating System :: Unix
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Operating System :: OS Independent
{"generator": "bdist_wheel (0.26.0)", "summary": "Add county FIPS to tabular data", "classifiers": ["Development Status :: 4 - Beta", "Intended Audience :: Developers", "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Natural Language :: English", "Operating System :: Unix", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: Implementation :: PyPy", "Operating System :: OS Independent"], "extensions": {"python.details": {"project_urls": {"Home": "http://github.com/fitnr/addfips"}, "contacts": [{"email": "contact@fakeisthenewreal.org", "name": "Neil Freeman", "role": "author"}], "document_names": {"description": "DESCRIPTION.rst"}}, "python.exports": {"console_scripts": {"addfips": "addfips.cli:main"}}, "python.commands": {"wrap_console": {"addfips": "addfips.cli:main"}}}, "keywords": ["csv", "census", "usa", "data"], "license": "GPL-3.0", "metadata_version": "2.0", "name": "addfips", "version": "0.2.2"}
addfips/__init__.py,sha256=Vtah1t-1gY9RBXneEkVwd39JkU2U5WeQCC1qaXxhFeE,323
addfips/addfips.py,sha256=toF7SznDKUsJIakgTdDfOWvuE5JDHeJdourLNRT2F4E,5745
addfips/cli.py,sha256=liZG0sG6DPBampzUEmcyMJfIYEv1afCuz8V9vzRUt8E,3532
addfips/data/counties_2000.csv,sha256=9Np5iCe-IjtNd6FGcPaZPsOAYFm5k8vyAgkNjllVqh4,71710
addfips/data/counties_2010.csv,sha256=kAeqcC52VOEc--kkM8kgMIcX4LjnEzl-kpqbg31sUl4,71821
addfips/data/counties_2015.csv,sha256=pwt1G1x2Z3YdO3HL_C1CFHVNrPA0qV9ObMVAkj2mN2I,71885
addfips/data/states.csv,sha256=Lg7QLR6lFxQHDyUwH9WBthCFtMPTz7XTWwYFQVZ0_10,1036
addfips-0.2.2.dist-info/DESCRIPTION.rst,sha256=sneC13OcHotWXceLzIukDqPOSMQsvaJk3SgDHwvgB80,7582
addfips-0.2.2.dist-info/METADATA,sha256=C80RqBCUuxpAjLkcluGgO0fkWed-UzLxwcJsVdNnr54,8354
addfips-0.2.2.dist-info/RECORD,,
addfips-0.2.2.dist-info/WHEEL,sha256=GrqQvamwgBV4nLoJe0vhYRSWzWsx7xjlt74FT0SWYfE,110
addfips-0.2.2.dist-info/entry_points.txt,sha256=VtFu9ohRK5YeR0pkkowzFdRZuhnqi5ThC3iy5vtj9ms,46
addfips-0.2.2.dist-info/metadata.json,sha256=j08FgPN-fAP3l29Tia8zAb0ZG1qXRXyc38Qn8cWHu5A,1043
addfips-0.2.2.dist-info/top_level.txt,sha256=kZm--hklowyIyvGttMvEUEsV_cXlgc_XRThCa3kN27c,8
addfips
Wheel-Version: 1.0
Generator: bdist_wheel (0.26.0)
Root-Is-Purelib: true
Tag: py2-none-any
Tag: py3-none-any