
.. image:: https://badge.fury.io/py/csvdiff.png
    :target: http://badge.fury.io/py/csvdiff

.. image:: https://travis-ci.org/larsyencken/csvdiff.png?branch=master
    :target: https://travis-ci.org/larsyencken/csvdiff

Generate a diff between two CSV files on the command-line.
``csvdiff`` compares the semantic contents of two CSV files, ignoring differences such as row and column ordering, so that you see only what has actually changed. This is useful when comparing the output of an automated system from one day to the next.
It's also useful for maintaining patches to third-party data. Diffs generated by ``csvdiff`` are a subset of JSON and can be stored and applied using the matching ``csvpatch`` command. If upstream data changes, you can fetch the new version and re-apply your changes to it easily.
You'll first need Python and pip. Then run::

    pip install csvdiff

For example, suppose we have ``a.csv``::

    id,name,amount
    1,bob,20
    2,eva,63
    3,sarah,7
    4,jeff,19
    6,fred,10

After some changes and corrections to the data, we now have ``b.csv``::

    id,name,amount
    1,bob,23        <--- changed
    3,sarah,7
    4,jeff,19
    5,mira,81       <--- added
    6,fred,13       <--- changed

Now we can ask for a summary of differences::

    $ csvdiff --style=summary id a.csv b.csv
    1 rows removed (20.0%)
    1 rows added (20.0%)
    2 rows changed (40.0%)

Or look at the full diff, pretty-printed to make it more readable::

    $ csvdiff --style=pretty --output=diff.json id a.csv b.csv
    $ cat diff.json
    {
      "_index": [
        "id"
      ],
      "added": [
        {
          "amount": "81",
          "id": "5",
          "name": "mira"
        }
      ],
      "changed": [
        {
          "fields": {
            "amount": {
              "from": "20",
              "to": "23"
            }
          },
          "key": [
            "1"
          ]
        },
        {
          "fields": {
            "amount": {
              "from": "10",
              "to": "13"
            }
          },
          "key": [
            "6"
          ]
        }
      ],
      "removed": [
        {
          "amount": "63",
          "id": "2",
          "name": "eva"
        }
      ]
    }

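To make the structure of this diff concrete, here is a minimal Python sketch of a key-based comparison that produces the same shape for the example files. It is an illustration only, not csvdiff's actual implementation; the helper names ``load_rows`` and ``diff_rows`` are hypothetical, and it assumes a single key column.

```python
import csv
import io


def load_rows(text, key):
    """Index CSV rows by the value of the key column (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    return {row[key]: dict(row) for row in reader}


def diff_rows(old, new, key):
    """Compare two keyed row maps; row order in the files does not matter."""
    added = [new[k] for k in sorted(new.keys() - old.keys())]
    removed = [old[k] for k in sorted(old.keys() - new.keys())]
    changed = []
    for k in sorted(old.keys() & new.keys()):
        # Record only the columns whose values differ between the two rows.
        fields = {
            col: {"from": old[k][col], "to": new[k].get(col)}
            for col in old[k]
            if old[k][col] != new[k].get(col)
        }
        if fields:
            changed.append({"key": [k], "fields": fields})
    return {"_index": [key], "added": added,
            "changed": changed, "removed": removed}


A_CSV = "id,name,amount\n1,bob,20\n2,eva,63\n3,sarah,7\n4,jeff,19\n6,fred,10\n"
B_CSV = "id,name,amount\n1,bob,23\n3,sarah,7\n4,jeff,19\n5,mira,81\n6,fred,13\n"

result = diff_rows(load_rows(A_CSV, "id"), load_rows(B_CSV, "id"), "id")
```

Because rows are matched by key rather than by position, reordering the rows of either file leaves ``result`` unchanged.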
To exclude columns from the comparison, specify a comma-separated list of column names to ignore. For example::

    $ csvdiff --style=summary --ignore-columns=amount id a.csv b.csv
    1 rows removed (20.0%)
    1 rows added (20.0%)
    0 rows changed (0%)

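The effect of ignoring a column can be sketched as dropping it from every row before comparing. The snippet below is a hedged illustration rather than csvdiff's own code; ``drop_columns`` and ``count_changed`` are hypothetical names, and the rows are the example data above, restricted to the keys present in both files.

```python
def drop_columns(rows, ignored):
    """Return copies of the rows without the ignored columns."""
    return [{c: v for c, v in row.items() if c not in ignored} for row in rows]


def count_changed(old_rows, new_rows, key, ignored=()):
    """Count rows present in both inputs whose remaining columns differ."""
    old = {r[key]: r for r in drop_columns(old_rows, ignored)}
    new = {r[key]: r for r in drop_columns(new_rows, ignored)}
    return sum(1 for k in old.keys() & new.keys() if old[k] != new[k])


a_rows = [
    {"id": "1", "name": "bob", "amount": "20"},
    {"id": "3", "name": "sarah", "amount": "7"},
    {"id": "4", "name": "jeff", "amount": "19"},
    {"id": "6", "name": "fred", "amount": "10"},
]
b_rows = [
    {"id": "1", "name": "bob", "amount": "23"},
    {"id": "3", "name": "sarah", "amount": "7"},
    {"id": "4", "name": "jeff", "amount": "19"},
    {"id": "6", "name": "fred", "amount": "13"},
]

# Mirrors a comma-separated option value such as "amount" or "amount,name".
ignored = set("amount".split(","))
changed_all = count_changed(a_rows, b_rows, "id")
changed_ignoring_amount = count_changed(a_rows, b_rows, "id", ignored)
```

With every column compared, two rows count as changed; once ``amount`` is dropped, none do, which matches the summary output above.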
You can also choose to compare numeric fields only up to a certain number of significant figures. Use negative significant figures for orders of magnitude::

    $ csvdiff --style=summary id a.csv c.csv
    0 rows removed (0.0%)
    0 rows added (0.0%)
    2 rows changed (40.0%)
    $ csvdiff --style=summary id --significance=-1 a.csv c.csv
    files are identical

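One plausible reading of this option, given that negative values mean orders of magnitude, is that numeric fields are rounded the way Python's built-in ``round(value, ndigits)`` rounds them before being compared. The sketch below works under that assumption only; it is not taken from csvdiff's source, and ``same_value`` is a hypothetical helper.

```python
def same_value(lhs, rhs, ndigits=None):
    """Compare two CSV field values, optionally after rounding numeric ones.

    Assumption: ndigits behaves like the second argument of round(), so
    -1 rounds to the nearest ten and 2 rounds to two decimal places.
    """
    if ndigits is not None:
        try:
            return round(float(lhs), ndigits) == round(float(rhs), ndigits)
        except ValueError:
            pass  # Non-numeric fields fall back to exact string comparison.
    return lhs == rhs


exact = same_value("20", "23")           # compared as strings
rounded = same_value("20", "23", -1)     # both round to 20.0
text = same_value("bob", "bob", -1)      # non-numeric, compared exactly
```

Under this reading, amounts of 20 and 23 differ in an exact comparison but are treated as equal at a significance of -1.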
Diffs generated this way contain all the data that's changed, and can be reapplied later if the original data changes. For example, suppose more data gets added to ``a.csv``, giving us ``a-plus.csv``::

    id,name,amount
    1,bob,20
    2,eva,63
    3,sarah,7
    4,jeff,19
    6,fred,10
    8,henry,9

We can reapply our changes with the ``csvpatch`` command::

    $ csvpatch --input=diff.json --output=b-plus.csv a-plus.csv
    $ cat b-plus.csv
    id,name,amount
    1,bob,23
    3,sarah,7
    4,jeff,19
    5,mira,81
    6,fred,13
    8,henry,9

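Conceptually, applying a patch means removing, updating, and adding rows by key. The following Python sketch shows that idea on the example data. It is a simplified illustration, not csvpatch's real logic: ``apply_patch`` is a hypothetical function, and it does not check that the ``from`` values still match the input, which a real tool may well do.

```python
def apply_patch(rows, patch):
    """Apply a csvdiff-style patch dict to a list of row dicts."""
    key_cols = patch["_index"]

    def key_of(row):
        return tuple(row[col] for col in key_cols)

    indexed = {key_of(row): dict(row) for row in rows}
    for row in patch["removed"]:
        indexed.pop(key_of(row), None)
    for change in patch["changed"]:
        target = indexed[tuple(change["key"])]
        for col, delta in change["fields"].items():
            target[col] = delta["to"]
    for row in patch["added"]:
        indexed[key_of(row)] = dict(row)
    # Sort by key (as strings) to get a stable output order.
    return [indexed[k] for k in sorted(indexed)]


PATCH = {
    "_index": ["id"],
    "added": [{"amount": "81", "id": "5", "name": "mira"}],
    "changed": [
        {"fields": {"amount": {"from": "20", "to": "23"}}, "key": ["1"]},
        {"fields": {"amount": {"from": "10", "to": "13"}}, "key": ["6"]},
    ],
    "removed": [{"amount": "63", "id": "2", "name": "eva"}],
}

A_PLUS = [
    {"id": "1", "name": "bob", "amount": "20"},
    {"id": "2", "name": "eva", "amount": "63"},
    {"id": "3", "name": "sarah", "amount": "7"},
    {"id": "4", "name": "jeff", "amount": "19"},
    {"id": "6", "name": "fred", "amount": "10"},
    {"id": "8", "name": "henry", "amount": "9"},
]

patched = apply_patch(A_PLUS, PATCH)
```

The new row for ``henry`` passes through untouched because the patch never mentions its key.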
This can be useful if you're using csvdiff to transform data that's outside your control. In this case, you maintain the patch file and simply reapply it when the upstream data provider gives you a fresh file.
For more usage options, run ``csvdiff --help`` or ``csvpatch --help``.
BSD license
0.3.3 (2017-07-20)
* Add the --significance option to limit to significant figures.
0.3.2 (2017-07-20)
0.3.1 (2016-04-20)
* Fix a bug in summary mode.
* Check for rows bleeding into one another.
0.3.0 (2015-01-07)
0.2.0 (2014-12-30)
* Uses click for the command-line interface.
* Drop YAML support in favour of pretty-printed JSON.
* Uses --style option to change output style.
* Provides a full man page.
0.1.0 (2014-03-15)