csv-parser
Streaming CSV parser that aims for maximum speed as well as compatibility with the csv-spectrum test suite
The csv-parser npm package is a tool for parsing CSV files and streams in Node.js. It converts CSV data into readable streams of JSON objects, allowing for easy manipulation and processing of CSV data within a Node.js application.
Parsing CSV Files
This feature allows you to parse CSV files into JSON objects. The code sample demonstrates how to read a CSV file using a readable stream, parse it with csv-parser, and collect the resulting JSON objects into an array.
const csv = require('csv-parser');
const fs = require('fs');

const results = [];

fs.createReadStream('data.csv')
  .pipe(csv())
  .on('data', (data) => results.push(data))
  .on('end', () => {
    console.log(results);
    // Handle the parsed data here
  });
Custom Headers
This feature allows you to specify custom headers for the CSV data if the CSV file does not contain a header row. The code sample shows how to define a custom set of headers that will be used to map the CSV columns to JSON object properties.
const csv = require('csv-parser');
const fs = require('fs');

const results = [];

fs.createReadStream('data.csv')
  .pipe(csv({ headers: ['column1', 'column2', 'column3'] }))
  .on('data', (data) => results.push(data))
  .on('end', () => {
    console.log(results);
  });
Skip Lines
This feature allows you to skip a specified number of lines at the beginning of the CSV file, which can be useful for skipping metadata or comments. The code sample demonstrates how to skip the first two lines of the CSV file before parsing the data.
const csv = require('csv-parser');
const fs = require('fs');

const results = [];

fs.createReadStream('data.csv')
  .pipe(csv({ skipLines: 2 }))
  .on('data', (data) => results.push(data))
  .on('end', () => {
    console.log(results);
  });
Papa Parse is a powerful CSV parser that supports browser and server-side parsing. It offers features like auto-detection of delimiters, streaming large files, and parsing local and remote files. Compared to csv-parser, Papa Parse has a broader feature set and can be used in both browser and Node.js environments.
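For comparison, a minimal sketch of reading the same kind of file with Papa Parse in Node; the dynamicTyping and skipEmptyLines options shown here are Papa Parse features, not csv-parser ones:

const fs = require('fs');
const Papa = require('papaparse');

// Papa Parse can consume a whole string and return results synchronously
const { data, errors } = Papa.parse(fs.readFileSync('data.csv', 'utf8'), {
  header: true,        // treat the first row as property keys
  dynamicTyping: true, // coerce numeric fields to numbers
  skipEmptyLines: true
});
console.log(data);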
Fast-csv is another CSV parsing and formatting library for Node.js. It provides a simple API for parsing and formatting CSV data and supports both streams and promises. Fast-csv is known for its performance and ease of use, and it is a good alternative to csv-parser with similar stream-based processing capabilities.
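A rough sketch of the equivalent stream-based flow with fast-csv (v4 and later expose a parse function):

const fs = require('fs');
const { parse } = require('fast-csv');

fs.createReadStream('data.csv')
  .pipe(parse({ headers: true })) // emits one object per row, like csv-parser
  .on('error', (err) => console.error(err))
  .on('data', (row) => console.log(row))
  .on('end', (rowCount) => console.log(`Parsed ${rowCount} rows`));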
csvtojson is a full-featured CSV parser library that converts CSV to JSON. One of its key features is the ability to handle large files and streams efficiently. It also supports custom column parsers and has a slightly different API compared to csv-parser, offering more customization options for the parsing process.
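A short sketch of the csvtojson flow; the colParser mapping for an AGE column is illustrative:

const csvtojson = require('csvtojson');

// fromFile returns a Promise that resolves with every row as a JSON object
csvtojson({ colParser: { AGE: 'number' } }) // per-column custom parsing
  .fromFile('data.csv')
  .then((rows) => console.log(rows));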
Streaming CSV parser that aims for maximum speed as well as compatibility with the csv-spectrum CSV acid test suite.

csv-parser can convert CSV into JSON at a rate of around 90,000 rows per second. Performance varies with the data used; try bin/bench.js <your file> to benchmark your data.
csv-parser can be used in the browser with browserify.

neat-csv can be used if a Promise-based interface to csv-parser is needed.
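For instance, a sketch of the Promise-based flow; note that recent neat-csv releases are ESM-only, so the require shown here only works on older versions:

const fs = require('fs');
const neatCsv = require('neat-csv'); // use `import neatCsv from 'neat-csv'` on ESM-only versions

(async () => {
  // neat-csv accepts a string or buffer of CSV data and resolves with the rows
  const rows = await neatCsv(fs.readFileSync('data.csv'));
  console.log(rows);
})();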
Note: This module requires Node v8.16.0 or higher.
⚡️ csv-parser is greased-lightning fast

→ npm run bench
Filename                  Rows Parsed  Duration
backtick.csv              2            3.5ms
bad-data.csv              3            0.55ms
basic.csv                 1            0.26ms
comma-in-quote.csv        1            0.29ms
comment.csv               2            0.40ms
empty-columns.csv         1            0.40ms
escape-quotes.csv         3            0.38ms
geojson.csv               3            0.46ms
large-dataset.csv         7268         73ms
newlines.csv              3            0.35ms
no-headers.csv            3            0.26ms
option-comment.csv        2            0.24ms
option-escape.csv         3            0.25ms
option-maxRowBytes.csv    4577         39ms
option-newline.csv        0            0.47ms
option-quote-escape.csv   3            0.33ms
option-quote-many.csv     3            0.38ms
option-quote.csv          2            0.22ms
quotes+newlines.csv       3            0.20ms
strict.csv                3            0.22ms
latin.csv                 2            0.38ms
mac-newlines.csv          2            0.28ms
utf16-big.csv             2            0.33ms
utf16.csv                 2            0.26ms
utf8.csv                  2            0.24ms
Using npm:
$ npm install csv-parser
Using yarn:
$ yarn add csv-parser
To use the module, create a readable stream to a desired CSV file, instantiate csv, and pipe the stream to csv.

Suppose you have a CSV file data.csv which contains the data:

NAME,AGE
Daffy Duck,24
Bugs Bunny,22

It could then be parsed, and results shown like so:
const csv = require('csv-parser');
const fs = require('fs');

const results = [];

fs.createReadStream('data.csv')
  .pipe(csv())
  .on('data', (data) => results.push(data))
  .on('end', () => {
    console.log(results);
    // [
    //   { NAME: 'Daffy Duck', AGE: '24' },
    //   { NAME: 'Bugs Bunny', AGE: '22' }
    // ]
  });
csv([options | headers])

Returns: Array[Object]

To specify options for csv, pass an object argument to the function. For example:

csv({ separator: '\t' });

options

Type: Object
As an alternative to passing an options object, you may pass an Array[String] which specifies the headers to use. For example:

csv(['Name', 'Age']);

If you need to specify both options and headers, please use the object notation with the headers property, as shown below.
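For instance, a minimal sketch passing both at once (the separator value is purely illustrative):

csv({ separator: ';', headers: ['Name', 'Age'] });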
escape

Type: String
Default: "

A single-character string used to specify the character used to escape strings in a CSV row.
headers

Type: Array[String] | Boolean

Specifies the headers to use. Headers define the property key for each value in a CSV row. If no headers option is provided, csv-parser will use the first line in a CSV file as the header specification.
If false, specifies that the first row in a data file does not contain headers, and instructs the parser to use the column index as the key for each column. Using headers: false with the same data.csv example from above would yield:
[
  { '0': 'Daffy Duck', '1': '24' },
  { '0': 'Bugs Bunny', '1': '22' }
]
Note: If using the headers option for an operation on a file which contains headers on the first line, specify skipLines: 1 to skip over that row, or the headers row will appear as normal row data. Alternatively, use the mapHeaders option to manipulate existing headers in that scenario.
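For instance, a short sketch that replaces an existing header row with custom keys, assuming data.csv carries its own header line:

csv({ headers: ['name', 'age'], skipLines: 1 });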
mapHeaders

Type: Function
A function that can be used to modify the values of each header. Return a String to modify the header. Return null to remove the header, and its column, from the results.
csv({
  mapHeaders: ({ header, index }) => header.toLowerCase()
});
header String The current column header.
index Number The current column index.
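Returning null drops a column entirely; a sketch, where the internal_id header is hypothetical:

csv({
  mapHeaders: ({ header, index }) =>
    header === 'internal_id' ? null : header.toLowerCase()
});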
mapValues

Type: Function
A function that can be used to modify the content of each column. The return value will replace the current column content.
csv({
  mapValues: ({ header, index, value }) => value.toLowerCase()
});
header String The current column header.
index Number The current column index.
value String The current column value (or content).
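For instance, a sketch that coerces the AGE column from the earlier data.csv example to a number and passes other values through unchanged:

csv({
  mapValues: ({ header, index, value }) =>
    header === 'AGE' ? Number(value) : value
});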
newline

Type: String
Default: \n
Specifies a single-character string to denote the end of a line in a CSV file.
quote

Type: String
Default: "
Specifies a single-character string to denote a quoted string.
raw

Type: Boolean

If true, instructs the parser not to decode UTF-8 strings.
separator

Type: String
Default: ,
Specifies a single-character string to use as the column separator for each row.
skipComments

Type: Boolean | String
Default: false
Instructs the parser to ignore lines which represent comments in a CSV file. Since there is no specification that dictates what a CSV comment looks like, comments should be considered non-standard. The "most common" character used to signify a comment in a CSV file is "#". If this option is set to true, lines which begin with # will be skipped. If a custom character is needed to denote a commented line, this option may be set to a string which represents the leading character(s) signifying a comment line.
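For instance, a minimal sketch of both forms; the '~' comment character is illustrative:

csv({ skipComments: true });  // skip lines beginning with '#'
csv({ skipComments: '~' });   // skip lines beginning with '~'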
skipLines

Type: Number
Default: 0
Specifies the number of lines at the beginning of a data file that the parser should skip over, prior to parsing headers.
maxRowBytes

Type: Number
Default: Number.MAX_SAFE_INTEGER
Maximum number of bytes per row. An error is thrown if a line exceeds this value. The default value (Number.MAX_SAFE_INTEGER bytes) is roughly 8 pebibytes.
strict

Type: Boolean

If true, instructs the parser that the number of columns in each row must match the number of headers specified.
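A sketch of reacting to the mismatch that strict mode reports; the exact error message is version-dependent:

const csv = require('csv-parser');
const fs = require('fs');

fs.createReadStream('data.csv')
  .pipe(csv({ strict: true }))
  .on('data', (row) => console.log(row))
  .on('error', (err) => {
    // emitted when a row's column count differs from the header count
    console.error('Malformed row:', err.message);
  });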
The following events are emitted during parsing:
data
Emitted for each row of data parsed with the notable exception of the header row. Please see Usage for an example.
headers
Emitted after the header row is parsed. The first parameter of the event callback is an Array[String] containing the header names.
fs.createReadStream('data.csv')
  .pipe(csv())
  .on('headers', (headers) => {
    console.log(`First header: ${headers[0]}`);
  });
Events available on Node built-in Readable Streams are also emitted. The end event should be used to detect the end of parsing.
This module also provides a CLI which will convert CSV to newline-delimited JSON. The following CLI flags can be used to control how input is parsed:
Usage: csv-parser [filename?] [options]
--escape,-e Set the escape character (defaults to quote value)
--headers,-h Explicitly specify csv headers as a comma separated list
--help Show this help
--output,-o Set output file. Defaults to stdout
--quote,-q Set the quote character ('"' by default)
--remove Remove columns from output by header name
--separator,-s Set the separator character ("," by default)
--skipComments,-c Skip CSV comments that begin with '#'. Set a value to change the comment character.
--skipLines,-l Set the number of lines to skip before parsing headers
--strict Require column length match headers length
--version,-v Print out the installed version
For example, to parse a TSV file:
cat data.tsv | csv-parser -s $'\t'
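Or, combining flags documented above, a sketch that skips '#' comments and writes the newline-delimited JSON output to a file:

cat data.csv | csv-parser -c -o out.ndjson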
Users may encounter issues with the encoding of a CSV file. Transcoding the source stream can be done neatly with a module such as iconv-lite, or with native iconv if it is part of a pipeline.
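For example, a minimal sketch with iconv-lite; the win1252 encoding is an assumption about the source file:

const fs = require('fs');
const csv = require('csv-parser');
const iconv = require('iconv-lite');

fs.createReadStream('data.csv')
  .pipe(iconv.decodeStream('win1252')) // transcode to UTF-8 strings before parsing
  .pipe(csv())
  .on('data', (row) => console.log(row));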
Some CSV files may be generated with, or contain a leading Byte Order Mark. This may cause issues parsing headers and/or data from your file. From Wikipedia:
The Unicode Standard permits the BOM in UTF-8, but does not require nor recommend its use. Byte order has no meaning in UTF-8.
To use this module with a file containing a BOM, please use a module like strip-bom-stream in your pipeline:
const fs = require('fs');
const csv = require('csv-parser');
const stripBom = require('strip-bom-stream');

fs.createReadStream('data.csv')
  .pipe(stripBom())
  .pipe(csv())
  ...
When using the CLI, the BOM can be removed by first running:
$ sed $'s/\xEF\xBB\xBF//g' data.csv