
Convert a CSV file to a Parquet file. You may also find sqlite-parquet-vtable or parquet-metadata useful.
If you just want to use the tool:
sudo pip install pyarrow csv2parquet
If you want to clone the repo and work on the tool, install its dependencies via pipenv:
pipenv install
Next, create some Parquet files. The tool supports CSV and TSV files.
usage: csv2parquet [-h] [-n ROWS] [-r ROW_GROUP_SIZE] [-o OUTPUT] [-c CODEC]
                   [-i INCLUDE [INCLUDE ...] | -x EXCLUDE [EXCLUDE ...]]
                   [-R RENAME [RENAME ...]] [-t TYPE [TYPE ...]]
                   csv_file

positional arguments:
  csv_file              input file, can be CSV or TSV

optional arguments:
  -h, --help            show this help message and exit
  -n ROWS, --rows ROWS  The number of rows to include, useful for testing.
  -r ROW_GROUP_SIZE, --row-group-size ROW_GROUP_SIZE
                        The number of rows per row group.
  -o OUTPUT, --output OUTPUT
                        The parquet file
  -c CODEC, --codec CODEC
                        The compression codec to use (brotli, gzip, snappy,
                        zstd, none)
  -i INCLUDE [INCLUDE ...], --include INCLUDE [INCLUDE ...]
                        Include the given columns (by index or name)
  -x EXCLUDE [EXCLUDE ...], --exclude EXCLUDE [EXCLUDE ...]
                        Exclude the given columns (by index or name)
  -R RENAME [RENAME ...], --rename RENAME [RENAME ...]
                        Rename a column. Specify the column to be renamed and
                        its new name, eg: 0=age or person_age=age
  -t TYPE [TYPE ...], --type TYPE [TYPE ...]
                        Parse a column as a given type. Specify the column and
                        its type, eg: 0=bool? or person_age=int8. Parse errors
                        are fatal unless the type is followed by a question
                        mark. Valid types are string (default), base64, bool,
                        float32, float64, int8, int16, int32, int64, timestamp
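
The column=type syntax accepted by --type, including the trailing question mark, can be easier to follow with a small sketch. The Python below is illustrative only and is not taken from the csv2parquet source: the names ColumnType, parse_type_spec, and parse_int8 are hypothetical, and returning None for a value that fails to parse under an optional type is an assumption, since the help text only says that such errors are not fatal.

```python
from typing import NamedTuple, Optional

# Type names listed in the --type help text above.
VALID_TYPES = {
    "string", "base64", "bool", "float32", "float64",
    "int8", "int16", "int32", "int64", "timestamp",
}


class ColumnType(NamedTuple):
    """Hypothetical parsed form of one --type argument."""
    column: str      # column index or name, exactly as written
    type_name: str   # one of VALID_TYPES
    optional: bool   # True when the spec ended with "?"


def parse_type_spec(spec: str) -> ColumnType:
    """Split a spec such as '0=bool?' or 'person_age=int8'."""
    # Split on the last "=" so a column name may itself contain "=".
    column, sep, type_name = spec.rpartition("=")
    if not sep or not column:
        raise ValueError(f"expected COLUMN=TYPE, got {spec!r}")
    optional = type_name.endswith("?")
    if optional:
        type_name = type_name[:-1]
    if type_name not in VALID_TYPES:
        raise ValueError(f"unknown type {type_name!r} in {spec!r}")
    return ColumnType(column, type_name, optional)


def parse_int8(raw: str, optional: bool) -> Optional[int]:
    """Parse one cell as int8; errors are fatal unless optional is set."""
    try:
        value = int(raw)
        if not -128 <= value <= 127:
            raise ValueError(f"{value} is out of range for int8")
        return value
    except ValueError:
        if optional:
            return None  # assumption: a failed optional parse becomes a null
        raise


if __name__ == "__main__":
    print(parse_type_spec("0=bool?"))
    print(parse_type_spec("person_age=int8"))
    print(parse_int8("300", optional=True))
```

Under these assumptions, person_age=int8 stops the conversion at the first cell that is not a valid int8, while person_age=int8? lets the conversion continue past it.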
To lint the code and run the tests:
pylint csv2parquet
pytest
FAQs
A tool to convert CSVs to Parquet files
We found that csv2parquet demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.