
Parquet-Py is a simple command-line interface and Python API for working with Parquet files. It converts Parquet files into CSV, JSON, lists, and iterators for easy manipulation and access in Python applications.
Using Rust bindings under the hood, Parquet-Py provides a fast and efficient way to work with Parquet files, making it ideal for converting or processing large datasets.
pip install parquet-py
[!WARNING]
The CLI is still under development and may not be fully functional.
Breaking changes may occur in future releases.
[!TIP]
Multiple input files can be specified with the --input option. For example, --input file1.parquet --input file2.parquet.
To convert a Parquet file into a CSV file, use the parq convert command.
parq convert --input path/to/your/file.parquet --format csv --output example.csv
To convert a Parquet file into a JSON array, use the parq convert command.
parq convert --input path/to/your/file.parquet --format json --output example.json
To convert a Parquet file into JSON Lines, use the parq convert command.
parq convert --input path/to/your/file.parquet --format jsonl --output example.jsonl
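When the same conversion has to be scripted for many files, it can help to assemble the parq convert arguments in Python. The sketch below only uses the flags shown above (--input, --format, --output); the helper name build_convert_command is hypothetical, and how the CLI combines several inputs into one output is an assumption that should be checked against the installed version.

```python
import shlex


def build_convert_command(inputs, fmt, output):
    """Build the argument list for one `parq convert` call.

    Hypothetical helper: it only arranges the flags documented above.
    """
    if fmt not in {"csv", "json", "jsonl"}:
        raise ValueError(f"unsupported format: {fmt}")
    argv = ["parq", "convert"]
    # Repeat --input once per file, as described in the tip above.
    for path in inputs:
        argv += ["--input", str(path)]
    argv += ["--format", fmt, "--output", str(output)]
    return argv


cmd = build_convert_command(["file1.parquet", "file2.parquet"], "csv", "combined.csv")
# Show the command as it would be typed in a shell.
print(shlex.join(cmd))
```

To actually run the command, the list can be passed to subprocess.run(cmd, check=True), provided the parq executable is on the PATH.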
To iterate over the rows of a Parquet file, use the to_iter function. This allows efficient row-by-row processing without loading the entire file into memory.
from parq import to_iter
# Path to your Parquet file
file_path = "path/to/your/file.parquet"
# Iterate over Parquet rows
for row in to_iter(file_path):
    print(row)
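Row-by-row iteration pairs well with aggregations that keep only a small running state. The following is a minimal sketch, assuming each row behaves like a dictionary keyed by column name (the README states this for to_list; for to_iter it is an assumption). The count_values helper and the sample rows are hypothetical stand-ins so the example runs without a Parquet file.

```python
from collections import Counter


def count_values(rows, column):
    """Count how often each value of `column` occurs, one row at a time."""
    counts = Counter()
    for row in rows:
        # Assumption: rows are dict-like; missing columns count as None.
        counts[row.get(column)] += 1
    return counts


# Stand-in rows; with parquet-py installed, pass to_iter(file_path) instead.
sample_rows = iter([
    {"city": "Oslo", "temp": 4},
    {"city": "Lima", "temp": 19},
    {"city": "Oslo", "temp": 6},
])
city_counts = count_values(sample_rows, "city")
print(city_counts["Oslo"], city_counts["Lima"])
```

Because the helper consumes any iterable, only the counter stays in memory, not the rows themselves.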
To convert a Parquet file into a CSV string, use the to_csv_str function.
from parq import to_csv_str
# Path to your Parquet file
file_path = "path/to/your/file.parquet"
# Convert to CSV string
csv_str = to_csv_str(file_path)
print(csv_str)
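A CSV string can be parsed back into records with the standard library. This sketch assumes the string starts with a header row, which is not confirmed by the text above; the csv_str_to_records helper and the sample string are hypothetical and stand in for the output of to_csv_str.

```python
import csv
import io


def csv_str_to_records(csv_str):
    """Parse CSV text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(csv_str))
    return list(reader)


# Stand-in for the value returned by to_csv_str(file_path).
sample_csv = "id,name\n1,Ada\n2,Linus\n"
records = csv_str_to_records(sample_csv)
print(records[0]["name"])
```

Note that csv.DictReader returns every value as a string, so numeric columns need an explicit conversion.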
To convert a Parquet file into a JSON string, use the to_json_str function.
from parq import to_json_str
# Path to your Parquet file
file_path = "path/to/your/file.parquet"
# Convert to JSON string
json_str = to_json_str(file_path)
print(json_str)
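The JSON string can be decoded with the standard json module. The sketch below assumes the string holds a JSON array of row objects, matching the JSON array format of the CLI; the exact shape returned by to_json_str is an assumption, and json_str_to_records is a hypothetical helper.

```python
import json


def json_str_to_records(json_str):
    """Decode a JSON array of row objects into a list of dicts."""
    data = json.loads(json_str)
    # Assumption: the top-level value is an array, one object per row.
    if not isinstance(data, list):
        raise TypeError("expected a JSON array of row objects")
    return data


# Stand-in for the value returned by to_json_str(file_path).
sample_json = '[{"id": 1, "name": "Ada"}, {"id": 2, "name": "Linus"}]'
rows = json_str_to_records(sample_json)
print(rows[1]["name"])
```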
To convert a Parquet file into a Python list, where each row is represented as a dictionary within the list, use the to_list function.
from parq import to_list
# Path to your Parquet file
file_path = "path/to/your/file.parquet"
# Convert to Python list
data_list = to_list(file_path)
print(len(data_list))
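Since each row is a dictionary, ordinary list comprehensions are enough to filter rows and keep a subset of columns. This is a minimal sketch with a hypothetical select_rows helper and made-up sample data standing in for the result of to_list.

```python
def select_rows(rows, columns, where=None):
    """Keep rows matching `where` and project them onto `columns`."""
    return [
        {name: row.get(name) for name in columns}
        for row in rows
        if where is None or where(row)
    ]


# Stand-in for the value returned by to_list(file_path).
data_list = [
    {"id": 1, "name": "Ada", "score": 91},
    {"id": 2, "name": "Linus", "score": 78},
    {"id": 3, "name": "Grace", "score": 88},
]
top = select_rows(data_list, ["name", "score"], where=lambda r: r["score"] >= 85)
print(top)
```

This approach loads every row into memory first, so for large files the row iterator is likely the better fit.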
FAQs
A simple command-line interface & Python API for parquet
We found that parquet-py demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.