
streamfeed-parser
A lightweight streaming parser for CSV and XML feeds over HTTP/FTP with automatic compression handling. Designed to efficiently process large data feeds without loading the entire file into memory.
Install from PyPI:

pip install streamfeed-parser

Quick start:
from streamfeed import stream_feed, preview_feed
# Preview the first 10 rows from a feed
preview_data = preview_feed('https://example.com/large-feed.csv', limit_rows=10)
print(preview_data)
# Stream and process a large feed without memory constraints
for record in stream_feed('https://example.com/large-feed.csv'):
    # Process each record individually
    print(record)
The main function for streaming data is stream_feed:
from streamfeed import stream_feed
# Stream a CSV feed
for record in stream_feed('https://example.com/products.csv'):
    print(record)  # Record is a dictionary with column names as keys

# Stream an XML feed (default item tag is 'product')
for record in stream_feed('https://example.com/products.xml'):
    print(record)  # Record is a dictionary with XML elements as keys
To preview the first few records without processing the entire feed:
from streamfeed import preview_feed
# Get the first 100 records (default)
preview_data = preview_feed('https://example.com/large-feed.csv')
# Customize the number of records
preview_data = preview_feed('https://example.com/large-feed.csv', limit_rows=10)
You can customize how feeds are processed with the feed_logic parameter:
from streamfeed import stream_feed
# Specify the XML item tag for XML feeds
feed_logic = {
    'xml_item_tag': 'item'  # Default is 'product'
}

for record in stream_feed('https://example.com/feed.xml', feed_logic=feed_logic):
    print(record)

# Explode comma-separated values into multiple records
feed_logic = {
    'explode_fields': ['size', 'color'],  # Fields to explode
    'divider': ','  # Character that separates values (default is ',')
}

# Input: {'id': '123', 'size': 'S,M,L', 'color': 'red,blue,green'}
# Output: multiple records, one per size-color combination
for record in stream_feed('https://example.com/feed.csv', feed_logic=feed_logic):
    print(record)
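To make the explode behavior concrete, here is a minimal standalone sketch of the idea (a hypothetical reimplementation for illustration only, not the library's actual code): each field listed in explode_fields is split on the divider, and one record is emitted per combination of the resulting values.

from itertools import product

def explode_record(record, explode_fields, divider=','):
    # Split each exploded field on the divider, then yield one record
    # per combination of the resulting values (Cartesian product).
    value_lists = [record[field].split(divider) for field in explode_fields]
    for combo in product(*value_lists):
        yield {**record, **dict(zip(explode_fields, combo))}

row = {'id': '123', 'size': 'S,M,L', 'color': 'red,blue,green'}
for expanded in explode_record(row, ['size', 'color']):
    print(expanded)  # 9 records, one per size-color combination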
The library handles FTP URLs seamlessly:
from streamfeed import stream_feed
# Basic FTP
for record in stream_feed('ftp://example.com/path/to/feed.csv'):
    print(record)

# FTP with authentication (included in URL)
for record in stream_feed('ftp://username:password@example.com/feed.csv'):
    print(record)
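One caveat: credentials embedded in a URL must be percent-encoded if they contain reserved characters such as '@', ':', or '/'. The standard library's urllib.parse.quote handles this:

from urllib.parse import quote

# Percent-encode credentials containing reserved characters before
# embedding them in the FTP URL.
username = quote('user@corp', safe='')
password = quote('p@ss:word', safe='')
url = f'ftp://{username}:{password}@example.com/feed.csv'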
The library automatically detects and handles compressed feeds:
from streamfeed import stream_feed
# These will automatically be decompressed
for record in stream_feed('https://example.com/feed.csv.gz'):  # GZIP
    print(record)

for record in stream_feed('https://example.com/feed.csv.zip'):  # ZIP
    print(record)

for record in stream_feed('https://example.com/feed.xml.bz2'):  # BZ2
    print(record)
Limit the number of rows processed:
from streamfeed import stream_feed
# Only process the first 1000 rows
for record in stream_feed('https://example.com/large-feed.csv', limit_rows=1000):
    print(record)
Limit the maximum length of fields to prevent memory issues:
from streamfeed import stream_feed
# Limit each field to 10,000 characters
for record in stream_feed('https://example.com/feed.csv', max_field_length=10000):
    print(record)
For more specialized needs, you can access the underlying functions:
from streamfeed import (
    detect_compression,
    stream_csv_lines,
    stream_from_ftp,
    stream_xml_items_iterparse,
)
# Example: Check compression type
compression = detect_compression('https://example.com/feed.csv.gz')
print(compression) # 'gz'
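For example, detect_compression can be used to branch on the detected format before streaming begins (a small sketch relying only on the 'gz' return value shown above):

from streamfeed import detect_compression, stream_feed

url = 'https://example.com/feed.csv.gz'

# Only stream feeds whose compression we recognize.
if detect_compression(url) == 'gz':
    for record in stream_feed(url, limit_rows=100):
        print(record)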
The library gracefully handles many common errors in feeds: errors are logged, and processing continues when possible.
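If you need to react to failures yourself, a typical consumer-side pattern is to surface the library's log output (assuming it uses Python's standard logging module, as most libraries do) and guard the stream against hard failures such as a dropped connection:

import logging
from streamfeed import stream_feed

logging.basicConfig(level=logging.INFO)  # surface the library's log messages

processed = 0
try:
    for record in stream_feed('https://example.com/feed.csv'):
        processed += 1
except Exception:
    # A hard failure (e.g. the connection dropping) can still abort the stream.
    logging.exception('Feed aborted after %d records', processed)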
Contributions are welcome! To submit a change: create a feature branch (git checkout -b feature/amazing-feature), commit your changes (git commit -m 'Add some amazing feature'), push the branch (git push origin feature/amazing-feature), and open a Pull Request on the GitHub repository.

This project is licensed under the terms included in the LICENSE file.
Hans-Christian Bøge Pedersen - devwithhans