zipstream-ai


Stream and query zipped datasets using LLMs

Version 1.0.1 on PyPI, 3 maintainers.

Stream, Parse, and Chat with Compressed Datasets Using LLMs

zipstream-ai is a Python package that lets you interact with .zip and .tar.gz files directly—no need to extract them manually. It integrates archive streaming, format detection, data parsing (e.g., CSV, JSON), and natural language querying with LLMs like Gemini, all through a unified interface.
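To make "no need to extract" concrete, here is a minimal sketch of reading a CSV member straight out of a ZIP archive using only the Python standard library. It is not zipstream-ai's implementation, and the helper names `build_sample_zip` and `read_csv_member` are hypothetical; it only illustrates the general technique of decompressing one member on the fly.

```python
import csv
import io
import zipfile


def build_sample_zip() -> bytes:
    """Create a small in-memory ZIP so the example needs no files on disk."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("data.csv", "name,score\nada,91\nlin,78\n")
    return buf.getvalue()


def read_csv_member(zip_bytes: bytes, member: str) -> list:
    """Read one CSV member from the archive without extracting it to disk."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        # ZipFile.open returns a binary stream that decompresses as it is read.
        with zf.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


rows = read_csv_member(build_sample_zip(), "data.csv")
print(rows[0]["name"], rows[0]["score"])  # prints: ada 91
```

The stream returned by `ZipFile.open` can also be passed to `pandas.read_csv`, which accepts file-like objects, if a DataFrame is wanted instead of a list of dictionaries.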

Installation

Option 1: Install from PyPI

pip install zipstream-ai

Option 2: Install from Conda

# Install from conda
conda install -c pranav_motarwar zipstream-ai

# Install PyPI-only dependencies (required)
pip install openai typer python-dotenv google-generativeai

Note: The conda package includes core dependencies, but you'll need to install PyPI-only dependencies (openai, typer, python-dotenv, google-generativeai) separately via pip.

Features

| Feature | Description |
|---|---|
| Archive Streaming | Stream .zip and .tar.gz files without extraction |
| Format Auto-Detection | Automatically detects file types (CSV, JSON, TXT, etc.) |
| DataFrame Integration | Parses tabular data directly into pandas DataFrames |
| LLM Querying | Ask questions about your data using Gemini (Google's LLM) |
| Modular Design | Easily extensible for new formats or models |
| Python + CLI Support | Use via command line or as a Python package |
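Format auto-detection can be pictured as a dispatch on the member's file extension. The sketch below is an assumption about how such a step might look, written with the standard library only; `detect_format` and `parse_member` are hypothetical names and are not part of the zipstream-ai API, which may well detect formats differently.

```python
import csv
import io
import json
from pathlib import PurePosixPath


def detect_format(member_name: str) -> str:
    """Guess a format label from the file extension (case-insensitive)."""
    suffix = PurePosixPath(member_name).suffix.lower()
    return {".csv": "csv", ".json": "json", ".txt": "text"}.get(suffix, "unknown")


def parse_member(member_name: str, payload: bytes):
    """Decode the raw bytes of one archive member according to its format."""
    kind = detect_format(member_name)
    text = payload.decode("utf-8")
    if kind == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    if kind == "json":
        return json.loads(text)
    if kind == "text":
        return text
    raise ValueError(f"unsupported format for {member_name!r}")


print(detect_format("reports/2024/Data.CSV"))  # prints: csv
print(parse_member("config.json", b'{"k": 1}'))  # prints: {'k': 1}
```

Keeping detection separate from parsing is one way to make new formats easy to add: a new extension needs one table entry and one branch.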

Use Case Examples

1. Load & Explore ZIP

from zipstream_ai import ZipStreamReader

reader = ZipStreamReader("dataset.zip")
print(reader.list_files())
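The example above uses a .zip file. For comparison, the same list-then-read step can be sketched for a .tar.gz archive with the standard library's `tarfile` module. This is an illustration of the underlying technique, not a description of how `ZipStreamReader` handles tar archives; the helpers `build_sample_targz`, `list_members`, and `read_member` are hypothetical.

```python
import io
import tarfile


def build_sample_targz() -> bytes:
    """Create a small in-memory .tar.gz so the example needs no files on disk."""
    payload = b"name,score\nada,91\n"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tf:
        info = tarfile.TarInfo(name="data.csv")
        info.size = len(payload)
        tf.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def list_members(targz_bytes: bytes) -> list:
    """Return the names of regular files in the archive."""
    with tarfile.open(fileobj=io.BytesIO(targz_bytes), mode="r:gz") as tf:
        return [m.name for m in tf.getmembers() if m.isfile()]


def read_member(targz_bytes: bytes, name: str) -> bytes:
    """Read one member's bytes without writing anything to disk."""
    with tarfile.open(fileobj=io.BytesIO(targz_bytes), mode="r:gz") as tf:
        extracted = tf.extractfile(name)
        if extracted is None:  # directories and special members have no data
            raise ValueError(f"{name!r} is not a regular file")
        return extracted.read()


archive = build_sample_targz()
print(list_members(archive))  # prints: ['data.csv']
```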

2. Parse CSV from ZIP

from zipstream_ai import FileParser

# `reader` is the ZipStreamReader created in example 1
parser = FileParser(reader)
df = parser.load("data.csv")
print(df.head())

3. Ask Questions with Gemini

from zipstream_ai import ask

# `df` is the DataFrame loaded in example 2
response = ask(df, "Which 3 rows have the highest 'score'?")
print(response)
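Because an LLM's answer is free text, it can be useful to compute the same result directly and compare. The sketch below answers the "highest score" question deterministically with the standard library; it is a suggested cross-check, not part of zipstream-ai, and the function `top_n_by` and the sample rows are made up for illustration.

```python
import heapq


def top_n_by(rows, column, n=3):
    """Return the n rows with the largest numeric value in `column`."""
    return heapq.nlargest(n, rows, key=lambda row: float(row[column]))


# Hypothetical sample data standing in for the parsed CSV rows.
rows = [
    {"name": "ada", "score": "91"},
    {"name": "lin", "score": "78"},
    {"name": "sam", "score": "85"},
    {"name": "kai", "score": "64"},
]

top = top_n_by(rows, "score")
print([row["name"] for row in top])  # prints: ['ada', 'sam', 'lin']
```

With a pandas DataFrame, `df.nlargest(3, "score")` gives the equivalent result.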

Why zipstream-ai?

| Traditional Workflow | With zipstream-ai |
|---|---|
| Manually unzip files | Stream directly from archive |
| Write boilerplate code to parse | Built-in file parsers (CSV, JSON, etc.) |
| Switch between tools for LLMs | One-liner ask(df, question) integration |

Architecture Diagram

         ┌──────────────┐
         │  .zip/.tar   │
         └────┬─────────┘
              │
   ┌──────────▼──────────┐
   │  ZipStreamReader    │
   └──────────┬──────────┘
              │
     ┌────────▼────────┐
     │   FileParser    │────>  pd.DataFrame
     └────────┬────────┘
              │
     ┌────────▼────────┐
     │     ask()       │────> Gemini LLM Output
     └─────────────────┘
