VXDF Python Library

VXDF (Vector eXchange Data Format) is an AI-native container for text, metadata and vector embeddings—portable, indexable and compressed. If you do RAG, semantic search or compliance audits, VXDF gives you one file, one command.
Quick-start
pip install vxdf[zstd]
python - << 'PY'
from vxdf import VXDFWriter, VXDFReader
data = [
{"id": "1", "text": "hello", "vector": [0.1, 0.2]},
{"id": "2", "text": "world", "vector": [0.3, 0.4]},
]
with VXDFWriter("demo.vxdf", embedding_dim=2, compression="zstd") as w:
for chunk in data:
w.add_chunk(chunk)
a = VXDFReader("demo.vxdf")
print(a.get_chunk("2"))
PY
Command-line
vxdf pack data.jsonl data.vxdf --compression zstd
vxdf info data.vxdf
vxdf list data.vxdf | head
vxdf get data.vxdf some-id > doc.json
cat report.txt | vxdf convert - - > report.vxdf
Colab / Notebook

LangChain integration (preview)
from langchain_community.vectorstores import VXDF
vs = VXDF.from_vxdf("demo.vxdf")
See examples/langchain_integration.py
for a minimal adapter.
Authentication
VXDF commands that interact with cloud services need credentials.
OpenAI embeddings
The client looks for an API key in this order (first match wins):
--openai-key
CLI flag (e.g. vxdf convert my.pdf out.vxdf --model openai --openai-key sk-...
)
OPENAI_API_KEY
environment variable.
~/.vxdf/config.toml
under the [openai]
table:
[openai]
api_key = "sk-..."
AWS (S3 URLs)
Uses the standard AWS credential chain provided by boto3 – environment variables (AWS_ACCESS_KEY_ID
, AWS_SECRET_ACCESS_KEY
), the AWS CLI config, or an attached IAM role. Run aws configure
if unsure.
GCP (gs:// URLs)
Relies on Application Default Credentials. Run gcloud auth application-default login
or set the GOOGLE_APPLICATION_CREDENTIALS
environment variable pointing at a JSON key file.
If credentials are missing VXDF exits early with a clear message and a hint on how to configure them.
Shell completion
Install extra dependencies and activate once:
pip install vxdf[completion]
activate-global-python-argcomplete --user
Re-open your terminal and enjoy TAB-completion for vxdf
sub-commands and options.
VXDF is BSD-3-licensed. Contributions welcome!