Cerevox - The Data Layer 🧠 ⚡
Parse documents with enterprise-grade reliability
AI-powered • Highest Accuracy • Vector DB ready
Official Python SDK for Lexa - Parse documents into structured data
🎯 Perfect for: RAG applications, document analysis, data extraction, and vector database preparation
📦 Installation
pip install cerevox
📋 Requirements
🚀 Quick Start
Basic Usage
from cerevox import Lexa
client = Lexa(api_key="your-api-key")
documents = client.parse(["document.pdf"])
print(f"Extracted {len(documents[0].content)} characters")
print(f"Found {len(documents[0].tables)} tables")
Async Processing (Recommended)
import asyncio
from cerevox import AsyncLexa
async def main():
    async with AsyncLexa(api_key="your-api-key") as client:
        documents = await client.parse(["document.pdf", "report.docx"])
        chunks = documents.get_all_text_chunks(target_size=500)
        print(f"Ready for embedding: {len(chunks)} chunks")
asyncio.run(main())
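For a rough sense of what size-targeted chunking involves, here is a minimal standalone sketch. It is not how Cerevox implements `get_all_text_chunks`: the `chunk_text` helper is hypothetical, and splitting on blank lines is an assumption made for illustration.

```python
def chunk_text(text: str, target_size: int = 500) -> list[str]:
    """Greedily pack paragraphs into chunks of at most target_size characters.

    Hypothetical helper for illustration only; not part of the cerevox SDK.
    """
    if target_size <= 0:
        raise ValueError("target_size must be positive")

    chunks: list[str] = []
    current = ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        # A paragraph longer than target_size is hard-split on its own.
        while len(paragraph) > target_size:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(paragraph[:target_size])
            paragraph = paragraph[target_size:]
        # Otherwise, append to the current chunk while it still fits.
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= target_size:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Packing whole paragraphs keeps related sentences together, which tends to help retrieval quality. A production chunker would more likely split on sentence or token boundaries.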
✨ Features
🚀 Performance & Scale
- 10x Faster than traditional solutions
- Native Async Support with concurrent processing
- Enterprise-grade reliability with automatic retries
- SOTA Accuracy with cutting-edge ML models
- Advanced Table Extraction preserving structure and formatting
- 12+ File Formats including PDF, DOCX, PPTX, HTML, and more
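To illustrate the general pattern behind "concurrent processing" and "automatic retries", here is a minimal standard-library sketch. It does not show Cerevox internals: `with_retries` and `run_bounded` are hypothetical names, and the backoff parameters are assumptions chosen for the example.

```python
import asyncio
import random


async def with_retries(make_call, max_attempts=3, base_delay=0.5):
    """Await make_call(), retrying failures with exponential backoff and jitter.

    Hypothetical helper; make_call is any zero-argument coroutine function.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_call()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** (attempt - 1)
            # Jitter spreads out retries from many concurrent callers.
            await asyncio.sleep(delay + random.uniform(0, delay / 2))


async def run_bounded(jobs, limit=4, base_delay=0.5):
    """Run coroutine functions concurrently, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        async with semaphore:
            return await with_retries(job, base_delay=base_delay)

    # gather preserves the order of the input jobs in its result list.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

The semaphore caps the number of in-flight requests so that a large batch does not overwhelm a remote service, while the backoff gives transient failures time to clear.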
🔗 Integration Ready
- Vector Database Optimized chunks for RAG applications
- 7+ Cloud Storage integrations (S3, SharePoint, Google Drive, etc.)
- Framework Agnostic, works with Django, Flask, and FastAPI
- Rich Metadata extraction including images, formatting, and structure
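As one possible way to hand chunks to a vector database, the sketch below wraps plain text chunks in records with a stable ID and basic metadata. The record layout and the `to_vector_records` helper are assumptions for illustration; they are not a Cerevox format or the schema of any particular vector store.

```python
import hashlib


def to_vector_records(chunks, source):
    """Wrap text chunks in dict records with deterministic IDs and metadata.

    Hypothetical helper; adapt the field names to your vector store's schema.
    """
    records = []
    for index, text in enumerate(chunks):
        # Hash source, position, and text so re-ingesting the same file
        # produces the same IDs and can overwrite instead of duplicating.
        digest = hashlib.sha256(f"{source}:{index}:{text}".encode("utf-8")).hexdigest()
        records.append(
            {
                "id": digest[:16],
                "text": text,
                "metadata": {
                    "source": source,
                    "chunk_index": index,
                    "char_count": len(text),
                },
            }
        )
    return records
```

Each record still needs an embedding vector, computed with the model of your choice, before it can be upserted.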
📋 Examples
Explore comprehensive examples in the examples/ directory.
🚀 Run Examples
git clone https://github.com/CerevoxAI/cerevox-python.git
cd cerevox-python
export CEREVOX_API_KEY="your-api-key"
python examples/lexa_examples.py
python examples/vector_db_preparation.py
python examples/async_examples.py
python examples/document_examples.py
python examples/cloud_integrations.py
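In your own scripts, the exported key can be read from the environment instead of being hard-coded. A minimal sketch follows; `load_api_key` is a hypothetical helper, not part of the SDK, and only the `CEREVOX_API_KEY` variable name comes from the commands above.

```python
import os


def load_api_key(env=None, name="CEREVOX_API_KEY"):
    """Return the API key from the environment, failing loudly if it is unset.

    Hypothetical helper; `env` defaults to os.environ and exists for testing.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the examples")
    return key
```

The returned value can then be passed as `Lexa(api_key=load_api_key())`, which keeps credentials out of source control.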
📚 Documentation
📖 Guides & Tutorials
🔗 External Resources
🤝 Contributing
We welcome contributions! Please see our Contributing Guide for details.
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
📖 Resources | 💬 Get Help | 🐛 Issues
⭐ Star us on GitHub if Cerevox helped your project!
Made with ❤️ by the Cerevox team
Happy Parsing 🔍 ✨