Evolvis AI - Empowering Innovation Through AI

Evolvishub Data Loader

A robust, asynchronous data loading and processing framework designed for handling various file formats and database integrations.

Company: Evolvis AI

Author: Alban Maxhuni, PhD
Email: a.maxhuni@evolvis.ai

Features

  • Multi-Format Support: Process Excel, CSV, JSON, and custom file formats
  • Asynchronous Processing: Built with Python's asyncio for efficient I/O operations
  • Configurable: YAML and INI configuration support
  • Database Integration: SQLite and PostgreSQL support
  • Error Handling: Comprehensive error handling and logging
  • File Management: Automatic file movement and organization
  • Notification System: Integrated notifications for process updates
  • Extensible: Easy to add new processors and validators

Installation

  • Clone the repository:
git clone https://github.com/yourusername/evolvishub-dataloader.git
cd evolvishub-dataloader
  • Create and activate a virtual environment:
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  • Install dependencies:
pip install -r requirements.txt
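
The package is also published on PyPI, so a released version can be installed directly with pip instead of cloning the repository:

pip install evolvishub-dataloader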

Configuration

The framework supports both YAML and INI configuration formats. Configuration files should be placed in the config directory.

Example YAML Configuration

# Data Types Configuration
data_types:
  types: "inventory,sales,purchases,orders,custom"

# Directory Configuration
directories:
  root: "data"
  processed: "data/processed"
  failed: "data/failed"

# Database Configuration
database:
  path: "data/database.db"
  migrations: "migrations"
  backup: "backups"

# File Processing Configuration
processing:
  move_processed: true
  add_timestamp: true
  retry_attempts: 3
  max_file_size: 10485760  # 10MB
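
The same settings can be expressed in INI form. The snippet below is an illustrative equivalent of the YAML example above, with section and option names assumed to mirror the YAML keys; see config/data_loader.ini in the repository for the authoritative layout.

[data_types]
types = inventory,sales,purchases,orders,custom

[directories]
root = data
processed = data/processed
failed = data/failed

[database]
path = data/database.db
migrations = migrations
backup = backups

[processing]
move_processed = true
add_timestamp = true
retry_attempts = 3
; 10MB
max_file_size = 10485760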

Usage

Basic Usage

from src.data_loader.generic_data_loader import GenericDataLoader
from src.data_loader.sqlite_adapter import SQLiteAdapter

async def main():
    # Initialize database adapter
    db = SQLiteAdapter()
    
    # Create data loader instance
    loader = GenericDataLoader(db)
    await loader.initialize()
    
    # Load data from a file
    results = await loader.load_data(
        source="path/to/your/file.xlsx",
        table_name="your_table_name"
    )
    
    # Process results
    for result in results:
        print(f"Status: {result['status']}")
        print(f"Records loaded: {result.get('records_loaded', 0)}")

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())
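
Since the framework emphasizes error handling and logging, a more defensive call site can wrap load_data and log failures. The broad Exception catch below is only illustrative; the exact exception types raised by load_data are not documented here.

import asyncio
import logging

from src.data_loader.generic_data_loader import GenericDataLoader
from src.data_loader.sqlite_adapter import SQLiteAdapter

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

async def load_with_logging(path, table):
    db = SQLiteAdapter()
    loader = GenericDataLoader(db)
    await loader.initialize()
    try:
        return await loader.load_data(source=path, table_name=table)
    except Exception:
        # Failed files are expected to end up under data/failed (see configuration above)
        logger.exception("Loading %s failed", path)
        return []

if __name__ == "__main__":
    asyncio.run(load_with_logging("path/to/your/file.xlsx", "your_table_name"))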

Custom Processors

You can create custom processors for specific file formats:

from src.data_loader.generic_data_loader import DataProcessor

class CustomProcessor(DataProcessor):
    async def process(self, data):
        # Your custom processing logic here
        processed_data = data
        return processed_data
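
As a concrete (hypothetical) example, a processor can normalize column names before records are written to the database. This sketch assumes process() receives the tabular data loaded from the file as a pandas DataFrame, which is not guaranteed by the snippet above.

import pandas as pd

from src.data_loader.generic_data_loader import DataProcessor

class ColumnNormalizingProcessor(DataProcessor):
    # Hypothetical processor: lower-cases and snake_cases column names
    async def process(self, data):
        if isinstance(data, pd.DataFrame):
            data = data.rename(columns=lambda c: c.strip().lower().replace(" ", "_"))
        return data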

Validation

Add custom validation to your data loading process:

async def custom_validator(data):
    # Your validation logic here
    return True

results = await loader.load_data(
    source="path/to/your/file.xlsx",
    table_name="your_table_name",
    validator=custom_validator
)
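
For example, a validator can reject files that are missing required columns. The column names below are a hypothetical schema, and the validator assumes the loaded data exposes pandas-style columns.

REQUIRED_COLUMNS = {"id", "quantity", "price"}  # hypothetical schema

async def require_columns(data):
    # Assumes the loaded data exposes pandas-style .columns
    missing = REQUIRED_COLUMNS - set(data.columns)
    if missing:
        print(f"Validation failed, missing columns: {sorted(missing)}")
        return False
    return True

results = await loader.load_data(
    source="path/to/your/file.xlsx",
    table_name="your_table_name",
    validator=require_columns
)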

Testing

Run the test suite:

PYTHONPATH=./ python -m pytest tests/ -v
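
New loaders can be covered with an async test. The sketch below assumes pytest-asyncio (or an equivalent asyncio test plugin) is installed and that SQLiteAdapter can be constructed with its defaults, as in the usage example above.

import pytest

from src.data_loader.generic_data_loader import GenericDataLoader
from src.data_loader.sqlite_adapter import SQLiteAdapter

@pytest.mark.asyncio
async def test_load_csv(tmp_path):
    # Hypothetical smoke test: write a tiny CSV and load it into a table
    sample = tmp_path / "inventory.csv"
    sample.write_text("id,quantity,price\n1,10,9.99\n")

    loader = GenericDataLoader(SQLiteAdapter())
    await loader.initialize()

    results = await loader.load_data(source=str(sample), table_name="inventory")
    assert results  # at least one result entry is expected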

Project Structure

evolvishub-dataloader/
├── config/
│   ├── data_loader.yaml
│   └── data_loader.ini
├── src/
│   └── data_loader/
│       ├── generic_data_loader.py
│       ├── sqlite_adapter.py
│       └── processors/
├── tests/
│   ├── test_generic_data_loader.py
│   ├── test_config_manager.py
│   └── test_specific_loaders.py
├── requirements.txt
└── README.md

Contributing

  • Fork the repository
  • Create your feature branch (git checkout -b feature/amazing-feature)
  • Commit your changes (git commit -m 'Add some amazing feature')
  • Push to the branch (git push origin feature/amazing-feature)
  • Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For support, please open an issue in the GitHub repository or contact the maintainers.

Acknowledgments

  • Thanks to all contributors who have helped shape this project
  • Built with SQLAlchemy
  • Powered by pandas
