Evolvis AI - Empowering Innovation Through AI

Evolvishub Data Loader

A robust, asynchronous data loading and processing framework designed for handling various file formats and database integrations.

Company: Evolvis AI

Author: Alban Maxhuni, PhD
Email: a.maxhuni@evolvis.ai

Features

  • Multi-Format Support: Process Excel, CSV, JSON, and custom file formats
  • Asynchronous Processing: Built with Python's asyncio for efficient I/O operations
  • Configurable: YAML and INI configuration support
  • Database Integration: SQLite and PostgreSQL support
  • Error Handling: Comprehensive error handling and logging
  • File Management: Automatic file movement and organization
  • Notification System: Integrated notifications for process updates
  • Extensible: Easy to add new processors and validators

Installation

  • Clone the repository:
git clone https://github.com/yourusername/evolvishub-dataloader.git
cd evolvishub-dataloader
  • Create and activate a virtual environment:
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  • Install dependencies:
pip install -r requirements.txt
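
The package is also published on PyPI, so a released version can be installed directly with pip instead of cloning the repository:

pip install evolvishub-dataloader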

Configuration

The framework supports both YAML and INI configuration formats. Configuration files should be placed in the config directory.

Example YAML Configuration

# Data Types Configuration
data_types:
  types: "inventory,sales,purchases,orders,custom"

# Directory Configuration
directories:
  root: "data"
  processed: "data/processed"
  failed: "data/failed"

# Database Configuration
database:
  path: "data/database.db"
  migrations: "migrations"
  backup: "backups"

# File Processing Configuration
processing:
  move_processed: true
  add_timestamp: true
  retry_attempts: 3
  max_file_size: 10485760  # 10MB
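
The same settings can be expressed in INI form. The snippet below is an illustrative equivalent of the YAML example above, with section and option names assumed to mirror the YAML keys; see config/data_loader.ini in the repository for the authoritative layout.

[data_types]
types = inventory,sales,purchases,orders,custom

[directories]
root = data
processed = data/processed
failed = data/failed

[database]
path = data/database.db
migrations = migrations
backup = backups

[processing]
move_processed = true
add_timestamp = true
retry_attempts = 3
; 10MB
max_file_size = 10485760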

Usage

Basic Usage

from src.data_loader.generic_data_loader import GenericDataLoader
from src.data_loader.sqlite_adapter import SQLiteAdapter

async def main():
    # Initialize database adapter
    db = SQLiteAdapter()
    
    # Create data loader instance
    loader = GenericDataLoader(db)
    await loader.initialize()
    
    # Load data from a file
    results = await loader.load_data(
        source="path/to/your/file.xlsx",
        table_name="your_table_name"
    )
    
    # Process results
    for result in results:
        print(f"Status: {result['status']}")
        print(f"Records loaded: {result.get('records_loaded', 0)}")

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())
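
Since the framework emphasizes error handling and logging, a more defensive call site can wrap load_data and log failures. The broad Exception catch below is only illustrative; the exact exception types raised by load_data are not documented here.

import asyncio
import logging

from src.data_loader.generic_data_loader import GenericDataLoader
from src.data_loader.sqlite_adapter import SQLiteAdapter

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

async def load_with_logging(path, table):
    db = SQLiteAdapter()
    loader = GenericDataLoader(db)
    await loader.initialize()
    try:
        return await loader.load_data(source=path, table_name=table)
    except Exception:
        # Failed files are expected to end up under data/failed (see configuration above)
        logger.exception("Loading %s failed", path)
        return []

if __name__ == "__main__":
    asyncio.run(load_with_logging("path/to/your/file.xlsx", "your_table_name"))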

Custom Processors

You can create custom processors for specific file formats:

from src.data_loader.generic_data_loader import DataProcessor

class CustomProcessor(DataProcessor):
    async def process(self, data):
        # Your custom processing logic here
        processed_data = data
        return processed_data
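
As a concrete (hypothetical) example, a processor can normalize column names before records are written to the database. This sketch assumes process() receives the tabular data loaded from the file as a pandas DataFrame, which is not guaranteed by the snippet above.

import pandas as pd

from src.data_loader.generic_data_loader import DataProcessor

class ColumnNormalizingProcessor(DataProcessor):
    # Hypothetical processor: lower-cases and snake_cases column names
    async def process(self, data):
        if isinstance(data, pd.DataFrame):
            data = data.rename(columns=lambda c: c.strip().lower().replace(" ", "_"))
        return data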

Validation

Add custom validation to your data loading process:

async def custom_validator(data):
    # Your validation logic here
    return True

results = await loader.load_data(
    source="path/to/your/file.xlsx",
    table_name="your_table_name",
    validator=custom_validator
)
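
For example, a validator can reject files that are missing required columns. The column names below are a hypothetical schema, and the validator assumes the loaded data exposes pandas-style columns.

REQUIRED_COLUMNS = {"id", "quantity", "price"}  # hypothetical schema

async def require_columns(data):
    # Assumes the loaded data exposes pandas-style .columns
    missing = REQUIRED_COLUMNS - set(data.columns)
    if missing:
        print(f"Validation failed, missing columns: {sorted(missing)}")
        return False
    return True

results = await loader.load_data(
    source="path/to/your/file.xlsx",
    table_name="your_table_name",
    validator=require_columns
)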

Testing

Run the test suite:

PYTHONPATH=./ python -m pytest tests/ -v
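
New loaders can be covered with an async test. The sketch below assumes pytest-asyncio (or an equivalent asyncio test plugin) is installed and that SQLiteAdapter can be constructed with its defaults, as in the usage example above.

import pytest

from src.data_loader.generic_data_loader import GenericDataLoader
from src.data_loader.sqlite_adapter import SQLiteAdapter

@pytest.mark.asyncio
async def test_load_csv(tmp_path):
    # Hypothetical smoke test: write a tiny CSV and load it into a table
    sample = tmp_path / "inventory.csv"
    sample.write_text("id,quantity,price\n1,10,9.99\n")

    loader = GenericDataLoader(SQLiteAdapter())
    await loader.initialize()

    results = await loader.load_data(source=str(sample), table_name="inventory")
    assert results  # at least one result entry is expected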

Project Structure

evolvishub-dataloader/
├── config/
│   ├── data_loader.yaml
│   └── data_loader.ini
├── src/
│   └── data_loader/
│       ├── generic_data_loader.py
│       ├── sqlite_adapter.py
│       └── processors/
├── tests/
│   ├── test_generic_data_loader.py
│   ├── test_config_manager.py
│   └── test_specific_loaders.py
├── requirements.txt
└── README.md

Contributing

  • Fork the repository
  • Create your feature branch (git checkout -b feature/amazing-feature)
  • Commit your changes (git commit -m 'Add some amazing feature')
  • Push to the branch (git push origin feature/amazing-feature)
  • Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For support, please open an issue in the GitHub repository or contact the maintainers.

Acknowledgments

  • Thanks to all contributors who have helped shape this project
  • Built with SQLAlchemy
  • Powered by pandas
