SQLAlchemy PostgreSQL Upsert
A Python library for intelligent PostgreSQL upsert operations with advanced conflict resolution and multi-threaded processing.
🚀 Features
- Multi-constraint conflict detection: Automatically handles primary key, unique constraints, and composite constraints
- Smart conflict filtering: Removes rows that would conflict with multiple existing records
- Multi-threaded processing: Parallel chunk processing for large datasets
- Configurable batch sizes: Optimize memory usage and processing speed
- Schema validation: Automatic table and column validation before operations
- Comprehensive error handling: Detailed logging and error reporting
📦 Installation
Using Poetry (Recommended)
```shell
poetry add sqlalchemy_psql_upsert
```
Using pip
```shell
pip install sqlalchemy_psql_upsert
```
Or, for an editable install from a local clone of the repository:
```shell
pip install -e .
```
🛠️ Quick Start
Basic Usage
```python
import pandas as pd

from sqlalchemy_psql_upsert import PostgresqlUpsert
from sqlalchemy_psql_upsert.config import PgConfig

# Connection settings are read from environment variables by default
config = PgConfig()
upserter = PostgresqlUpsert(config=config)

df = pd.DataFrame({
    'id': [1, 2, 3],
    'name': ['Alice', 'Bob', 'Charlie'],
    'email': ['alice@example.com', 'bob@example.com', 'charlie@example.com']
})

success = upserter.upsert_dataframe(
    dataframe=df,
    table_name='users',
    schema='public',
    chunk_size=10000,
    max_workers=4
)
```
Advanced Configuration
```python
from sqlalchemy import create_engine

from sqlalchemy_psql_upsert import PostgresqlUpsert

# Pass an existing SQLAlchemy engine instead of a PgConfig
engine = create_engine('postgresql://user:pass@localhost:5432/mydb')
upserter = PostgresqlUpsert(engine=engine, debug=True)

upserter.upsert_dataframe(
    dataframe=large_df,
    table_name='products',
    schema='inventory',
    chunk_size=5000,
    max_workers=8,
    remove_multi_conflict_rows=True
)
```
⚙️ Configuration
Environment Variables
Create a `.env` file or set the following environment variables:
```shell
PGSQL_HOST=localhost
PGSQL_PORT=5432
PGSQL_USER=your_username
PGSQL_PASS=your_password
PGSQL_NAME=your_database
```
Configuration Class
```python
from sqlalchemy_psql_upsert.config import PgConfig

# Load settings from environment variables / .env
config = PgConfig()

# Or configure explicitly
config = PgConfig(
    host="localhost",
    port="5432",
    user="myuser",
    password="mypass",
    dbname="mydb"
)

print(config.uri())
```
🔍 How It Works
Constraint Detection
The library automatically analyzes your target table to identify:
- Primary key constraints
- Unique constraints
- Composite unique constraints
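This kind of metadata lookup is what SQLAlchemy's `Inspector` API provides. The sketch below is illustrative rather than the library's actual code, and it uses an in-memory SQLite table purely so it runs anywhere; the library itself targets PostgreSQL:

```python
from sqlalchemy import create_engine, inspect, text

# Illustrative only: an in-memory SQLite table standing in for a PostgreSQL one
engine = create_engine('sqlite://')
with engine.begin() as conn:
    conn.execute(text("""
        CREATE TABLE users (
            id INTEGER PRIMARY KEY,
            email TEXT,
            doc_type TEXT,
            doc_number TEXT,
            UNIQUE (email),
            UNIQUE (doc_type, doc_number)
        )
    """))

insp = inspect(engine)
pk_cols = insp.get_pk_constraint('users')['constrained_columns']
unique_sets = [c['column_names'] for c in insp.get_unique_constraints('users')]
print(pk_cols)      # ['id']
print(unique_sets)
```

Each entry in `unique_sets` is a list of column names, so single-column and composite unique constraints fall out of the same call.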
Conflict Resolution Process
- Constraint Analysis: Identifies all relevant constraints on the target table
- Conflict Detection: For each constraint, finds DataFrame rows that would conflict with existing data
- Multi-Conflict Filtering: Removes rows that would match multiple existing records (ambiguous conflicts)
`ON CONFLICT` clause with appropriate constraint targeting
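The resulting SQL corresponds to SQLAlchemy's PostgreSQL-dialect `insert(...).on_conflict_do_update(...)` construct. A minimal sketch (not the library's actual code) of what a constraint-targeted upsert statement looks like:

```python
from sqlalchemy import Column, Integer, MetaData, String, Table
from sqlalchemy.dialects import postgresql

metadata = MetaData()
users = Table(
    'users', metadata,
    Column('id', Integer, primary_key=True),
    Column('name', String),
    Column('email', String, unique=True),
)

stmt = postgresql.insert(users).values(id=1, name='Alice', email='alice@example.com')
# Target the primary-key constraint; overwrite the other columns on conflict
stmt = stmt.on_conflict_do_update(
    index_elements=['id'],
    set_={'name': stmt.excluded.name, 'email': stmt.excluded.email},
)
sql = str(stmt.compile(dialect=postgresql.dialect()))
print(sql)
```

The compiled statement contains `ON CONFLICT (id) DO UPDATE SET ...`, referencing the special `excluded` pseudo-table that holds the incoming row's values.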
Example Conflict Scenarios
Consider a table with these constraints:
- Primary key: `id`
- Unique constraint: `email`
- Composite unique constraint: `(doc_type, doc_number)`
```python
# Incoming row
{'id': 1, 'email': 'new@example.com', 'doc_type': 'CPF', 'doc_number': '123'}

# Existing row: conflicts with the incoming row on the primary key `id` only
{'id': 1, 'email': 'existing@example.com', 'doc_type': 'RG', 'doc_number': '456'}
```
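To make the multi-conflict case concrete, here is a self-contained sketch (plain dicts and a hypothetical helper, not the library's API) of a row that matches two different existing records through two different constraints and is therefore filtered out:

```python
existing = [
    {'id': 1, 'email': 'a@example.com', 'doc_type': 'CPF', 'doc_number': '123'},
    {'id': 2, 'email': 'b@example.com', 'doc_type': 'RG', 'doc_number': '456'},
]

# Matches id=1 via the email constraint and id=2 via (doc_type, doc_number)
incoming = {'id': 3, 'email': 'a@example.com', 'doc_type': 'RG', 'doc_number': '456'}

def conflicting_ids(row, existing):
    """Collect the ids of existing rows this row would conflict with."""
    hits = set()
    for ex in existing:
        if (row['id'] == ex['id']
                or row['email'] == ex['email']
                or (row['doc_type'], row['doc_number'])
                   == (ex['doc_type'], ex['doc_number'])):
            hits.add(ex['id'])
    return hits

hits = conflicting_ids(incoming, existing)
print(hits)  # {1, 2}: ambiguous, so the row is dropped before the upsert
```

A single `ON CONFLICT` clause can only resolve against one existing row, which is why such ambiguous rows cannot be upserted safely.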
🚨 Limitations & Considerations
Current Limitations
- PostgreSQL only: Currently supports PostgreSQL databases exclusively
- Memory usage: Large datasets are processed in memory (chunked processing helps)
- Complex constraints: Some exotic PostgreSQL constraint types may not be fully supported
- Transaction scope: Each chunk is processed in its own transaction
Best Practices
- Chunk sizing: Start with 10,000 rows per chunk, adjust based on your data and hardware
- Worker count: Use 2-4 workers per CPU core, but test with your specific workload
- Memory monitoring: Monitor memory usage with large datasets
- Index considerations: Ensure proper indexing on conflict columns for optimal performance
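The interaction between `chunk_size` and `max_workers` can be pictured with a short sketch (hypothetical helper names, not the library's internals): the DataFrame is sliced into chunks and each chunk is handed to a thread-pool worker:

```python
from concurrent.futures import ThreadPoolExecutor

import pandas as pd

def iter_chunks(df, chunk_size):
    """Yield successive slices of at most chunk_size rows."""
    for start in range(0, len(df), chunk_size):
        yield df.iloc[start:start + chunk_size]

def upsert_chunk(chunk):
    # Stand-in for the real per-chunk upsert, which runs in its own transaction
    return len(chunk)

df = pd.DataFrame({'id': range(25)})
with ThreadPoolExecutor(max_workers=4) as pool:
    sizes = list(pool.map(upsert_chunk, iter_chunks(df, chunk_size=10)))
print(sizes)  # [10, 10, 5]
```

Larger chunks mean fewer round trips but more memory per worker; more workers increase throughput until the database becomes the bottleneck.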
🤝 Contributing
We welcome contributions! Here's how to get started:
1. Fork the repository
2. Create a feature branch: `git checkout -b feature/amazing-feature`
3. Make your changes and add tests
4. Run the test suite: `pytest tests/ -v`
5. Submit a pull request
📝 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙋 Support