crdb-dump

A feature-rich CLI for exporting and importing CockroachDB schemas and data. Includes support for parallel chunked exports, manifest checksums, BYTES/UUID/ARRAY types, permission introspection, secure resumable imports, S3-compatible storage (MinIO, Cohesity), region-aware filtering, and automatic retry logic.

🚀 Features

  • ✅ Schema export: tables, views, sequences, enums
  • ✅ Data export: CSV or SQL with chunking, gzip, and ordering
  • ✅ Types: handles BYTES, UUIDs, STRING[], TIMESTAMP, enums
  • ✅ Schema output formats: sql, json, yaml
  • ✅ Resumable COPY-based imports with chunk-level tracking
  • ✅ Permission exports: roles, grants, role memberships
  • ✅ Parallel loading (--parallel-load) and manifest verification
  • ✅ Dry-run for schema or chunk loading
  • ✅ TLS and insecure auth supported
  • ✅ Schema diff support (--diff)
  • ✅ Full logging via logs/crdb_dump.log
  • ✅ Automatic retry logic with exponential backoff for transient failures
  • ✅ Fault-tolerant, resumable imports with --resume-log or --resume-log-dir
  • ✅ Region-aware export/import via --region
  • ✅ S3-compatible support (--use-s3) with MinIO, Cohesity, or AWS
  • ✅ CSV header validation (--validate-csv)
  • ✅ Python-based S3 bucket creation (via boto3) for MinIO

📦 Installation

pip install crdb-dump

🧪 Local Testing

./test-local.sh

This script will:

  • Start a multi-region demo CockroachDB cluster
  • Create test schema + data
  • Export schema and chunked data (CSV)
  • Verify chunk checksums
  • Dry-run and real import with retry/resume
  • Upload chunks to MinIO (S3-compatible)
  • Download and verify import from S3
  • Use Python (boto3) to create S3 buckets
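
The bucket-creation step in the last bullet boils down to a plain boto3 call. A minimal sketch, assuming the MinIO endpoint and credentials used in the S3 example later in this README (the actual code in test-local.sh may differ):

import boto3

# MinIO speaks the S3 API, so a standard boto3 client pointed at the
# local endpoint is enough to create the bucket before uploading chunks.
s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9000",
    aws_access_key_id="minioadmin",
    aws_secret_access_key="minioadmin",
)
s3.create_bucket(Bucket="crdb-test-bucket")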

🔧 CLI Overview

crdb-dump --help
crdb-dump export --help
crdb-dump load --help

Example usage:

crdb-dump export --db=mydb --data --per-table
crdb-dump load --db=mydb --schema=... --data-dir=... --resume-log=resume.json

🔐 Connection

export CRDB_URL="cockroachdb://root@localhost:26257/defaultdb?sslmode=disable"
# or
export CRDB_URL="postgresql://root@localhost:26257/defaultdb?sslmode=disable"

Alternatively:

--db mydb --host localhost --certs-dir ~/certs

Use --print-connection to verify the resolved connection URL.

🏗 Export Options

crdb-dump export \
  --db=mydb \
  --per-table \
  --data \
  --data-format=csv \
  --chunk-size=1000 \
  --data-order=id \
  --data-compress \
  --data-parallel \
  --verify \
  --include-permissions \
  --archive

Schema Output

Option                 Description
--per-table            One file per object (e.g., table_users.sql)
--format               Output format: sql, json, yaml
--diff                 Show schema diff vs previous .sql file
--tables               Comma-separated fully qualified names to include
--exclude-tables       Skip specific fully qualified table names
--include-permissions  Export roles, grants, and memberships
--region               Only export tables matching this region

Data Export

Option             Description
--data             Enable data export
--data-format      Format: csv or sql
--chunk-size       Number of rows per chunk
--data-split       Output one file per table
--data-compress    Output .csv.gz
--data-order       Order rows by column(s)
--data-order-desc  Use descending order
--data-parallel    Parallel export across tables
--verify           Verify chunk checksums (see sketch after this table)
--region           Filter tables by region in manifests
--use-s3           Upload exported chunks to S3
--s3-bucket        S3 bucket name
--s3-prefix        Key prefix under which to store chunks
--s3-endpoint      S3-compatible endpoint URL
--s3-access-key    S3 access key (can use env)
--s3-secret-key    S3 secret key (can use env)
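
The --verify option compares each exported chunk against the checksums recorded in the manifest. A rough illustration of the idea, assuming a hypothetical manifest layout that maps chunk filenames to SHA-256 digests (the tool's actual manifest format may differ):

import hashlib
import json
from pathlib import Path

def verify_chunks(manifest_path: str) -> None:
    # Hypothetical layout: {"chunks": {"users_0001.csv.gz": "<sha256 hex>"}}
    manifest = json.loads(Path(manifest_path).read_text())
    base = Path(manifest_path).parent
    for name, expected in manifest["chunks"].items():
        actual = hashlib.sha256((base / name).read_bytes()).hexdigest()
        if actual != expected:
            raise ValueError(f"checksum mismatch for {name}")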

⛓ Import Options

crdb-dump load \
  --db=mydb \
  --schema=crdb_dump_output/mydb/mydb_schema.sql \
  --data-dir=crdb_dump_output/mydb \
  --resume-log=resume.json \
  --validate-csv \
  --parallel-load \
  --print-connection

Option            Description
--schema          .sql file to apply
--data-dir        Folder containing chunked CSV + manifests
--resume-log      Track loaded chunks in a single JSON file
--resume-log-dir  Per-table resume logs (e.g., resume/users.json)
--validate-csv    Ensure chunk headers match DB schema (see sketch after this table)
--parallel-load   Load chunks in parallel
--region          Only import chunks from matching region
--dry-run         Print actions but don't execute
--use-s3          Download chunks from S3
--s3-bucket       S3 bucket name
--s3-prefix       Path prefix inside the bucket
--s3-endpoint     S3-compatible endpoint (MinIO, Cohesity)
--s3-access-key   S3 access key
--s3-secret-key   S3 secret key
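
Header validation (--validate-csv) conceptually reduces to comparing a chunk's first row against the table's column list. A minimal sketch; in practice the expected columns would come from the live database schema, which this illustration simply takes as a parameter:

import csv

def validate_header(csv_path: str, expected_columns: list[str]) -> None:
    # Read only the header row of the chunk.
    with open(csv_path, newline="") as f:
        header = next(csv.reader(f))
    if header != expected_columns:
        raise ValueError(f"header {header} does not match schema {expected_columns}")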

🔄 Fault Tolerance & Resume Support

  • ✅ Retries failed operations with exponential backoff

  • ✅ Resumable imports:

    • --resume-log (single file)
    • --resume-log-dir (per-table)
    • --resume-strict (abort on failure)

Resume state is written after each successful chunk, so restarts are safe and idempotent.
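
The retry behavior described above follows the standard exponential-backoff pattern. A minimal sketch of the pattern itself, not the tool's internal implementation:

import random
import time

def with_retries(op, max_attempts=5, base_delay=0.5):
    # Real code would narrow the except clause to transient errors
    # (e.g., connection resets), not every exception.
    for attempt in range(1, max_attempts + 1):
        try:
            return op()
        except Exception:
            if attempt == max_attempts:
                raise
            # Delays of 0.5s, 1s, 2s, ... plus jitter to spread out retries.
            time.sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1))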

☁️ S3 / MinIO / Cohesity Example

crdb-dump export \
  --db=mydb \
  --per-table \
  --data \
  --chunk-size=1000 \
  --data-format=csv \
  --use-s3 \
  --s3-bucket=crdb-test-bucket \
  --s3-endpoint=http://localhost:9000 \
  --s3-access-key=minioadmin \
  --s3-secret-key=minioadmin \
  --s3-prefix=test1/ \
  --out-dir=crdb_dump_output

crdb-dump load \
  --db=mydb \
  --data-dir=crdb_dump_output/mydb \
  --resume-log-dir=resume/ \
  --parallel-load \
  --validate-csv \
  --use-s3 \
  --s3-bucket=crdb-test-bucket \
  --s3-endpoint=http://localhost:9000 \
  --s3-access-key=minioadmin \
  --s3-secret-key=minioadmin \
  --s3-prefix=test1/

🔍 Schema Diff Example

crdb-dump export --db=mydb --diff=old_schema.sql

Output:

crdb_dump_output/mydb/mydb_schema.diff

🧪 Testing

pytest -m unit
pytest -m integration
./test-local.sh

❤️ Contributing

Pull requests welcome! Star ⭐ the repo, file issues, or request features at:

👉 https://github.com/viragtripathi/crdb-dump/issues
