
pycharter
A Python package for data contract management with six core services: contract parsing, contract building, metadata storage, Pydantic model generation, JSON Schema conversion, and runtime validation
Dynamically generate Pydantic models from JSON schemas with coercion and validation support
PyCharter is a Python library that automatically converts JSON schemas into fully functional Pydantic models. It supports the JSON Schema Draft 2020-12 standard, including all standard validation keywords (minLength, maxLength, pattern, enum, minimum, maximum, etc.), and adds extensions for pre-validation coercion and post-validation checks. It handles nested objects, arrays, and custom validators, with all validation logic stored as data rather than Python code. PyCharter also provides a complete data contract management system with versioning, metadata storage, and runtime validation.
pip install pycharter
pip install pycharter[api]
This installs FastAPI and Uvicorn for running the REST API server.
pip install pycharter[ui]
This installs the Python dependencies and pre-built UI static files (like Airflow).
After installation, you can immediately start the UI:
pycharter ui serve # Production mode (uses pre-built static files)
For development (if you have the source code):
cd ui
npm install # Install Node.js dependencies
pycharter ui dev # Development mode with hot reload
Note: When installed from pip, the UI works immediately without Node.js. For development, Node.js is required. See ui/INSTALLATION.md for detailed instructions.
from pycharter import from_dict
# Define your JSON schema
schema = {
"type": "object",
"properties": {
"name": {"type": "string"},
"age": {"type": "integer"},
"email": {"type": "string"}
},
"required": ["name", "age"]
}
# Generate a Pydantic model
Person = from_dict(schema, "Person")
# Use it like any Pydantic model
person = Person(name="Alice", age=30, email="alice@example.com")
print(person.name) # Output: Alice
print(person.age) # Output: 30
PyCharter provides six core services that work together to support a complete data production journey, from contract specification to runtime validation. Each service plays a critical role in managing data contracts and ensuring data quality throughout your pipeline.
The typical data production workflow follows this path:
1. Data Contract Specification
↓
2. Contract Parsing
↓
3. Metadata Storage
↓
4. Pydantic Model Generation
↓
5. Runtime Validation
Contract Parser (pycharter.contract_parser)
Purpose: Reads and decomposes data contract files into structured metadata components.
When to Use: At the beginning of your data production journey, when you have data contract files (YAML or JSON) that need to be processed and understood.
How It Works:
- Extracts the schema, governance_rules, ownership, and metadata components
- Returns a ContractMetadata object that separates concerns and makes each component accessible
Example:
from pycharter import parse_contract_file, ContractMetadata
# Parse a contract file (YAML or JSON)
metadata = parse_contract_file("data_contract.yaml")
# Access decomposed components
schema = metadata.schema # JSON Schema definition
governance = metadata.governance_rules # Governance policies
ownership = metadata.ownership # Owner/team information
metadata_info = metadata.metadata # Additional metadata
versions = metadata.versions # Component versions
Contribution to Journey: The contract parser is the entry point that takes raw contract specifications and prepares them for downstream processing. It ensures that contracts are properly structured and that all components (schema, governance, ownership) are separated for independent handling.
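To make the parser's input concrete, here is a hypothetical data_contract.yaml sketch. The key names and nesting are assumptions inferred from the components the parser decomposes (schema, governance_rules, ownership, metadata), not a canonical PyCharter contract layout.

```yaml
# Hypothetical contract layout -- key names are illustrative only.
schema:
  type: object
  properties:
    name:
      type: string
    age:
      type: integer
  required: [name, age]
governance_rules:
  pii_rule:
    type: encrypt
ownership:
  owner: data-team
  team: engineering
metadata:
  description: User contract
  version: "1.0.0"
```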
Contract Builder (pycharter.contract_builder)
Purpose: Constructs consolidated data contracts from separate artifacts (schema, coercion rules, validation rules, metadata).
When to Use: When you have separate artifacts stored independently and need to combine them into a single consolidated contract for runtime validation or distribution.
How It Works:
Example:
from pycharter import build_contract, build_contract_from_store, ContractArtifacts
# Build from separate artifacts
artifacts = ContractArtifacts(
schema={"type": "object", "version": "1.0.0", "properties": {...}},
coercion_rules={"version": "1.0.0", "rules": {"age": "coerce_to_integer"}},
validation_rules={"version": "1.0.0", "rules": {"age": {"is_positive": {...}}}},
metadata={"version": "1.0.0", "description": "User contract"},
ownership={"owner": "data-team", "team": "engineering"},
)
contract = build_contract(artifacts)
# Contract now has:
# - schema with rules merged
# - metadata, ownership, governance_rules
# - versions tracking all components
# Or build from metadata store
contract = build_contract_from_store(store, "user_schema_v1")
# Use for validation
from pycharter import validate_with_contract
result = validate_with_contract(contract, {"name": "Alice", "age": "30"})
Contribution to Journey: The contract builder is the consolidation layer that combines separate artifacts (stored independently in the database) into a single contract artifact. This consolidated contract tracks all component versions and can be used for runtime validation, distribution, or archival purposes.
Metadata Store (pycharter.metadata_store)
Purpose: Manages persistent storage and retrieval of decomposed metadata in databases.
When to Use: After parsing contracts, when you need to store metadata components (schemas, governance rules, ownership) in a database for versioning, querying, and governance.
How It Works:
Available Implementations:
Example:
from pycharter import PostgresMetadataStore, parse_contract_file
# Parse contract
metadata = parse_contract_file("contract.yaml")
# Use PostgreSQL metadata store (or MongoDBMetadataStore, RedisMetadataStore, etc.)
store = PostgresMetadataStore(connection_string="postgresql://user:pass@localhost:5432/pycharter")
store.connect()
# Store decomposed components
schema_id = store.store_schema("user_schema", metadata.schema, version="1.0")
# Store metadata (including ownership and governance rules)
metadata_dict = {
"business_owners": ["data-team@example.com"],
"governance_rules": {"pii_rule": {"type": "encrypt"}}
}
store.store_metadata(schema_id, metadata_dict, "schema")
# Store coercion and validation rules
store.store_coercion_rules(schema_id, {"age": "coerce_to_integer"}, version="1.0")
store.store_validation_rules(schema_id, {"age": {"is_positive": {}}}, version="1.0")
# Retrieve later
stored_schema = store.get_schema(schema_id)
coercion_rules = store.get_coercion_rules(schema_id)
validation_rules = store.get_validation_rules(schema_id)
Contribution to Journey: The metadata store is the persistence layer that ensures contracts and their components are versioned, searchable, and accessible across your organization. It enables governance, audit trails, and schema evolution tracking.
See Configuration Guide for database setup and initialization instructions.
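The versioning idea behind the store can be sketched in a few lines of plain Python with no database and no PyCharter imports. TinyMetadataStore and its (name, version) keying are illustrative assumptions, not the library's actual implementation, which persists to PostgreSQL, MongoDB, Redis, etc.

```python
# Illustrative sketch of version-keyed metadata storage: each stored schema
# is keyed by (name, version), so schema evolution stays traceable.

class TinyMetadataStore:
    def __init__(self):
        self._schemas = {}

    def store_schema(self, name, schema, version):
        schema_id = (name, version)        # version is part of the key
        self._schemas[schema_id] = schema
        return schema_id

    def get_schema(self, schema_id):
        return self._schemas[schema_id]

store = TinyMetadataStore()
v1 = store.store_schema("user", {"properties": {"name": {"type": "string"}}}, version="1.0")
v2 = store.store_schema("user", {"properties": {"name": {"type": "string"},
                                                "age": {"type": "integer"}}}, version="2.0")
print(store.get_schema(v1))  # the v1.0 schema, unchanged by the later version
```

Because older versions stay retrievable, downstream consumers can pin a schema version while producers evolve the contract.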
Pydantic Generator (pycharter.pydantic_generator)
Purpose: Dynamically generates fully functional Pydantic models from JSON Schema definitions.
When to Use: After storing schemas (or directly from parsed contracts), when you need to generate Python models for type-safe data validation and processing.
How It Works:
Example:
from pycharter import from_dict, generate_model_file, MetadataStoreClient
# Option 1: Generate from parsed contract
metadata = parse_contract_file("contract.yaml")
UserModel = from_dict(metadata.schema, "User")
# Option 2: Generate from stored schema
client = MetadataStoreClient(...)
schema = client.get_schema("user_schema_v1")
UserModel = from_dict(schema, "User")
# Option 3: Generate and save to file
generate_model_file(schema, "user_model.py", "User")
Contribution to Journey: The Pydantic generator is the transformation engine that converts declarative JSON Schema definitions into executable Python models. It bridges the gap between contract specifications (data) and runtime validation (code), enabling type-safe data processing.
JSON Schema Converter (pycharter.json_schema_converter)
Purpose: Converts existing Pydantic models back into JSON Schema format (reverse conversion).
When to Use: When you have existing Pydantic models and need to generate JSON Schema definitions, or when you want to round-trip between schemas and models.
How It Works:
Example:
from pycharter import to_dict, to_file, to_json
from pydantic import BaseModel
class Product(BaseModel):
name: str
price: float
in_stock: bool = True
# Convert to JSON Schema
schema = to_dict(Product)
json_string = to_json(Product)
to_file(Product, "product_schema.json")
# Now you can use the schema with other services
ProductModel = from_dict(schema, "Product") # Round-trip
Contribution to Journey: The JSON Schema converter enables bidirectional conversion between models and schemas, which is useful for generating schema documentation from existing models and for round-tripping between contract schemas and Pydantic models.
Runtime Validator (pycharter.runtime_validator)
Purpose: Lightweight validation utility for validating data against generated Pydantic models in production data pipelines.
When to Use: In your data processing scripts, ETL pipelines, API endpoints, or any place where you need to validate incoming data against contract specifications.
How It Works:
- Returns a ValidationResult with the validation status, the validated data, and any errors
Two Validation Modes:
Database-Backed Validation (with metadata store):
- validate_with_store(), validate_batch_with_store(), get_model_from_store()
Contract-Based Validation (no database required):
- validate_with_contract(), validate_batch_with_contract(), get_model_from_contract()
Example - Database-Backed:
from pycharter import validate_with_store, InMemoryMetadataStore
# Store and validate with database
store = InMemoryMetadataStore()
store.connect()
# ... store schema, rules, etc. ...
# Validate using store
result = validate_with_store(store, "user_schema_v1", {"name": "Alice", "age": 30})
if result.is_valid:
print(f"Valid user: {result.data.name}")
Example - Contract-Based (No Database):
from pycharter import validate_with_contract, get_model_from_contract, validate
# Validate directly from contract file (simplest)
result = validate_with_contract(
"data/examples/book/book_contract.yaml",
{"isbn": "1234567890", "title": "Book", ...}
)
# Or get model once, validate multiple times (efficient)
BookModel = get_model_from_contract("book_contract.yaml")
result1 = validate(BookModel, data1)
result2 = validate(BookModel, data2)
# Or from dictionary
contract = {
"schema": {"type": "object", "properties": {...}},
"coercion_rules": {"rules": {...}},
"validation_rules": {"rules": {...}}
}
result = validate_with_contract(contract, data)
Contribution to Journey: The runtime validator is the enforcement layer that ensures data quality in production. It validates actual data against contract specifications, catching violations early and preventing bad data from propagating through your systems. It supports both database-backed workflows (for production systems with metadata stores) and contract-based workflows (for simpler use cases without database dependencies).
Here's how all six services work together in a complete data production journey:
from pycharter import (
parse_contract_file,
PostgresMetadataStore,
from_dict,
validate,
to_dict
)
# Step 1: Parse contract specification
metadata = parse_contract_file("user_contract.yaml")
# Step 2: Store metadata in database
store = PostgresMetadataStore(connection_string="postgresql://user:pass@localhost:5432/pycharter")
store.connect()
schema_id = store.store_schema("user", metadata.schema, version="1.0")
# Store metadata (including ownership and governance rules)
metadata_dict = {
"business_owners": ["data-team@example.com"],
"governance_rules": {"pii_rule": {"type": "encrypt"}}
}
store.store_metadata(schema_id, metadata_dict, "schema")
# Store coercion and validation rules
store.store_coercion_rules(schema_id, {"age": "coerce_to_integer"}, version="1.0")
store.store_validation_rules(schema_id, {"age": {"is_positive": {}}}, version="1.0")
# Step 3: Generate Pydantic model from stored schema
schema = store.get_schema(schema_id)
UserModel = from_dict(schema, "User")
# Step 4: (Optional) Convert model back to schema for documentation
schema_doc = to_dict(UserModel)
# Step 5: Validate data in production pipeline
def process_user_data(raw_data):
result = validate(UserModel, raw_data)
if result.is_valid:
# Process validated data
return result.data
else:
# Handle validation errors
raise ValueError(f"Invalid data: {result.errors}")
REST API (api/)
Purpose: Exposes all PyCharter services as REST API endpoints.
When to Use: When you need to use PyCharter from non-Python applications, microservices, or want to provide a web-based interface.
How It Works:
- Runs as a separate FastAPI application (api/)
Example:
# Start the API server
pycharter-api
# Or with uvicorn
uvicorn api.main:app --reload
Endpoints:
- POST /api/v1/contracts/parse - Parse a data contract
- POST /api/v1/contracts/build - Build contract from store
- POST /api/v1/metadata/schemas - Store a schema
- GET /api/v1/metadata/schemas/{schema_id} - Get a schema
- POST /api/v1/schemas/generate - Generate Pydantic model
- POST /api/v1/validation/validate - Validate data
- POST /api/v1/validation/validate-batch - Batch validation
Documentation: See api/README.md for complete API documentation.
| Service | Input | Output | Journey Stage |
|---|---|---|---|
| Contract Parser | Contract files (YAML/JSON) | ContractMetadata | Contract Specification → Parsing |
| Contract Builder | Separate artifacts or store | Consolidated contract | Storage → Consolidation |
| Metadata Store | ContractMetadata | Stored metadata (DB) | Parsing → Storage |
| Pydantic Generator | JSON Schema | Pydantic models | Storage → Model Generation |
| JSON Schema Converter | Pydantic models | JSON Schema | (Bidirectional) |
| Runtime Validator | Pydantic models + Data | ValidationResult | Model Generation → Validation |
Each service is designed to be independent yet composable, allowing you to use them individually or together as part of a complete data contract management system.
from pycharter import from_dict, from_json, from_file
# From dictionary
schema = {
"type": "object",
"properties": {
"title": {"type": "string"},
"published": {"type": "boolean", "default": False}
}
}
Article = from_dict(schema, "Article")
# From JSON string
schema_json = '{"type": "object", "properties": {"name": {"type": "string"}}}'
User = from_json(schema_json, "User")
# From file
Product = from_file("product_schema.json", "Product")
from pycharter import from_dict
schema = {
"type": "object",
"properties": {
"name": {"type": "string"},
"address": {
"type": "object",
"properties": {
"street": {"type": "string"},
"city": {"type": "string"},
"zipcode": {"type": "string"}
}
}
}
}
Person = from_dict(schema, "Person")
person = Person(
name="Alice",
address={
"street": "123 Main St",
"city": "New York",
"zipcode": "10001"
}
)
print(person.address.city) # Output: New York
from pycharter import from_dict
schema = {
"type": "object",
"properties": {
"tags": {
"type": "array",
"items": {"type": "string"}
},
"items": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {"type": "string"},
"price": {"type": "number"}
}
}
}
}
}
Cart = from_dict(schema, "Cart")
cart = Cart(
tags=["python", "pydantic"],
items=[
{"name": "Apple", "price": 1.50},
{"name": "Banana", "price": 0.75}
]
)
print(cart.items[0].name) # Output: Apple
PyCharter supports coercion (pre-validation transformation) and validation (post-validation checks):
from pycharter import from_dict
schema = {
"type": "object",
"properties": {
"flight_number": {
"type": "integer",
"coercion": "coerce_to_integer" # Convert string/float to int
},
"destination": {
"type": "string",
"coercion": "coerce_to_string",
"validations": {
"min_length": {"threshold": 3},
"max_length": {"threshold": 3},
"no_capital_characters": None,
"only_allow": {"allowed_values": ["abc", "def", "ghi"]}
}
},
"distance": {
"type": "number",
"coercion": "coerce_to_float",
"validations": {
"greater_than_or_equal_to": {"threshold": 0}
}
}
}
}
Flight = from_dict(schema, "Flight")
# Coercion happens automatically
flight = Flight(
flight_number="123", # Coerced to int: 123
destination="abc", # Passes all validations
distance="100.5" # Coerced to float: 100.5
)
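The ordering matters: coercion runs before the validations ever see the value. This can be sketched in plain Python with no PyCharter dependency; the helper names mirror the built-in rule names used above but are standalone reimplementations for illustration.

```python
# Plain-Python sketch of the coerce-then-validate pipeline.

def coerce_to_float(value):
    return float(value)                      # pre-validation transformation

def greater_than_or_equal_to(threshold):
    def check(value):                        # post-validation check
        if value < threshold:
            raise ValueError(f"{value} is below the minimum of {threshold}")
        return value
    return check

def apply_field(raw, coercion, validations):
    value = coercion(raw)                    # 1. coerce the raw input
    for validate in validations:             # 2. run checks on the coerced value
        value = validate(value)
    return value

distance = apply_field("100.5", coerce_to_float, [greater_than_or_equal_to(0)])
print(distance)  # 100.5
```

Because the checks receive the coerced value, a rule like greater_than_or_equal_to can compare numerically even when the raw input was a string.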
PyCharter supports all standard JSON Schema Draft 2020-12 validation keywords:
| Keyword | Type | Description | Example |
|---|---|---|---|
| minLength | string | Minimum string length | {"minLength": 3} |
| maxLength | string | Maximum string length | {"maxLength": 10} |
| pattern | string | Regular expression pattern | {"pattern": "^[a-z]+$"} |
| enum | any | Allowed values | {"enum": ["a", "b", "c"]} |
| const | any | Single allowed value | {"const": "fixed"} |
| minimum | number | Minimum value (inclusive) | {"minimum": 0} |
| maximum | number | Maximum value (inclusive) | {"maximum": 100} |
| exclusiveMinimum | number | Minimum value (exclusive) | {"exclusiveMinimum": 0} |
| exclusiveMaximum | number | Maximum value (exclusive) | {"exclusiveMaximum": 100} |
| multipleOf | number | Must be a multiple of the given number | {"multipleOf": 2} |
| minItems | array | Minimum array length | {"minItems": 1} |
| maxItems | array | Maximum array length | {"maxItems": 10} |
| uniqueItems | array | Array items must be unique | {"uniqueItems": true} |
All schemas are validated against JSON Schema standard before processing, ensuring compliance.
| Coercion | Description |
|---|---|
| coerce_to_string | Convert int, float, bool, datetime, dict, list to string |
| coerce_to_integer | Convert float, string (numeric), bool, datetime to int |
| coerce_to_float | Convert int, string (numeric), bool to float |
| coerce_to_boolean | Convert int, string to bool |
| coerce_to_datetime | Convert string (ISO format), timestamp to datetime |
| coerce_to_date | Convert string (date format), datetime to date (date only, no time) |
| coerce_to_uuid | Convert string to UUID |
| coerce_to_lowercase | Convert string to lowercase |
| coerce_to_uppercase | Convert string to uppercase |
| coerce_to_stripped_string | Strip leading and trailing whitespace from string |
| coerce_to_list | Convert single value to list [value] (preserves None) |
| coerce_empty_to_null | Convert empty strings/lists/dicts to None (useful for nullable fields) |
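As an illustration of how a coercion like coerce_empty_to_null behaves, here is a standalone reimplementation (not the library's own code):

```python
def coerce_empty_to_null(value):
    # Empty strings, lists, and dicts become None; everything else,
    # including 0 and False, passes through unchanged.
    if value in ("", [], {}):
        return None
    return value

print(coerce_empty_to_null(""))       # None
print(coerce_empty_to_null("hello"))  # hello
```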
| Validation | Description | Configuration |
|---|---|---|
| min_length | Minimum length for strings/arrays | {"threshold": N} |
| max_length | Maximum length for strings/arrays | {"threshold": N} |
| only_allow | Only allow specific values | {"allowed_values": [...]} |
| greater_than_or_equal_to | Numeric minimum | {"threshold": N} |
| less_than_or_equal_to | Numeric maximum | {"threshold": N} |
| is_positive | Value must be positive | {"threshold": 0} |
| no_capital_characters | No uppercase letters | null |
| no_special_characters | Only alphanumeric and spaces | null |
| non_empty_string | String must not be empty | null |
| matches_regex | String must match regex pattern | {"pattern": "..."} |
| is_email | String must be valid email address | null |
| is_url | String must be valid URL | null |
| is_alphanumeric | Only alphanumeric characters (no spaces/special) | null |
| is_numeric_string | String must be numeric (digits, optional decimal) | null |
| is_unique | All items in array must be unique | null |
Note: PyCharter extensions (coercion and validations) are optional and can be used alongside standard JSON Schema keywords. All validation logic is stored as data in the JSON schema, making it fully data-driven.
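A minimal sketch of the name-to-function registry pattern this implies: rule names in the schema (data) resolve to functions at model-build time. The registry structure and function names here are illustrative assumptions, not PyCharter's actual internals.

```python
# Hypothetical registry: the schema carries only a rule *name*, which is
# resolved to a callable when the model is built.

COERCIONS = {
    "coerce_to_integer": int,
    "coerce_to_string": str,
}

def resolve_coercion(field_schema):
    """Resolve the coercion named in a field schema; fail fast on unknown names."""
    name = field_schema.get("coercion")
    if name is None:
        return lambda value: value           # no coercion configured
    try:
        return COERCIONS[name]
    except KeyError:
        raise ValueError(f"Unknown coercion: {name!r}") from None

coerce = resolve_coercion({"type": "integer", "coercion": "coerce_to_integer"})
print(coerce("42"))  # 42
```

Registering a custom rule is then just adding a new name-to-function entry, which is what the registration helpers below expose.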
Extend PyCharter with your own coercion and validation functions:
from pycharter.shared.coercions import register_coercion
from pycharter.shared.validations import register_validation
# Register custom coercion
def coerce_to_uppercase(data):
if isinstance(data, str):
return data.upper()
return data
register_coercion("coerce_to_uppercase", coerce_to_uppercase)
# Register custom validation
def must_be_positive(threshold=0):
def _validate(value, info):
if value <= threshold:
raise ValueError(f"Value must be > {threshold}")
return value
return _validate
register_validation("must_be_positive", must_be_positive)
- from_dict(schema: dict, model_name: str = "DynamicModel") - Create model from dictionary
- from_json(json_string: str, model_name: str = "DynamicModel") - Create model from JSON string
- from_file(file_path: str, model_name: str = None) - Create model from JSON file
- from_url(url: str, model_name: str = "DynamicModel") - Create model from URL
- schema_to_model(schema: dict, model_name: str = "DynamicModel") - Low-level model generator
PyCharter is designed to meet the following core requirements:
All schemas must abide by conventional JSON Schema syntax and qualify as valid JSON Schema:
- Uses the jsonschema library for validation, with graceful fallback
All schema information and complex field validation logic is stored as data, not Python code:
- Coercions are referenced by name: "coercion": "coerce_to_integer"
- Validations are configured as data: "validations": {"min_length": {"threshold": 3}}
- Both can be combined: {"coercion": "coerce_to_string", "validations": {"min_length": {"threshold": 3}}}
Models are created dynamically at runtime from JSON schemas:
- Uses pydantic.create_model() to generate models on the fly
- Attaches rules via field_validator decorators
Full support for nested object schemas and complex structures:
Custom fields can be added to JSON Schema to extend functionality:
- coercion: Pre-validation type conversion (e.g., string → integer)
- validations: Post-validation custom rules
Support for both standard and custom field validators:
- Custom validators are registered by name and referenced via the validations field
# Run setup script
./setup.sh
# Activate environment
source venv/bin/activate
# Run tests
pytest
make install-dev # Install package and dev dependencies
make test # Run tests
make format # Format code with black and isort
make lint # Run type checking with mypy
make check # Run all checks (format, lint, test)
# Run all tests
pytest
# Run with coverage
pytest --cov=pycharter --cov-report=html
# Run specific test file
pytest tests/test_converter.py
# Run tests matching a pattern
pytest -k "coercion"
# Update version in pyproject.toml
# Clean previous builds
make clean
# Build package
make build
# Test on TestPyPI
make publish-test
# Publish to PyPI
make publish
PyCharter is fully compliant with JSON Schema Draft 2020-12 standard:
- Extensions (coercion and validations) work alongside standard keywords
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (git checkout -b feature/amazing-feature)
3. Commit your changes (git commit -m 'Add some amazing feature')
4. Push to the branch (git push origin feature/amazing-feature)
5. Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Made with ❤️ for the Python community