String Schema
A simple, LLM-friendly schema definition library for Python that converts intuitive string syntax into structured data schemas.
🚀 Quick Start
Installation
pip install string-schema
30-Second Example
from string_schema import string_to_json_schema
schema = string_to_json_schema("name:string, email:email, age:int?")
print(schema)
Create Pydantic Models Directly
from string_schema import string_to_model
UserModel = string_to_model("name:string, email:email, age:int?")
user = UserModel(name="Alice", email="alice@example.com")
print(user.model_dump_json())
🎯 What String Schema Does
String Schema takes human-readable text descriptions and converts them into structured schemas for data validation, extraction, and API documentation. It is aimed at LLM data extraction, API development, and configuration validation.
Input: Human-readable string syntax
Output: JSON Schema, Pydantic models, or OpenAPI specifications
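To make the input-to-output mapping concrete, here is a minimal sketch of how a flat field string could be turned into a JSON Schema dict. It is an illustration only, not the library's implementation: `sketch_to_json_schema` and `TYPE_MAP` are hypothetical names, and the sketch assumes only flat `name:type` fields with an optional trailing `?`.

```python
# Illustrative sketch only: NOT the string_schema implementation.
# Handles flat "name:type" fields with an optional trailing "?".

# Assumed mapping from string-syntax type names to JSON Schema fragments.
TYPE_MAP = {
    "string": {"type": "string"},
    "int": {"type": "integer"},
    "number": {"type": "number"},
    "bool": {"type": "boolean"},
    "email": {"type": "string", "format": "email"},
}


def sketch_to_json_schema(spec: str) -> dict:
    """Convert a flat 'name:type, name:type?' string into a JSON Schema dict."""
    properties, required = {}, []
    for part in spec.split(","):
        name, _, type_name = part.strip().partition(":")
        optional = type_name.endswith("?")  # trailing "?" marks an optional field
        type_name = type_name.rstrip("?").strip()
        properties[name.strip()] = dict(TYPE_MAP[type_name])
        if not optional:
            required.append(name.strip())
    return {"type": "object", "properties": properties, "required": required}


schema = sketch_to_json_schema("name:string, email:email, age:int?")
print(schema["required"])  # ['name', 'email']
```

The real `string_to_json_schema()` also accepts nested objects, arrays, and constraints, which this sketch deliberately leaves out.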
🚀 Core Functions & Use Cases
📝 Schema Conversion Matrix
Forward Conversions (Source → Target):
Function | Input | Output | Use Case |
string_to_json_schema() | String syntax | JSON Schema dict | Main conversion - string to JSON Schema |
string_to_model() | String syntax | Pydantic model class | Direct path - string to Pydantic model |
string_to_model_code() | String syntax | Python code string | Code generation - for templates |
string_to_openapi() | String syntax | OpenAPI schema dict | Direct path - string to OpenAPI |
json_schema_to_model() | JSON Schema dict | Pydantic model class | When you already have JSON Schema |
json_schema_to_openapi() | JSON Schema dict | OpenAPI schema dict | When you already have JSON Schema |
Reverse Conversions (Target → Source):
Function | Input | Output | Use Case |
model_to_string() | Pydantic model | String syntax | Schema introspection - model to string |
model_to_json_schema() | Pydantic model | JSON Schema dict | Export - model to JSON Schema |
json_schema_to_string() | JSON Schema dict | String syntax | Migration - JSON Schema to string |
openapi_to_string() | OpenAPI schema | String syntax | Import - OpenAPI to string |
openapi_to_json_schema() | OpenAPI schema | JSON Schema dict | Conversion - OpenAPI to JSON Schema |
🔍 Data Validation Functions
Function | Input | Output | Use Case
validate_to_dict() | Data + schema | Validated dict | API responses - clean dicts |
validate_to_model() | Data + schema | Pydantic model | Business logic - typed models |
validate_string_syntax() | String syntax | Validation result | Check syntax and get feedback |
🎨 Function Decorators
Decorator | Behavior | Returns | Use Case
@returns_dict() | Auto-validate to dict | Validated dict | API endpoints |
@returns_model() | Auto-validate to model | Pydantic model | Business logic |
🔧 Utility Functions
Function | Purpose | Returns | Use Case
get_model_info() | Model introspection | Model details dict | Debugging & analysis |
validate_schema_compatibility() | Schema validation | Compatibility info | Schema validation |
🎯 Key Scenarios
- 🤖 LLM Data Extraction: Define extraction schemas that LLMs can easily follow
- 🔧 API Development: Generate Pydantic models and OpenAPI docs from simple syntax
- ✅ Data Validation: Create robust validation schemas with minimal code
- 📋 Configuration: Define and validate application configuration schemas
- 🔄 Data Transformation: Convert between different schema formats
✅ Validate Data
from string_schema import validate_to_dict, validate_to_model
raw_data = {
"name": "John Doe",
"email": "john@example.com",
"age": "25",
"extra_field": "ignored"
}
user_dict = validate_to_dict(raw_data, "name:string, email:email, age:int?")
print(user_dict)
user_model = validate_to_model(raw_data, "name:string, email:email, age:int?")
print(user_model.name)
print(user_model.age)
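The example above passes `age` as the string `"25"` together with an undeclared `extra_field`. The sketch below shows, in plain Python, the kind of cleanup this suggests: coerce declared fields and drop undeclared ones. It is a hand-rolled stand-in under those assumptions, not the library's code; `FIELDS` and `sketch_validate` are hypothetical names.

```python
# Illustrative sketch only: a stand-in for the behaviour the example suggests.

# Hypothetical field table: name -> (converter, required).
FIELDS = {
    "name": (str, True),
    "email": (str, True),
    "age": (int, False),
}


def sketch_validate(data: dict) -> dict:
    """Keep declared fields, coerce their types, and reject missing required ones."""
    cleaned = {}
    for name, (convert, required) in FIELDS.items():
        if name not in data:
            if required:
                raise ValueError(f"missing required field: {name}")
            continue  # optional field that was not supplied
        cleaned[name] = convert(data[name])  # e.g. "25" -> 25
    return cleaned


result = sketch_validate(
    {"name": "John Doe", "email": "john@example.com", "age": "25", "extra_field": "ignored"}
)
print(result)  # {'name': 'John Doe', 'email': 'john@example.com', 'age': 25}
```

Format checks such as email validation are omitted here; the sketch only covers type coercion and field filtering.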
🎨 Function Decorators
from string_schema import returns_dict, returns_model
import uuid
@returns_dict("id:string, name:string, active:bool")
def create_user(name):
    return {"id": str(uuid.uuid4()), "name": name, "active": True, "extra": "ignored"}

@returns_model("name:string, email:string")
def process_user(raw_input):
    return {"name": raw_input["name"], "email": raw_input["email"], "junk": "data"}

user_dict = create_user("Alice")
user_model = process_user({"name": "Bob", "email": "bob@test.com"})
print(user_model.name)
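The decorator pattern used above can be sketched with the standard library alone. This is one possible shape for a return-value filter, not how `@returns_dict` is actually implemented: `sketch_returns_dict` is a hypothetical name, and it takes plain field names instead of a schema string.

```python
import functools


def sketch_returns_dict(*allowed_fields):
    """Hypothetical decorator: keep only the declared keys of a returned dict."""
    def decorator(func):
        @functools.wraps(func)  # preserve the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            missing = [f for f in allowed_fields if f not in result]
            if missing:
                raise ValueError(f"missing fields: {missing}")
            # Undeclared keys such as "extra" are dropped here.
            return {f: result[f] for f in allowed_fields}
        return wrapper
    return decorator


@sketch_returns_dict("id", "name", "active")
def create_user(name):
    return {"id": "u-1", "name": name, "active": True, "extra": "ignored"}


print(create_user("Alice"))  # {'id': 'u-1', 'name': 'Alice', 'active': True}
```

Keeping the check in a decorator means the function body stays focused on building the data, while the shape of the return value is declared once at the definition site.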
🌐 FastAPI Integration
from fastapi import FastAPI
from string_schema import string_to_model, returns_dict
app = FastAPI()
UserRequest = string_to_model("name:string, email:email")
@app.post("/users")
@returns_dict("id:int, name:string, email:string")
def create_user_endpoint(user: UserRequest):
    return {"id": 123, "name": user.name, "email": user.email}
Features: arrays [{name:string}], nested objects {profile:{bio:text?}}, enums, constraints, decorators.
📖 Complete Documentation
🎓 More Examples
🌱 Simple Example - Basic User Data
from string_schema import string_to_json_schema
schema = string_to_json_schema("""
name:string(min=1, max=100),
email:email,
age:int(0, 120)?,
active:bool
""")
print(schema)
🌳 Moderately Complex Example - Product Data
schema = string_to_json_schema("""
{
id:uuid,
name:string,
price:number(min=0),
tags:[string]?,
inventory:{
in_stock:bool,
quantity:int?
}
}
""")
print(schema)
📚 For more examples and advanced syntax, see our detailed documentation
🔄 Output Formats & Results
🐍 Pydantic Models (Python Classes)
from string_schema import string_to_model
UserModel = string_to_model("name:string, email:email, active:bool")
user = UserModel(name="John Doe", email="john@example.com", active=True)
print(user.model_dump_json())
🔧 Code Generation (For Templates & Tools)
from string_schema import string_to_model_code
code = string_to_model_code("User", "name:string, email:email, active:bool")
print(code)
with open('models.py', 'w') as f:
    f.write(code)
🌐 OpenAPI Schemas (API Documentation)
from string_schema import string_to_openapi
openapi_schema = string_to_openapi("name:string, email:email")
print(openapi_schema)
🔄 Reverse Conversions (Universal Schema Converter)
String Schema converts in both directions between string syntax, JSON Schema, Pydantic models, and OpenAPI schemas.
⚠️ Information Loss Notice: Reverse conversions (from JSON Schema/OpenAPI/Pydantic back to string syntax) may lose some information due to format differences. However, the resulting schemas are designed to cover the most common use cases and maintain functional equivalence for typical validation scenarios.
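A small sketch can show where such loss comes from. The toy converter below is not `json_schema_to_string()`; `sketch_to_string` and `REVERSE_TYPES` are hypothetical names, and it assumes a flat object schema. Standard JSON Schema keywords such as `description` and `default` have no slot in a bare `name:type` pair, so they disappear in the output.

```python
# Illustrative sketch only: a toy reverse converter, not json_schema_to_string().

# Assumed reverse mapping from JSON Schema types to string-syntax names.
REVERSE_TYPES = {"string": "string", "integer": "int", "number": "number", "boolean": "bool"}


def sketch_to_string(json_schema: dict) -> str:
    """Render a flat JSON Schema object as 'name:type' pairs."""
    required = set(json_schema.get("required", []))
    parts = []
    for name, prop in json_schema["properties"].items():
        suffix = "" if name in required else "?"
        parts.append(f"{name}:{REVERSE_TYPES[prop['type']]}{suffix}")
    return ", ".join(parts)


schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "description": "Full legal name"},
        "age": {"type": "integer", "default": 18},
    },
    "required": ["name"],
}
print(sketch_to_string(schema))  # name:string, age:int?
```

The field names, types, and required/optional status survive the round trip; the description and default value do not.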
🔍 Schema Introspection
from string_schema import model_to_string, string_to_model
UserModel = string_to_model("name:string, email:email, active:bool")
schema_string = model_to_string(UserModel)
print(f"Model schema: {schema_string}")
📦 Migration & Import
from string_schema import json_schema_to_string
json_schema = {
"type": "object",
"properties": {
"name": {"type": "string"},
"email": {"type": "string", "format": "email"},
"age": {"type": "integer"}
},
"required": ["name", "email"]
}
simple_syntax = json_schema_to_string(json_schema)
print(f"Converted: {simple_syntax}")
🔧 Schema Comparison & Analysis
from string_schema import string_to_model, model_to_string
UserV1 = string_to_model("name:string, email:email")
UserV2 = string_to_model("name:string, email:email, active:bool")
v1_str = model_to_string(UserV1)
v2_str = model_to_string(UserV2)
print("Schema changes:")
print(f"V1: {v1_str}")
print(f"V2: {v2_str}")
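Printing both versions leaves the comparison to the reader. As a possible next step, the sketch below computes the difference between two flat schema strings. The exact format returned by `model_to_string()` is not shown here, so this assumes comma-separated `name:type` pairs; `field_map` and `diff_specs` are hypothetical helpers.

```python
def field_map(spec: str) -> dict:
    """Split a flat 'name:type, name:type' string into {name: type}."""
    pairs = (part.strip().partition(":") for part in spec.split(","))
    return {name.strip(): type_name.strip() for name, _, type_name in pairs}


def diff_specs(old: str, new: str) -> dict:
    """Report fields that were added, removed, or changed type between two specs."""
    old_fields, new_fields = field_map(old), field_map(new)
    return {
        "added": sorted(new_fields.keys() - old_fields.keys()),
        "removed": sorted(old_fields.keys() - new_fields.keys()),
        "changed": sorted(
            name for name in old_fields.keys() & new_fields.keys()
            if old_fields[name] != new_fields[name]
        ),
    }


changes = diff_specs("name:string, email:email", "name:string, email:email, active:bool")
print(changes)  # {'added': ['active'], 'removed': [], 'changed': []}
```

Nested objects and constraints containing commas would need a real parser; this split-on-comma approach only works for flat schemas.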
🎨 String Syntax Reference
Basic Types
string, int, number, bool → Basic data types
email, url, datetime, date, uuid, phone → Special validated types
Field Modifiers
field_name:type → Required field
field_name:type? → Optional field
field_name:type(constraints) → Field with validation
Common Patterns
string(min=1, max=100) → Length constraints
int(0, 120) → Range constraints
[string] → Simple arrays
[{name:string, email:email}] → Object arrays
status:enum(active, inactive) → Enum values
id:string|uuid → Union types
📖 Complete syntax guide: See docs/string-syntax.md for full reference
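For readers who think in JSON Schema terms, the table below pairs each common pattern with the standard JSON Schema keywords it plausibly corresponds to. The keywords themselves (`minLength`, `maximum`, `items`, `enum`, `anyOf`) are standard; the exact fragments the library emits are an assumption here and may differ.

```python
# Assumed correspondence between string-syntax patterns and JSON Schema keywords.
# The library's real output may use different or additional keywords.
EQUIVALENTS = {
    "string(min=1, max=100)": {"type": "string", "minLength": 1, "maxLength": 100},
    "int(0, 120)": {"type": "integer", "minimum": 0, "maximum": 120},
    "[string]": {"type": "array", "items": {"type": "string"}},
    "enum(active, inactive)": {"type": "string", "enum": ["active", "inactive"]},
    "string|uuid": {"anyOf": [{"type": "string"}, {"type": "string", "format": "uuid"}]},
}

for pattern, fragment in EQUIVALENTS.items():
    print(f"{pattern:24} -> {fragment}")
```

Comparing this table against the actual output of `string_to_json_schema()` is a quick way to confirm how each constraint is encoded.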
✅ Validation
from string_schema import validate_string_syntax
result = validate_string_syntax("name:string, email:email, age:int?")
print(f"Valid: {result['valid']}")
print(f"Features used: {result['features_used']}")
print(f"Field count: {len(result['parsed_fields'])}")
bad_result = validate_string_syntax("name:invalid_type")
print(f"Valid: {bad_result['valid']}")
print(f"Errors: {bad_result['errors']}")
🏗️ Common Use Cases
from string_schema import string_to_json_schema
schema = string_to_json_schema("company:string, employees:[{name:string, email:email}], founded:int?")
🔧 FastAPI Development
from fastapi import FastAPI
from string_schema import string_to_model
app = FastAPI()
UserModel = string_to_model("name:string, email:email")
@app.post("/users/")
async def create_user(user: UserModel):
    return {"id": 123, "name": user.name, "email": user.email}
🏗️ Code Generation & Templates
from string_schema import string_to_model_code
code = string_to_model_code("User", "name:string, email:email")
print(code)
with open('user_model.py', 'w') as f:
    f.write(code)
📋 Configuration Validation
from string_schema import string_to_json_schema
config_schema = string_to_json_schema("database:{host:string, port:int}, debug:bool")
📚 Documentation
📋 Example Schemas
Ready-to-use schema examples are available in the examples/ directory:
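from string_schema import string_to_json_schema
from string_schema.examples.presets import user_schema, product_schema
from string_schema.examples.recipes import create_ecommerce_product_schema
# Or define your own schema inline (named so it does not shadow the imported preset)
my_user_schema = string_to_json_schema("name:string, email:email, age:int?")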
🧪 Testing
The library ships with a test suite. To run it:
pip install pytest
pytest tests/
🤝 Contributing
Contributions are welcome! The codebase is organized as follows:
string_schema/
├── core/ # Core functionality (fields, builders, validators)
├── parsing/ # String parsing and syntax
├── integrations/ # Pydantic, JSON Schema, OpenAPI
└── examples/ # Built-in schemas and recipes
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
String Schema - Making data validation simple, intuitive, and LLM-friendly! 🚀