
Research
/Security News
Critical Vulnerability in NestJS Devtools: Localhost RCE via Sandbox Escape
A flawed sandbox in @nestjs/devtools-integration lets attackers run code on your machine via CSRF, leading to full Remote Code Execution (RCE).
A simple, LLM-friendly schema definition library for converting string syntax to structured schemas
A simple, LLM-friendly schema definition library for Python that converts intuitive string syntax into structured data schemas.
pip install string-schema
from string_schema import string_to_json_schema
# Define a schema with simple, readable syntax
schema = string_to_json_schema("name:string, email:email, age:int?")
# Get a complete JSON Schema ready for validation
print(schema)
# {
# "type": "object",
# "properties": {
# "name": {"type": "string"},
# "email": {"type": "string", "format": "email"},
# "age": {"type": "integer"}
# },
# "required": ["name", "email"]
# }
from string_schema import string_to_model
# Create Pydantic model from string syntax
UserModel = string_to_model("name:string, email:email, age:int?")
# Use immediately
user = UserModel(name="Alice", email="alice@example.com")
print(user.model_dump_json()) # {"name": "Alice", "email": "alice@example.com"}
String Schema takes human-readable text descriptions and converts them into structured schemas for data validation, extraction, and API documentation. Perfect for LLM data extraction, API development, and configuration validation.
Input: Human-readable string syntax Output: JSON Schema, Pydantic models, or OpenAPI specifications
Forward Conversions (Source → Target):
Function | Input | Output | Use Case |
---|---|---|---|
string_to_json_schema() | String syntax | JSON Schema dict | Main conversion - string to JSON Schema |
string_to_model() | String syntax | Pydantic model class | Direct path - string to Pydantic model |
string_to_model_code() | String syntax | Python code string | Code generation - for templates |
string_to_openapi() | String syntax | OpenAPI schema dict | Direct path - string to OpenAPI |
json_schema_to_model() | JSON Schema dict | Pydantic model class | When you already have JSON Schema |
json_schema_to_openapi() | JSON Schema dict | OpenAPI schema dict | When you already have JSON Schema |
Reverse Conversions (Target → Source):
Function | Input | Output | Use Case |
---|---|---|---|
model_to_string() | Pydantic model | String syntax | Schema introspection - model to string |
model_to_json_schema() | Pydantic model | JSON Schema dict | Export - model to JSON Schema |
json_schema_to_string() | JSON Schema dict | String syntax | Migration - JSON Schema to string |
openapi_to_string() | OpenAPI schema | String syntax | Import - OpenAPI to string |
openapi_to_json_schema() | OpenAPI schema | JSON Schema dict | Conversion - OpenAPI to JSON Schema |
Function | Input | Output | Use Case |
---|---|---|---|
validate_to_dict() | Data + schema | Validated dict | API responses - clean dicts |
validate_to_model() | Data + schema | Pydantic model | Business logic - typed models |
validate_string_syntax() | String syntax | Validation result | Check syntax and get feedback |
Decorator | Purpose | Returns | Use Case |
---|---|---|---|
@returns_dict() | Auto-validate to dict | Validated dict | API endpoints |
@returns_model() | Auto-validate to model | Pydantic model | Business logic |
Function | Purpose | Returns | Use Case |
---|---|---|---|
get_model_info() | Model introspection | Model details dict | Debugging & analysis |
validate_schema_compatibility() | Schema validation | Compatibility info | Schema validation |
from string_schema import validate_to_dict, validate_to_model
# Example raw data (from API, user input, etc.)
raw_data = {
"name": "John Doe",
"email": "john@example.com",
"age": "25", # String that needs conversion
"extra_field": "ignored" # Will be filtered out
}
# Validate to clean dictionaries (perfect for API responses)
user_dict = validate_to_dict(raw_data, "name:string, email:email, age:int?")
print(user_dict) # {"name": "John Doe", "email": "john@example.com", "age": 25}
# Validate to typed models (perfect for business logic)
user_model = validate_to_model(raw_data, "name:string, email:email, age:int?")
print(user_model.name) # "John Doe" - Full type safety
print(user_model.age) # 25 - Converted to int
from string_schema import returns_dict, returns_model
import uuid
# Auto-validate function returns to dicts
@returns_dict("id:string, name:string, active:bool")
def create_user(name):
# Input: "Alice"
# Function returns unvalidated dict
return {"id": str(uuid.uuid4()), "name": name, "active": True, "extra": "ignored"}
# Output: {"id": "123e4567-...", "name": "Alice", "active": True}
# Note: 'extra' field filtered out, types validated
# Auto-validate function returns to models
@returns_model("name:string, email:string")
def process_user(raw_input):
# Input: {"name": "Bob", "email": "bob@test.com", "junk": "data"}
# Function returns unvalidated dict
return {"name": raw_input["name"], "email": raw_input["email"], "junk": "data"}
# Output: UserModel(name="Bob", email="bob@test.com")
# Note: Returns typed Pydantic model, 'junk' field filtered out
# Usage examples:
user_dict = create_user("Alice") # Returns validated dict
user_model = process_user({"name": "Bob", "email": "bob@test.com"}) # Returns Pydantic model
print(user_model.name) # "Bob" - Full type safety
from string_schema import string_to_model, returns_dict
# Create models for FastAPI
UserRequest = string_to_model("name:string, email:email")
@app.post("/users")
@returns_dict("id:int, name:string, email:string")
def create_user_endpoint(user: UserRequest):
return {"id": 123, "name": user.name, "email": user.email}
Features: Arrays [{name:string}]
, nested objects {profile:{bio:text?}}
, enums, constraints, decorators.
from string_schema import string_to_json_schema
# Define schema using intuitive string syntax
schema = string_to_json_schema("""
name:string(min=1, max=100),
email:email,
age:int(0, 120)?,
active:bool
""")
# Result: JSON Schema dictionary ready for validation
print(schema)
# {
# "type": "object",
# "properties": {
# "name": {"type": "string", "minLength": 1, "maxLength": 100},
# "email": {"type": "string", "format": "email"},
# "age": {"type": "integer", "minimum": 0, "maximum": 120},
# "active": {"type": "boolean"}
# },
# "required": ["name", "email", "active"]
# }
# Product with arrays and nested objects
schema = string_to_json_schema("""
{
id:uuid,
name:string,
price:number(min=0),
tags:[string]?,
inventory:{
in_stock:bool,
quantity:int?
}
}
""")
# Result: Complete JSON Schema with nested objects and arrays
print(schema) # Full JSON Schema ready for validation
📚 For more examples and advanced syntax, see our detailed documentation
from string_schema import string_to_model
# Direct conversion: String syntax → Pydantic model
UserModel = string_to_model("name:string, email:email, active:bool")
# Use the model for validation
user = UserModel(name="John Doe", email="john@example.com", active=True)
print(user.model_dump_json()) # {"name": "John Doe", "email": "john@example.com", "active": true}
from string_schema import string_to_model_code
# Generate Pydantic model code as a string
code = string_to_model_code("User", "name:string, email:email, active:bool")
print(code)
# Output:
# from pydantic import BaseModel
# from typing import Optional
#
# class User(BaseModel):
# name: str
# email: str
# active: bool
# Perfect for code generators, templates, or saving to files
with open('models.py', 'w') as f:
f.write(code)
from string_schema import string_to_openapi
# Direct conversion: String syntax → OpenAPI schema
openapi_schema = string_to_openapi("name:string, email:email")
print(openapi_schema)
# {
# "type": "object",
# "properties": {
# "name": {"type": "string"},
# "email": {"type": "string", "format": "email"}
# },
# "required": ["name", "email"]
# }
String Schema provides complete bidirectional conversion between all schema formats!
⚠️ Information Loss Notice: Reverse conversions (from JSON Schema/OpenAPI/Pydantic back to string syntax) may lose some information due to format differences. However, the resulting schemas are designed to cover the most common use cases and maintain functional equivalence for typical validation scenarios.
from string_schema import model_to_string, string_to_model
# Create a model first
UserModel = string_to_model("name:string, email:email, active:bool")
# Reverse engineer it back to string syntax
schema_string = model_to_string(UserModel)
print(f"Model schema: {schema_string}")
# Output: "name:string, email:email, active:bool"
from string_schema import json_schema_to_string
# Example: Convert existing JSON Schema to String Schema syntax
json_schema = {
"type": "object",
"properties": {
"name": {"type": "string"},
"email": {"type": "string", "format": "email"},
"age": {"type": "integer"}
},
"required": ["name", "email"]
}
simple_syntax = json_schema_to_string(json_schema)
print(f"Converted: {simple_syntax}")
# Output: "name:string, email:email, age:int?"
from string_schema import string_to_model, model_to_string
# Create two versions of a model
UserV1 = string_to_model("name:string, email:email")
UserV2 = string_to_model("name:string, email:email, active:bool")
# Compare them
v1_str = model_to_string(UserV1)
v2_str = model_to_string(UserV2)
print("Schema changes:")
print(f"V1: {v1_str}") # "name:string, email:email"
print(f"V2: {v2_str}") # "name:string, email:email, active:bool"
string
, int
, number
, bool
→ Basic data typesemail
, url
, datetime
, date
, uuid
, phone
→ Special validated typesfield_name:type
→ Required fieldfield_name:type?
→ Optional fieldfield_name:type(constraints)
→ Field with validationstring(min=1, max=100)
→ Length constraintsint(0, 120)
→ Range constraints[string]
→ Simple arrays[{name:string, email:email}]
→ Object arraysstatus:enum(active, inactive)
→ Enum valuesid:string|uuid
→ Union types📖 Complete syntax guide: See docs/string-syntax.md for full reference
from string_schema import validate_string_syntax
# Validate your schema syntax
result = validate_string_syntax("name:string, email:email, age:int?")
print(f"Valid: {result['valid']}") # True
print(f"Features used: {result['features_used']}") # ['basic_types', 'optional_fields']
print(f"Field count: {len(result['parsed_fields'])}") # 3
# Example with invalid syntax
bad_result = validate_string_syntax("name:invalid_type")
print(f"Valid: {bad_result['valid']}") # False
print(f"Errors: {bad_result['errors']}") # ['Unknown type: invalid_type']
from string_schema import string_to_json_schema
# Define what data to extract
schema = string_to_json_schema("company:string, employees:[{name:string, email:email}], founded:int?")
# Use schema in LLM prompts for structured data extraction
from string_schema import string_to_model
# Create model for FastAPI
UserModel = string_to_model("name:string, email:email")
# Use in FastAPI endpoints
@app.post("/users/")
async def create_user(user: UserModel):
return {"id": 123, "name": user.name, "email": user.email}
from string_schema import string_to_model_code
# Generate model code
code = string_to_model_code("User", "name:string, email:email")
print(code)
# Output: Complete Pydantic model class as string
# Save to file
with open('user_model.py', 'w') as f:
f.write(code)
from string_schema import string_to_json_schema
# Validate app configuration
config_schema = string_to_json_schema("database:{host:string, port:int}, debug:bool")
# Use for validating config files
Ready-to-use schema examples are available in the examples/
directory:
# Import example schemas if needed
from string_schema.examples.presets import user_schema, product_schema
from string_schema.examples.recipes import create_ecommerce_product_schema
# Or better yet, use string syntax directly:
user_schema = string_to_json_schema("name:string, email:email, age:int?")
The library includes comprehensive tests covering all functionality:
# Run tests (requires pytest)
pip install pytest
pytest tests/
# Results: 66 tests passed, 3 skipped (Pydantic tests when not installed)
Contributions are welcome! The codebase is well-organized and documented:
string_schema/
├── core/ # Core functionality (fields, builders, validators)
├── parsing/ # String parsing and syntax
├── integrations/ # Pydantic, JSON Schema, OpenAPI
└── examples/ # Built-in schemas and recipes
This project is licensed under the MIT License - see the LICENSE file for details.
String Schema - Making data validation simple, intuitive, and LLM-friendly! 🚀
FAQs
A simple, LLM-friendly schema definition library for converting string syntax to structured schemas
We found that string-schema demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Research
/Security News
A flawed sandbox in @nestjs/devtools-integration lets attackers run code on your machine via CSRF, leading to full Remote Code Execution (RCE).
Product
Customize license detection with Socket’s new license overlays: gain control, reduce noise, and handle edge cases with precision.
Product
Socket now supports Rust and Cargo, offering package search for all users and experimental SBOM generation for enterprise projects.