
python-toon
TOON (Token-Oriented Object Notation) encoder/decoder for Python - Bidirectional JSON-to-TOON converter optimized for LLMs
Token-Oriented Object Notation for Python
A compact data format optimized for transmitting structured information to Large Language Models (LLMs) with 30-60% fewer tokens than JSON.
pip install python-toon
TOON (Token-Oriented Object Notation) combines YAML's indentation-based structure for nested objects and CSV's tabular format for uniform data rows, optimized specifically for token efficiency in LLM contexts.
This is a faithful Python implementation maintaining 100% output compatibility with the official TOON specification.
Arrays carry an explicit length [N] for validation.
from toon import encode
# Simple object
data = {"name": "Alice", "age": 30}
print(encode(data))
# Output:
# name: Alice
# age: 30
# Tabular array (uniform objects)
users = [
{"id": 1, "name": "Alice", "age": 30},
{"id": 2, "name": "Bob", "age": 25},
{"id": 3, "name": "Charlie", "age": 35},
]
print(encode(users))
# Output:
# [3,]{id,name,age}:
#   1,Alice,30
#   2,Bob,25
#   3,Charlie,35
# Complex nested structure
data = {
"metadata": {"version": 1, "author": "test"},
"items": [
{"id": 1, "name": "Item1"},
{"id": 2, "name": "Item2"},
],
"tags": ["alpha", "beta", "gamma"],
}
print(encode(data))
# Output:
# metadata:
#   version: 1
#   author: test
# items[2,]{id,name}:
#   1,Item1
#   2,Item2
# tags[3]: alpha,beta,gamma
Command-line tool for converting between JSON and TOON formats.
# Encode JSON to TOON (auto-detected by .json extension)
toon input.json -o output.toon
# Decode TOON to JSON (auto-detected by .toon extension)
toon data.toon -o output.json
# Use stdin/stdout
echo '{"name": "Ada"}' | toon -
# Output: name: Ada
# Force encode mode
toon data.json --encode
# Force decode mode
toon data.toon --decode
# Custom delimiter
toon data.json --delimiter "\t" -o output.toon
# With length markers
toon data.json --length-marker -o output.toon
# Lenient decoding (disable strict validation)
toon data.toon --no-strict -o output.json
| Option | Description |
|---|---|
| `-o, --output <file>` | Output file path (prints to stdout if omitted) |
| `-e, --encode` | Force encode mode (overrides auto-detection) |
| `-d, --decode` | Force decode mode (overrides auto-detection) |
| `--delimiter <char>` | Array delimiter: `,` (comma), `\t` (tab), `\|` (pipe) |
| `--indent <number>` | Indentation size (default: 2) |
| `--length-marker` | Add `#` prefix to array lengths (e.g., `items[#3]`) |
| `--no-strict` | Disable strict validation when decoding |
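The auto-detection described above can be pictured as a small dispatch on the file extension. The sketch below is an illustration only, not the CLI's actual code: `detect_mode` and its parameters are hypothetical names, and the fallback to encoding for stdin (`-`) and unknown extensions is an assumption suggested by the stdin example, which encodes.

```python
from pathlib import Path


def detect_mode(path: str, force_encode: bool = False, force_decode: bool = False) -> str:
    """Return "encode" or "decode" for an input path (hypothetical helper)."""
    if force_encode and force_decode:
        raise ValueError("choose only one of --encode / --decode")
    # Explicit flags override auto-detection.
    if force_encode:
        return "encode"
    if force_decode:
        return "decode"
    # A .toon input is decoded to JSON.
    if Path(path).suffix.lower() == ".toon":
        return "decode"
    # Assumption: .json, stdin ("-"), and unknown extensions are encoded.
    return "encode"
```

With this logic, `detect_mode("data.toon", force_encode=True)` returns `"encode"`, mirroring how `--encode` overrides the extension.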
encode(value, options=None)
Converts a Python value to TOON format.
Parameters:
- value (Any): JSON-serializable value to encode
- options (dict, optional): Encoding options
Returns: str - TOON-formatted string
Example:
from toon import encode
data = {"id": 123, "name": "Ada"}
toon_str = encode(data)
print(toon_str)
# Output:
# id: 123
# name: Ada
decode(input_str, options=None)
Converts a TOON-formatted string back to Python values.
Parameters:
- input_str (str): TOON-formatted string to parse
- options (DecodeOptions, optional): Decoding options
Returns: Python value (dict, list, or primitive)
Example:
from toon import decode
toon_str = """items[2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5"""
data = decode(toon_str)
print(data)
# Output: {'items': [{'sku': 'A1', 'qty': 2, 'price': 9.99}, {'sku': 'B2', 'qty': 1, 'price': 14.5}]}
from toon import encode
encode(data, {
"indent": 2, # Spaces per indentation level (default: 2)
"delimiter": ",", # Delimiter for arrays: "," | "\t" | "|" (default: ",")
"lengthMarker": "#" # Optional marker prefix: "#" | False (default: False)
})
from toon import decode, DecodeOptions
options = DecodeOptions(
indent=2, # Expected number of spaces per indentation level (default: 2)
strict=True # Enable strict validation (default: True)
)
data = decode(toon_str, options)
Strict Mode:
By default, the decoder validates input strictly. For example, it rejects:
- Invalid escape sequences such as "\x", and unterminated strings
Set strict=False to allow lenient parsing.
You can use string literals directly:
data = [1, 2, 3, 4, 5]
# Comma (default)
print(encode(data))
# [5]: 1,2,3,4,5
# Tab
print(encode(data, {"delimiter": "\t"}))
# [5\t]: 1\t2\t3\t4\t5 (\t stands for a literal tab character)
# Pipe
print(encode(data, {"delimiter": "|"}))
# [5|]: 1|2|3|4|5
Or use the string keys:
encode(data, {"delimiter": "comma"}) # Default
encode(data, {"delimiter": "tab"}) # Tab-separated
encode(data, {"delimiter": "pipe"}) # Pipe-separated
Add the # prefix to array length indicators:
users = [
{"id": 1, "name": "Alice"},
{"id": 2, "name": "Bob"},
]
# Without marker (default)
print(encode(users))
# [2,]{id,name}:
#   1,Alice
#   2,Bob
# With marker
print(encode(users, {"lengthMarker": "#"}))
# [#2,]{id,name}:
#   1,Alice
#   2,Bob
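To make the header variants concrete, here is a minimal sketch that builds the bracket header from its parts. It is not the library's implementation: `array_header` and its parameters are hypothetical, and joining field names with the active delimiter is an assumption (the examples here only show comma-separated fields).

```python
from typing import Optional, Sequence


def array_header(
    length: int,
    delimiter: str = ",",
    fields: Optional[Sequence[str]] = None,
    key: str = "",
    length_marker: bool = False,
) -> str:
    """Build a TOON-style array header line (illustrative helper)."""
    marker = "#" if length_marker else ""
    if fields is not None:
        # Tabular arrays always show the delimiter inside the bracket.
        # Assumption: field names are joined with the same delimiter.
        return f"{key}[{marker}{length}{delimiter}]{{{delimiter.join(fields)}}}:"
    # Primitive arrays hide the default comma but show other delimiters.
    shown = "" if delimiter == "," else delimiter
    return f"{key}[{marker}{length}{shown}]:"


print(array_header(2, fields=["id", "name"]))                      # [2,]{id,name}:
print(array_header(2, fields=["id", "name"], length_marker=True))  # [#2,]{id,name}:
print(array_header(3, key="tags"))                                 # tags[3]:
```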
Key-value pairs with primitives or nested structures:
{"name": "Alice", "age": 30}
# =>
# name: Alice
# age: 30
Arrays always include length [N]:
[1, 2, 3, 4, 5]
# => [5]: 1,2,3,4,5
["alpha", "beta", "gamma"]
# => [3]: alpha,beta,gamma
Uniform objects with identical primitive-only fields use CSV-like format:
[
{"id": 1, "name": "Alice"},
{"id": 2, "name": "Bob"},
]
# =>
# [2,]{id,name}:
#   1,Alice
#   2,Bob
Note: The delimiter appears in the length bracket [2,] for tabular arrays.
Non-uniform data uses list format with - markers:
[{"name": "Alice"}, 42, "hello"]
# =>
# [3]:
#   - name: Alice
#   - 42
#   - hello
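The three array layouts above (inline primitives, tabular rows, and `-` list items) suggest a simple classification step before encoding. The following is a rough sketch of that decision, not the library's code: `classify_array` is a hypothetical name, and comparing key sets while ignoring key order is an assumption.

```python
from typing import Any, List

PRIMITIVES = (str, int, float, bool, type(None))


def classify_array(items: List[Any]) -> str:
    """Return "primitive", "tabular", or "list" for a Python list."""
    # All primitives: can be written inline, e.g. [3]: a,b,c
    # (an empty list also lands here, since all() of nothing is True).
    if all(isinstance(x, PRIMITIVES) for x in items):
        return "primitive"
    # All dicts with identical, primitive-only fields: CSV-like rows.
    if all(isinstance(x, dict) for x in items):
        first_keys = set(items[0].keys())
        uniform = all(
            set(x.keys()) == first_keys
            and all(isinstance(v, PRIMITIVES) for v in x.values())
            for x in items
        )
        if uniform:
            return "tabular"
    # Anything else falls back to the "-" list format.
    return "list"
```

For instance, `classify_array([{"name": "Alice"}, 42, "hello"])` returns `"list"`, matching the mixed example above.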
The length bracket format depends on the array type:
Tabular arrays (with fields):
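- [2,]{fields}: or [2|]{fields}: or [2\t]{fields}:
Primitive arrays (no fields):
- Comma: [3]: (delimiter hidden)
- Pipe or tab: [3|]: or [3\t]: (delimiter shown)
Strings are quoted only when necessary (following the TOON specification):
- Empty strings, and strings with leading whitespace
- Keywords: null, true, false
- Number-like strings: 42, -3.14
- Strings containing structural characters or a delimiter (:, [, ], {, }, -, ", ,, |, or tab)
"hello" # => hello (no quotes)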
"hello world" # => hello world (internal spaces OK)
" hello" # => " hello" (leading space requires quotes)
"null" # => "null" (keyword)
"42" # => "42" (looks like number)
"" # => "" (empty)
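The quoting rules above can be approximated with a short predicate. This is a sketch under stated assumptions rather than the library's actual check: `needs_quotes` is a hypothetical name, only the active delimiter is tested, and a hyphen is treated as significant only at the start of the string (where it could be read as a list marker).

```python
import re

KEYWORDS = {"null", "true", "false"}
NUMBER_RE = re.compile(r"-?\d+(\.\d+)?([eE][+-]?\d+)?")
STRUCTURAL = set(':[]{}"')


def needs_quotes(text: str, delimiter: str = ",") -> bool:
    """Return True if a string value should be quoted (illustrative)."""
    if text == "":
        return True
    if text != text.strip():        # leading or trailing whitespace
        return True
    if text in KEYWORDS:            # would be read back as null/true/false
        return True
    if NUMBER_RE.fullmatch(text):   # would be read back as a number
        return True
    if text.startswith("-"):        # assumption: only a leading hyphen matters
        return True
    if delimiter in text:           # assumption: only the active delimiter
        return True
    return any(ch in STRUCTURAL for ch in text)


for sample in ["hello", "hello world", " hello", "null", "42", ""]:
    print(repr(sample), needs_quotes(sample))
```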
Non-JSON types are normalized automatically:
Values without a JSON equivalent are emitted as null or 0, depending on the type.
When using TOON with LLMs:
Wrap in code blocks for clarity:
```toon
name: Alice
age: 30
```
Instruct the model about the format:
"Respond using TOON format (Token-Oriented Object Notation). Use `key: value` syntax, indentation for nesting, and tabular format `[N,]{fields}:` for uniform arrays."
Leverage length markers for validation:
encode(data, {"lengthMarker": "#"})
Tell the model: "Array lengths are marked with [#N]. Ensure your response matches these counts."
Acknowledge tokenizer variance: Token savings depend on the specific tokenizer and model being used.
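The length-marker tip can also be checked mechanically once a model replies. The sketch below compares each declared `[N]` or `[#N]` with the number of rows or inline values that follow. It is a simplified illustration, not part of the library: `check_lengths` and `HEADER_RE` are hypothetical names, quoted values that contain the delimiter are not handled, and only lines at the first child indentation level are counted.

```python
import re
from typing import List, Tuple

# Matches headers such as: users[#2,]{id,name}:   or   tags[3]: a,b,c
HEADER_RE = re.compile(
    r"^(?P<indent>\s*)(?P<key>[^\[\s]*)\[#?(?P<n>\d+)(?P<delim>[,|\t]?)\]"
    r"(?P<fields>\{[^}]*\})?:(?P<rest>.*)$"
)


def check_lengths(toon_text: str) -> List[Tuple[str, int, int]]:
    """Return (header, declared, actual) for every mismatching array."""
    lines = toon_text.splitlines()
    mismatches = []
    for i, line in enumerate(lines):
        m = HEADER_RE.match(line)
        if m is None:
            continue
        declared = int(m["n"])
        rest = m["rest"].strip()
        if rest:
            # Inline primitive array: count delimited values.
            actual = len(rest.split(m["delim"] or ","))
        else:
            # Block array: count lines at the first child indentation.
            depth = len(m["indent"])
            child_depth = None
            actual = 0
            for child in lines[i + 1:]:
                if not child.strip():
                    continue
                indent = len(child) - len(child.lstrip(" "))
                if indent <= depth:
                    break
                if child_depth is None:
                    child_depth = indent
                if indent == child_depth:
                    actual += 1
        if actual != declared:
            mismatches.append((line.strip(), declared, actual))
    return mismatches
```

An empty result means every declared count matched. A reply that declares `users[#3,]{id,name}:` but supplies two rows is reported as `("users[#3,]{id,name}:", 3, 2)`.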
import json
from toon import encode
data = {
"users": [
{"id": 1, "name": "Alice", "age": 30, "active": True},
{"id": 2, "name": "Bob", "age": 25, "active": True},
{"id": 3, "name": "Charlie", "age": 35, "active": False},
]
}
json_str = json.dumps(data)
toon_str = encode(data)
print(f"JSON: {len(json_str)} characters")
print(f"TOON: {len(toon_str)} characters")
print(f"Reduction: {100 * (1 - len(toon_str) / len(json_str)):.1f}%")
# Output:
# JSON: 177 characters
# TOON: 85 characters
# Reduction: 52.0%
JSON output:
{"users": [{"id": 1, "name": "Alice", "age": 30, "active": true}, {"id": 2, "name": "Bob", "age": 25, "active": true}, {"id": 3, "name": "Charlie", "age": 35, "active": false}]}
TOON output:
users[3,]{id,name,age,active}:
  1,Alice,30,true
  2,Bob,25,true
  3,Charlie,35,false
This project uses uv for fast, reliable package and environment management.
# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone the repository
git clone https://github.com/toon-format/toon-python.git
cd toon-python
# Create virtual environment and install dependencies
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install package in editable mode with dev dependencies
uv pip install -e ".[dev]"
# Clone the repository
git clone https://github.com/toon-format/toon-python.git
cd toon-python
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install in development mode
pip install -e .
# Install development dependencies
pip install -r requirements-dev.txt
# Run all tests
pytest
# Run with coverage
pytest --cov=toon --cov-report=term
# Type checking
mypy src/toon
# Linting
ruff check src/toon tests
This project is a Python implementation of the TOON format.
MIT License - see LICENSE file for details
Contributions are welcome! Please feel free to submit a Pull Request.
When contributing, please:
For bugs and feature requests, please open an issue.
We found that python-toon demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.