MongoDB TypeGen
A command-line tool to connect to a MongoDB database, inspect collections, and automatically generate Python TypedDict
models. This brings static type checking to your MongoDB documents, helping you write safer, more maintainable code.
Key Features
- Automatic Schema Inference: Analyzes documents to infer field names and types.
- Robust TypedDict Generation: Creates Python TypedDict classes for each collection.
- Powerful CLI: Offers multiple commands (generate, list-collections, preview) for a streamlined workflow.
- Handles Complex Schemas:
  - Recursively generates TypedDict classes for nested documents.
  - Identifies optional fields and fields that hold more than one type (emitted as Union).
  - Preserves original field names, even those with spaces or special characters.
- Flexible & Configurable:
  - Filter for specific collections to include or exclude.
  - Customize the number of documents to sample for schema inference.
  - Perform a "dry run" to see generated code without writing to a file.
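To make the inference features above concrete, here is a minimal sketch of how sampled documents could be turned into per-field annotations. It is not the tool's actual implementation: the helper names `type_name` and `infer_fields` are hypothetical, only a handful of value types are covered, and rendering optional fields as `Optional[...]` is an assumption about the output style.

```python
from typing import Any, Dict, List, Set


def type_name(value: Any) -> str:
    """Map a sample value to a Python type name (illustrative subset only)."""
    if value is None:
        return "None"
    if isinstance(value, bool):  # bool must be tested before int (bool subclasses int)
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "str"
    if isinstance(value, dict):
        return "dict"
    if isinstance(value, list):
        return "list"
    return "Any"


def infer_fields(documents: List[Dict[str, Any]]) -> Dict[str, str]:
    """Infer one annotation string per field from a list of sampled documents."""
    seen: Dict[str, Set[str]] = {}   # field name -> type names observed
    counts: Dict[str, int] = {}      # field name -> number of documents containing it
    for doc in documents:
        for key, value in doc.items():
            seen.setdefault(key, set()).add(type_name(value))
            counts[key] = counts.get(key, 0) + 1

    annotations: Dict[str, str] = {}
    for key in sorted(seen):
        names = sorted(seen[key] - {"None"})
        if not names:
            annotation = "Any"
        elif len(names) == 1:
            annotation = names[0]
        else:
            annotation = "Union[" + ", ".join(names) + "]"
        # Treat a field as optional if some document lacks it or stores None in it.
        if counts[key] < len(documents) or "None" in seen[key]:
            annotation = "Optional[" + annotation + "]"
        annotations[key] = annotation
    return annotations


docs = [
    {"full name": "Alice", "age": 30, "is_active": True},
    {"full name": "Bob", "age": "unknown"},
]
print(infer_fields(docs))
# {'age': 'Union[int, str]', 'full name': 'str', 'is_active': 'Optional[bool]'}
```

A larger sample makes it more likely that rarely present fields and rare value types are observed, which is the trade-off the sample-size setting controls.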
Installation
pip install mongodb-typegen
Usage
The primary command is generate, which creates the Python models file.
mongodb-typegen generate --db <database_name>
Two other commands support this workflow: list-collections shows what a database contains, and preview displays the inferred schema for a single collection before you generate anything.
Commands
generate
Generate TypedDict models from your MongoDB collections.
Option | Short | Default | Description |
--uri | -u | mongodb://localhost:27017/ | MongoDB connection string. |
--db | -d | (Required) | Name of the MongoDB database. |
--out | -o | generated_models.py | Output file path for generated models. |
--sample-size | -s | 100 | Number of documents to sample per collection. |
--collections | -c | | Comma-separated list of collections to process. |
--exclude | -e | | Comma-separated list of collections to exclude. |
--dry-run | | | Show generated code without writing to a file. |
--verbose | -v | | Enable verbose output. |
--quiet | -q | | Suppress all output except errors. |
Examples:
mongodb-typegen generate --db analytics
mongodb-typegen generate --db ecommerce --collections users,products --out models/db_types.py
mongodb-typegen generate --db app_logs --collections logs --dry-run
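The --sample-size option bounds how many documents are read per collection. How the tool draws its sample is not stated here; one common approach with PyMongo is the `$sample` aggregation stage, sketched below. The helper name `sample_documents` is hypothetical, and the function accepts any object that offers a PyMongo-style `aggregate(pipeline)` method.

```python
from typing import Any, Dict, List


def sample_documents(collection: Any, sample_size: int = 100) -> List[Dict[str, Any]]:
    """Return up to `sample_size` randomly chosen documents from `collection`.

    `collection` is expected to behave like a pymongo Collection, i.e. to
    offer an `aggregate(pipeline)` method that yields documents.
    """
    if sample_size < 1:
        raise ValueError("sample_size must be a positive integer")
    # $sample asks the server to pick documents at random.
    pipeline = [{"$sample": {"size": sample_size}}]
    return list(collection.aggregate(pipeline))
```

With a live connection, the object passed in would typically be `MongoClient(uri)[database_name][collection_name]`.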
list-collections
List all collections in a specified database.
Option | Short | Default | Description |
--uri | -u | mongodb://localhost:27017/ | MongoDB connection string. |
--db | -d | (Required) | Name of the MongoDB database. |
Example:
mongodb-typegen list-collections --db my_app
preview
Preview the inferred schema and TypedDict for a single collection without generating a file. This is useful for quick inspection.
Option / Argument | Short | Default | Description |
--uri | -u | mongodb://localhost:27017/ | MongoDB connection string. |
--db | -d | (Required) | Name of the MongoDB database. |
COLLECTION_NAME | | (Required) | The name of the collection to preview. |
--sample-size | -s | 10 | Number of documents to sample for the preview. |
Example:
mongodb-typegen preview --db my_app users
Example Output
Given a collection named users with documents like this:
{
    "_id": ObjectId("60d5f3f7e8b4f6f8f8f8f8f8"),
    "full name": "Alice",
    "email": "alice@example.com",
    "age": 30,
    "is_active": true,
    "profile": {
        "bio": "Developer",
        "website": "https://a.com"
    }
}
The tool will generate the following TypedDict classes:
from typing import TypedDict, List, Optional, Union, Any
from datetime import datetime
from bson.objectid import ObjectId

UsersProfile = TypedDict("UsersProfile", {
    'bio': str,
    'website': str
})

Users = TypedDict("Users", {
    '_id': ObjectId,
    'age': int,
    'email': str,
    'full name': str,
    'is_active': bool,
    'profile': UsersProfile
})
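The generated classes use the functional TypedDict syntax because a key such as 'full name' is not a valid Python identifier and so cannot be declared as a class attribute. The sketch below shows how such models might be consumed in application code. It uses trimmed-down copies of the classes above (the `_id` field is left out so the snippet runs without pymongo installed), and the `describe` function is a hypothetical example, not part of the tool.

```python
from typing import TypedDict

# Trimmed-down copies of the generated classes, without the ObjectId field.
UsersProfile = TypedDict("UsersProfile", {
    'bio': str,
    'website': str
})

Users = TypedDict("Users", {
    'age': int,
    'email': str,
    'full name': str,
    'is_active': bool,
    'profile': UsersProfile
})


def describe(user: Users) -> str:
    # A static type checker knows 'full name' is a str and 'profile' is a UsersProfile.
    return f"{user['full name']} <{user['email']}> ({user['profile']['bio']})"


alice: Users = {
    "age": 30,
    "email": "alice@example.com",
    "full name": "Alice",
    "is_active": True,
    "profile": {"bio": "Developer", "website": "https://a.com"},
}

print(describe(alice))  # Alice <alice@example.com> (Developer)
```

TypedDict adds no runtime validation: `alice` is an ordinary dict, and mistakes such as a misspelled key are reported only when a static checker like mypy or pyright is run over the code.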
License
This project is licensed under the MIT License. See the LICENSE file for details.