MockAI
Mock LLM endpoints for testing
MockAI provides a local server that interoperates with multiple LLM SDKs, so you can call these APIs as usual but receive mock or pre-determined responses at no cost!
The package currently provides full support for OpenAI and Anthropic. It patches these libraries directly under the hood, so it always stays up to date.
Installation
pip install ai-mock
poetry add ai-mock
uv add ai-mock
Usage
Start the MockAI server
This is the server the mock clients will communicate with; later we'll see how to configure our own pre-determined responses :).
$ mockai
Chat Completions
To use a mock version of these providers, you only have to change a single line of code (and just barely!):
- from openai import OpenAI # Real Client
+ from mockai.openai import OpenAI # Fake Client
client = OpenAI()
response = client.chat.completions.create(
model="gpt-5",
messages=[
{
"role": "user",
"content": "Hi Mock!"
}
],
    temperature=0.7,
    top_p=0.95
)
print(response.choices[0].message.content)
Alternatively, you can use the real SDK and set the base URL to the MockAI server address:
from openai import OpenAI
client = OpenAI(api_key="not used but required", base_url="http://localhost:8100/openai")
response = client.chat.completions.create(
model="gpt-5",
messages=[
{
"role": "user",
"content": "Hi Mock!"
}
],
    temperature=0.7,
    top_p=0.95
)
print(response.choices[0].message.content)
MockAI also provides clients for Anthropic:
from mockai.anthropic import Anthropic
client = Anthropic()
response = client.messages.create(
model="claude-3.5-opus",
messages=[{"role": "user", "content": "What's up!"}],
max_tokens=1024
)
print(response.content)
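As with OpenAI, the real Anthropic SDK can also be pointed at the MockAI server; a minimal sketch, assuming the server exposes an /anthropic route alongside the /openai route shown above:

from anthropic import Anthropic

client = Anthropic(
    api_key="not used but required",
    base_url="http://localhost:8100/anthropic",  # assumed route, mirroring /openai
)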
And of course the async versions of all clients are supported:
from mockai.openai import AsyncOpenAI
from mockai.anthropic import AsyncAnthropic
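The async clients mirror the sync API; a quick sketch with the same call shape as the real SDK:

import asyncio

from mockai.openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI()
    # Same call as the sync client, just awaited
    response = await client.chat.completions.create(
        model="gpt-5",
        messages=[{"role": "user", "content": "Hi Mock!"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())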
Streaming is supported as well:
from mockai.openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
model="gpt-5",
messages=[{"role": "user", "content": "Hi mock!"}],
    stream=True
)
for chunk in response:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content)
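Anthropic streaming should work the same way; a sketch assuming the mock client mirrors the real SDK's stream=True events:

from mockai.anthropic import Anthropic

client = Anthropic()

stream = client.messages.create(
    model="claude-3.5-opus",
    messages=[{"role": "user", "content": "Hi mock!"}],
    max_tokens=1024,
    stream=True,
)

# Text arrives in content_block_delta events, as in the real SDK
for event in stream:
    if event.type == "content_block_delta":
        print(event.delta.text)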
To learn more about the usage of each client, check the docs of the respective provider; the mock clients behave exactly the same!
Tool Calling
All mock clients also work with tool calling! To trigger a tool call, you must specify it in a pre-determined response (covered in the next section); the example below assumes one is configured for the input "Function!".
from mockai.openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
model="gpt-5",
messages=[{"role": "user", "content": "Function!"}],
)
print(response.choices[0].message.tool_calls[0].function.name)
print(response.choices[0].message.tool_calls[0].function.arguments)
Configure responses
The MockAI server takes an optional path to a JSON file where we can define our responses for both completions and tool calls. The structure of the JSON is simple: each object must have a "type" key with a value of "text" or "function", an "input" key, whose value is matched against the user's message, and an "output" key, whose value is returned when the input matches.
[
{
"type": "text",
"input": "How are ya?",
"output": "I'm fine, thank u 😊. How about you?"
},
{
"type": "function",
"input": "Where's my order?",
"output": {
"name": "get_delivery_date",
"arguments": {
"order_id": "1337"
}
}
}
]
When creating your .json file, please follow these rules:

- Each response must have a "type" key, whose value must be either "text" or "function"; this determines the response object of the client.
- Responses of type "text" must have an "output" key with a string value.
- Responses of type "function" must have an "output" object with a "name" key holding the name of the function, and an "arguments" key holding a dict of args and values (example: {"weather": "42 degrees Fahrenheit"}).
- Responses of type "function" can accept a list of such objects, to simulate parallel tool calls (see the example after this list).
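For example, a parallel tool call entry might look like this (the function names and arguments here are hypothetical; the list form is assumed to mirror the single-object form above):

[
    {
        "type": "function",
        "input": "Check the weather and the time",
        "output": [
            {
                "name": "get_weather",
                "arguments": { "location": "Paris" }
            },
            {
                "name": "get_time",
                "arguments": { "timezone": "CET" }
            }
        ]
    }
]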
Load the JSON file
To start a MockAI server with our JSON file, we just need to pass its path to the mockai command.
$ mockai mock_responses.json
$ mockai ~/home/foo/bar/mock_responses.json
With this, our mock clients will have access to our pre-determined responses!
from mockai.openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
model="gpt-5",
messages=[{"role": "user", "content": "How are ya?"}],
)
print(response.choices[0].message.content)
response = client.chat.completions.create(
model="gpt-5",
messages=[{"role": "user", "content": "Where's my order?"}],
)
print(response.choices[0].message.tool_calls[0].function.name)
print(response.choices[0].message.tool_calls[0].function.arguments)
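Because the responses are deterministic, they drop straight into a test suite. A minimal pytest sketch, assuming the server is running with the mock_responses.json above (the test name is illustrative):

from mockai.openai import OpenAI

def test_predetermined_greeting():
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-5",
        messages=[{"role": "user", "content": "How are ya?"}],
    )
    # Matches the "output" configured for this input in mock_responses.json
    assert response.choices[0].message.content == "I'm fine, thank u 😊. How about you?"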