
MLX Omni Server
MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. It implements OpenAI-compatible API endpoints, enabling seamless integration with existing OpenAI SDK clients while leveraging the power of local ML inference.
The server implements the following OpenAI-compatible endpoints:
/v1/chat/completions - Chat completions
/v1/audio/speech - Text-to-Speech
/v1/audio/transcriptions - Speech-to-Text
/v1/models - List models
/v1/models/{model} - Retrieve or delete a model
/v1/images/generations - Image generation

# Install using pip
pip install mlx-omni-server
There are two ways to use MLX Omni Server: running it as a standalone HTTP server, or calling the FastAPI application in-process with TestClient.
# If installed via pip as a package
mlx-omni-server
The default port is 10240. You can use --port to specify a different one, for example: mlx-omni-server --port 8000
You can view more startup parameters with mlx-omni-server --help.
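Once the server is up, you can sanity-check it by listing the available models over plain HTTP. A minimal sketch using only the standard library, assuming the default port 10240:

import json
from urllib.request import urlopen

# Query the local server's model list endpoint
with urlopen("http://localhost:10240/v1/models") as resp:
    print(json.load(resp))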
from openai import OpenAI

# Configure client to use local server
client = OpenAI(
    base_url="http://localhost:10240/v1",  # Point to local server
    api_key="not-needed"  # API key is not required for local server
)
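With the client pointed at the local server, any standard OpenAI SDK call works; for example, listing the models the server knows about:

# List locally available models via the OpenAI SDK
for model in client.models.list():
    print(model.id)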
For development or testing, you can use TestClient to interact directly with the application without starting a server:
from openai import OpenAI
from fastapi.testclient import TestClient
from mlx_omni_server.main import app

# Use TestClient to interact directly with the application
client = OpenAI(
    base_url="http://testserver/v1",  # requests are routed in-process, no real host needed
    api_key="not-needed",  # API key is not required for local inference
    http_client=TestClient(app)  # Use TestClient directly, no network service needed
)
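This works because FastAPI's TestClient is an httpx.Client subclass that dispatches requests straight to the ASGI application in memory, so no port has to be opened.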
Regardless of which method you choose, you can use the client in the same way:
# Chat Completion Example
chat_completion = client.chat.completions.create(
    model="mlx-community/Llama-3.2-1B-Instruct-4bit",
    messages=[
        {"role": "user", "content": "What can you do?"}
    ]
)
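The response follows the standard OpenAI schema, so the generated reply can be read from the first choice:

# Print the assistant's reply
print(chat_completion.choices[0].message.content)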
# Text-to-Speech Example
response = client.audio.speech.create(
    model="lucasnewman/f5-tts-mlx",
    input="Hello, welcome to MLX Omni Server!"
)
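The speech call returns raw audio bytes; one way to persist them is shown below (the file name is arbitrary, and the actual audio format depends on the model):

# Save the synthesized audio to disk
response.write_to_file("speech.wav")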
# Speech-to-Text Example
with open("speech.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="mlx-community/whisper-large-v3-turbo",
        file=audio_file
    )
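The transcription result exposes the recognized text directly:

# Print the transcribed text
print(transcript.text)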
# Image Generation Example
image_response = client.images.generate(
    model="argmaxinc/mlx-FLUX.1-schnell",
    prompt="A serene landscape with mountains and a lake",
    n=1,
    size="512x512"
)
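Per the OpenAI images schema, each generated image carries either a URL or a base64 payload. The sketch below assumes the server returns base64 data (b64_json), which may not hold for every configuration:

import base64

# Decode and save the first generated image, assuming a base64 payload
b64 = image_response.data[0].b64_json
if b64:
    with open("landscape.png", "wb") as f:
        f.write(base64.b64decode(b64))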
You can view more examples in the examples directory of the repository.
We welcome contributions! If you're interested in contributing to MLX Omni Server, please check out the Development Guide for detailed information.
For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the MIT License - see the LICENSE file for details.
This project is not affiliated with or endorsed by OpenAI or Apple. It's an independent implementation that provides OpenAI-compatible APIs using Apple's MLX framework.