
apple-local-llm
Call Apple's on-device Foundation Models from JavaScript — no servers, no setup.
Works with Node.js, Electron, and VS Code extensions.
npm install apple-local-llm
import { createClient } from "apple-local-llm";
const client = createClient();
// Check compatibility first
const compat = await client.compatibility.check();
if (!compat.compatible) {
  console.log("Not available:", compat.reasonCode);
  // Handle fallback to cloud API
}
// Generate a response
const result = await client.responses.create({
  input: "What is the capital of France?",
});
if (result.ok) {
  console.log(result.text); // "The capital of France is Paris."
}
// Stream a response chunk by chunk
for await (const chunk of client.stream({ input: "Count from 1 to 5." })) {
  if ("delta" in chunk) {
    process.stdout.write(chunk.delta);
  }
}
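The fallback hinted at in the comment above can be sketched as a small wrapper. This is an illustration only: `LocalClient`, `CloudFallback`, and `generateLocalFirst` are hypothetical names, the types are minimal structural stand-ins for the shapes shown in this README rather than the package's exported types, and falling back on every local error is an assumed policy.

```typescript
// Minimal structural stand-ins for the shapes documented in this README.
type CompatibilityResult = { compatible: boolean; reasonCode?: string };
type CreateResult =
  | { ok: true; text: string; request_id?: string }
  | { ok: false; error: { code: string; detail: string } };

interface LocalClient {
  compatibility: { check(): Promise<CompatibilityResult> };
  responses: { create(params: { input: string }): Promise<CreateResult> };
}

// Hypothetical: any function that answers a prompt through a remote API.
type CloudFallback = (input: string) => Promise<string>;

async function generateLocalFirst(
  client: LocalClient,
  cloud: CloudFallback,
  input: string,
): Promise<{ source: "local" | "cloud"; text: string }> {
  const compat = await client.compatibility.check();
  if (compat.compatible) {
    const result = await client.responses.create({ input });
    if (result.ok) {
      return { source: "local", text: result.text };
    }
    // Assumed policy: any local error falls through to the cloud path.
  }
  return { source: "cloud", text: await cloud(input) };
}
```

The caller receives the `source` alongside the text, which makes it easy to log how often the on-device path is actually used.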
createClient(options?)
Creates a new client instance.
const client = createClient({
  model: "default", // Optional: model identifier (currently only "default")
  onLog: (msg) => console.log(msg), // Optional: debug logging
  idleTimeoutMs: 5 * 60 * 1000, // Optional: helper idle timeout (default: 5 min)
});
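Defaults:
- Helper idle timeout: 5 minutes (idleTimeoutMs)
- Request timeout: 60 seconds (override per request with timeoutMs)
You can also import and instantiate the class directly: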
import { AppleLocalLLMClient } from "apple-local-llm";
const client = new AppleLocalLLMClient(options);
client.compatibility.check()
Check if the local model is available. Always call this before making requests.
const result = await client.compatibility.check();
// { compatible: true }
// or { compatible: false, reasonCode: "AI_DISABLED" }
Reason codes:
| Code | Description |
|---|---|
| NOT_DARWIN | Not running on macOS |
| UNSUPPORTED_HARDWARE | Not Apple Silicon |
| AI_DISABLED | Apple Intelligence not enabled |
| MODEL_NOT_READY | Model still downloading |
| SPAWN_FAILED | Helper binary failed to start |
| HELPER_NOT_FOUND | Helper binary not found |
| HELPER_UNHEALTHY | Helper process not responding correctly |
| PROTOCOL_MISMATCH | Helper version incompatible with client |
client.capabilities.get()
Get detailed model capabilities (calls the helper).
const caps = await client.capabilities.get();
// { available: true, model: "apple-on-device" }
// or { available: false, reasonCode: "AI_DISABLED" }
client.responses.create(params)
Generate a response.
const result = await client.responses.create({
  input: "Your prompt here",
  model: "default", // Optional: model identifier
  max_output_tokens: 500, // Optional: limit response tokens
  stream: false, // Optional
  signal: abortController.signal, // Optional: AbortSignal
  timeoutMs: 60000, // Optional: request timeout (ms)
  response_format: { // Optional: structured JSON output
    type: "json_schema",
    json_schema: {
      name: "Result",
      schema: { type: "object", properties: { ... } }
    }
  }
});
Structured Output Example:
const result = await client.responses.create({
  input: "List 3 colors",
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "Colors",
      schema: {
        type: "object",
        properties: {
          colors: { type: "array", items: { type: "string" } }
        }
      }
    }
  }
});
// Narrow on result.ok first: text only exists on the success branch
if (result.ok) {
  const data = JSON.parse(result.text); // { colors: ["red", "blue", "green"] }
}
Note: response_format is not supported with streaming.
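Because `JSON.parse` returns an untyped value, it can help to validate the parsed text before using it. The following is a minimal sketch for the `Colors` schema from the example above. `Colors`, `isColors`, and `parseColors` are illustrative names and not part of the package.

```typescript
interface Colors {
  colors: string[];
}

// Type guard: true only for objects with a `colors` array of strings.
function isColors(value: unknown): value is Colors {
  if (typeof value !== "object" || value === null) return false;
  const colors = (value as { colors?: unknown }).colors;
  return Array.isArray(colors) && colors.every((c) => typeof c === "string");
}

// Returns the validated object, or null when the text is not valid JSON
// or does not match the expected shape.
function parseColors(text: string): Colors | null {
  try {
    const parsed: unknown = JSON.parse(text);
    return isColors(parsed) ? parsed : null;
  } catch {
    return null;
  }
}
```

With this in place, `parseColors(result.text)` can replace the bare `JSON.parse` call once `result.ok` has been checked.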
Returns ResponseResult on success, or an error object:
// Success:
{ ok: true, text: "...", request_id: "..." }
// Error:
{ ok: false, error: { code: "...", detail: "..." } }
Note: The return type is a discriminated union, not the exported ResponseResult interface.
Error codes:
| Code | Description |
|---|---|
| UNAVAILABLE | Model not available (see reason codes above) |
| TIMEOUT | Request timed out (default: 60s) |
| CANCELLED | Request was cancelled via AbortSignal |
| RATE_LIMITED | System rate limit exceeded |
| GUARDRAIL | Content violated Apple's safety guidelines |
| INTERNAL | Unexpected error |
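Some of these errors may be transient, so a caller might retry them with a growing delay. The sketch below assumes that only `RATE_LIMITED` and `TIMEOUT` are worth retrying; that classification, the `createWithRetry` helper, and its default values are illustrative choices and not part of the package. The result type is a local stand-in for the success and error shapes shown above.

```typescript
type CreateResult =
  | { ok: true; text: string }
  | { ok: false; error: { code: string; detail: string } };

// Assumption: these are the only codes treated as transient.
const RETRYABLE_CODES = ["RATE_LIMITED", "TIMEOUT"];

async function createWithRetry(
  attempt: () => Promise<CreateResult>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<CreateResult> {
  let last: CreateResult = await attempt();
  for (let i = 1; i < maxAttempts; i++) {
    // Stop on success or on an error that retrying will not fix.
    if (last.ok || RETRYABLE_CODES.indexOf(last.error.code) === -1) return last;
    // Exponential backoff: baseDelayMs, then 2x, then 4x, ...
    const delayMs = baseDelayMs * 2 ** (i - 1);
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    last = await attempt();
  }
  return last;
}
```

A call such as `createWithRetry(() => client.responses.create({ input: "..." }))` would then make at most three attempts before returning the last result.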
client.stream(params)
Async generator for streaming responses.
for await (const chunk of client.stream({ input: "..." })) {
  if ("delta" in chunk) {
    // Partial content
    console.log(chunk.delta);
  } else if ("done" in chunk) {
    // Final complete text
    console.log(chunk.text);
  }
}
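The loop above can be wrapped in a helper that forwards partial content as it arrives and returns the full text at the end. This is a sketch under assumptions: `StreamChunk` models only the two chunk shapes shown in this README (the real stream may yield other shapes, such as errors), and `collectStream` is a hypothetical name.

```typescript
// Only the two chunk shapes shown above; other shapes are not modeled.
type StreamChunk = { delta: string } | { done: true; text: string };

async function collectStream(
  chunks: AsyncIterable<StreamChunk>,
  onDelta?: (delta: string) => void,
): Promise<string> {
  let assembled = "";
  for await (const chunk of chunks) {
    if ("delta" in chunk) {
      assembled += chunk.delta;
      onDelta?.(chunk.delta); // forward partial content to the caller
    } else if ("done" in chunk) {
      return chunk.text; // final complete text reported by the stream
    }
  }
  // The stream ended without a final chunk: return what was accumulated.
  return assembled;
}
```

Passing `client.stream({ input: "..." })` as the first argument and a function that writes to the terminal as the second would print the response incrementally while still yielding the final string.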
client.responses.cancel(requestId)
Cancel an in-progress request.
const result = await client.responses.cancel("req_123");
// { ok: true } or { ok: false, error: { code: "NOT_RUNNING", detail: "..." } }
client.shutdown()
Gracefully shut down the helper process.
await client.shutdown();
All types are exported:
import type {
  ClientOptions,
  ReasonCode,
  CompatibilityResult,
  CapabilitiesResult,
  ResponsesCreateParams,
  ResponseResult,
  JSONSchema,
  ResponseFormat,
} from "apple-local-llm";
The fm-proxy binary can also be used directly from the command line:
# Simple prompt
fm-proxy "What is the capital of France?"
# Streaming output
fm-proxy --stream "Tell me a story"
fm-proxy -s "Tell me a story"
# Limit output tokens
fm-proxy --max-tokens=50 "Count to 100"
# Start HTTP server
fm-proxy --serve
fm-proxy --serve --port=3000
# Other options
fm-proxy --help # Show usage (or -h)
fm-proxy --version # Show version (or -v)
fm-proxy --stdio # stdio mode (used internally by npm package)
Run fm-proxy --serve to start a local HTTP server:
fm-proxy --serve --port=8080
Endpoints:
| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Health check and availability status |
| /generate | POST | Text generation (supports streaming) |
Options:
| Option | Description |
|---|---|
| --port=<PORT> | Set server port (default: 8080) |
| --auth-token=<TOKEN> | Require Bearer token for /generate |
You can also set the AUTH_TOKEN environment variable instead of passing --auth-token.
CORS: All endpoints support CORS with Access-Control-Allow-Origin: *.
Examples:
# Health check
curl http://127.0.0.1:8080/health
# Response: {"status":"ok","model":"apple-on-device","available":true}
# Simple generation
curl -X POST http://127.0.0.1:8080/generate \
-H "Content-Type: application/json" \
-d '{"input": "What is 2+2?"}'
# Response: {"text":"2+2 equals 4."}
# With max_output_tokens
curl -X POST http://127.0.0.1:8080/generate \
-H "Content-Type: application/json" \
-d '{"input": "Count to 100", "max_output_tokens": 50}'
# With structured output (response_format)
curl -X POST http://127.0.0.1:8080/generate \
-H "Content-Type: application/json" \
-d '{"input": "List 3 colors", "response_format": {"type": "json_schema", "json_schema": {"name": "Colors", "schema": {"type": "object", "properties": {"colors": {"type": "array", "items": {"type": "string"}}}}}}}'
# With authentication
curl -X POST http://127.0.0.1:8080/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <token>" \
-d '{"input": "Hello"}'
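The same requests can be assembled from JavaScript or TypeScript. The sketch below only builds the URL, headers, and body that the curl examples send. `buildGenerateRequest` and its option names are hypothetical, and the default base URL assumes the server is running on the default port.

```typescript
interface GenerateRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper mirroring the curl examples above.
function buildGenerateRequest(
  input: string,
  options: { baseUrl?: string; authToken?: string; maxOutputTokens?: number } = {},
): GenerateRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (options.authToken) {
    // Matches the "Authorization: Bearer <token>" header in the curl example.
    headers["Authorization"] = `Bearer ${options.authToken}`;
  }
  const body: Record<string, unknown> = { input };
  if (options.maxOutputTokens !== undefined) {
    body["max_output_tokens"] = options.maxOutputTokens;
  }
  const baseUrl = options.baseUrl ?? "http://127.0.0.1:8080";
  return {
    url: `${baseUrl}/generate`,
    init: { method: "POST", headers, body: JSON.stringify(body) },
  };
}
```

In an environment with a global `fetch` (browsers, recent Node.js versions), the result can be passed along as `fetch(url, init)`, and the response body is then expected to be JSON with a `text` field, as in the curl output above.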
Add "stream": true to get Server-Sent Events with OpenAI-compatible chunks:
curl -N -X POST http://127.0.0.1:8080/generate \
-H "Content-Type: application/json" \
-d '{"input": "Write a haiku", "stream": true}'
Response:
data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant"}}]}
data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"content":"..."}}]}
data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop"}]}
data: [DONE]
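A client has to pull the text out of these events. As a rough sketch, the function below takes an already buffered event stream body in the format shown above and concatenates every `delta.content` value until it reaches `[DONE]`. `extractSseContent` is a hypothetical name, and a real streaming client would also need to handle lines split across network chunks, which this example does not attempt.

```typescript
// Shape of the parts of each chunk this sketch reads; other fields are ignored.
type SseChunk = { choices?: Array<{ delta?: { content?: string } }> };

function extractSseContent(raw: string): { text: string; done: boolean } {
  let text = "";
  let done = false;
  for (const line of raw.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue; // skip blank or non-data lines
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") {
      done = true; // end-of-stream marker
      break;
    }
    if (payload === "") continue;
    const chunk = JSON.parse(payload) as SseChunk;
    // Role-only and finish chunks carry no content and contribute nothing.
    text += chunk.choices?.[0]?.delta?.content ?? "";
  }
  return { text, done };
}
```

The returned `done` flag lets the caller tell a complete response apart from a body that was cut off before the final marker.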
This package bundles a small native helper (fm-proxy) that communicates with Apple's Foundation Models framework over stdio. The helper is spawned on first request and stays alive to keep the model warm.
Call compatibility.check() before making requests and fall back to a cloud API when the local model is unavailable.
JS API (createClient()):
| Environment | Supported |
|---|---|
| Node.js | ✅ |
| Electron (main process) | ✅ |
| VS Code extensions | ✅ |
| Electron (renderer) | ❌ No child_process |
| Browser | ❌ |
HTTP Server (fm-proxy --serve):
| Environment | Supported |
|---|---|
| Any HTTP client | ✅ |
| Browser (fetch) | ✅ |
| Electron (renderer) | ✅ |
MIT