Big News: Socket raises $60M Series C at a $1B valuation to secure software supply chains for AI-driven development.Announcement →

anymodel

Advanced tools

License

Install Socket

Detect and block malicious and high-risk dependencies

Install

anymodel - npm Package Compare versions

Comparing version

1.14.1

1.15.0

+183

providers/skill-bridge.mjs

		// providers/skill-bridge.mjs — increment 0013 (universal skill loader).
		//
		// SKILL.md is ONE shared open standard (agentskills.io): Claude Code, OpenAI/Codex,
		// Google Gemini/Antigravity, Cursor, Copilot and Goose all write the SAME format —
		// a directory `<name>/SKILL.md` (YAML frontmatter + Markdown body, optional
		// scripts/references/assets). Only the DISCOVERY PATH differs.
		//
		// The bundled Claude Code client only scans `.claude/skills`. Rather than patch its
		// (minified) loader, AnyModel bridges at LAUNCH time: discover foreign-ecosystem
		// skills, symlink each into a per-session temp `.claude/skills` shadow, and pass
		// `--add-dir <shadow>` — the client already scans the `.claude/skills` of every
		// added directory, so its native SKILL.md reader + progressive disclosure handle
		// everything. Zero format translation. Works for the bundled client and stock `claude`.

		import {
		existsSync, readdirSync, statSync, lstatSync, realpathSync,
		mkdtempSync, mkdirSync, symlinkSync, rmSync,
		} from 'fs';
		import { join, isAbsolute, resolve, sep } from 'path';
		import { tmpdir } from 'os';

		// Directory-symlink type: 'junction' on Windows (needs no privilege), else 'dir'.
		const LINK_TYPE = process.platform === 'win32' ? 'junction' : 'dir';

		// Conventional foreign skill roots (relative to a base dir), in precedence order.
		export const FOREIGN_SKILL_DIRS = [
		'.agents/skills', // cross-tool interop convention (Codex, Cursor, Copilot, Goose, Gemini CLI)
		'.codex/skills', // OpenAI Codex
		'.gemini/skills', // Gemini CLI
		'.agent/skills', // Google Antigravity (singular)
		];

		// Default fs surface — injectable so the pure logic is testable without touching disk.
		const realFs = {
		existsSync, readdirSync, statSync, lstatSync, realpathSync,
		mkdtempSync, mkdirSync, symlinkSync, rmSync,
		};

		/**
		* Resolve absolute roots to scan: FOREIGN_SKILL_DIRS under cwd then $HOME, plus any
		* colon-separated ANYMODEL_SKILL_ROOTS resolved to absolute (relative entries against
		* cwd — a relative root would otherwise produce a broken symlink). De-duped, order
		* preserved. Pure (no fs).
		*/
		export function resolveForeignSkillRoots({ cwd, homeDir = '', env = {} } = {}) {
		const roots = [];
		if (cwd) for (const d of FOREIGN_SKILL_DIRS) roots.push(join(cwd, d));
		if (homeDir) for (const d of FOREIGN_SKILL_DIRS) roots.push(join(homeDir, d));
		const extra = env.ANYMODEL_SKILL_ROOTS;
		if (extra) {
		for (const r of String(extra).split(':').map(s => s.trim()).filter(Boolean)) {
		roots.push(isAbsolute(r) ? r : resolve(cwd \|\| '.', r));
		}
		}
		return [...new Set(roots)];
		}

		/**
		* Discover skills under the given roots. Each entry: a directory `<root>/<name>` that
		* contains a `SKILL.md`. Optional Codex sidecar `<dir>/agents/openai.yaml` is noted.
		* fs is injectable for testing. Order: roots in order, names in directory order.
		* @returns {{name, root, dir, skillMdPath, sidecarPath: string\|null}[]}
		*/
		export function discoverForeignSkills(roots, fs = realFs) {
		const found = [];
		const seen = new Set();
		for (const root of roots) {
		if (!fs.existsSync(root)) continue;
		// Canonicalize for dedup so an aliased/symlinked root (e.g. cwd === a symlink to
		// $HOME) doesn't double-discover and produce misleading "duplicate name" shadows.
		let canonRoot = root;
		try { canonRoot = fs.realpathSync(root); } catch { /* keep raw on failure */ }
		if (seen.has(canonRoot)) continue;
		seen.add(canonRoot);
		let names;
		try { names = fs.readdirSync(root); } catch { continue; }
		for (const name of names) {
		const dir = join(root, name);
		let lst;
		try { lst = fs.lstatSync(dir); } catch { continue; }
		if (lst.isSymbolicLink()) {
		// Containment: a symlinked entry must resolve INSIDE the scanned root.
		// Otherwise a `.codex/skills/x -> /Users/me/.ssh` link in an untrusted repo
		// would be --add-dir'd and exposed to the driven model. Skip escapers.
		let target;
		try { target = fs.realpathSync(dir); } catch { continue; }
		if (target !== canonRoot && !target.startsWith(canonRoot + sep)) continue;
		} else if (!lst.isDirectory()) {
		continue;
		}
		const skillMdPath = join(dir, 'SKILL.md');
		if (!fs.existsSync(skillMdPath)) continue;
		const sidecarPath = join(dir, 'agents', 'openai.yaml');
		found.push({
		name, root, dir, skillMdPath,
		sidecarPath: fs.existsSync(sidecarPath) ? sidecarPath : null,
		});
		}
		}
		return found;
		}

		/**
		* Plan which discovered skills to link. A project-local `.claude/skills/<name>` wins
		* (foreign same-name shadowed). Among foreign skills, the first occurrence of a `name`
		* wins; later duplicates are shadowed. Both shadow reasons are recorded. Pure.
		* @returns {{link: object[], shadowed: object[]}}
		*/
		export function planSkillBridge(discovered, projectSkillNames = []) {
		// Keys are lower-cased: the symlink shadow lives on a possibly case-INSENSITIVE FS
		// (macOS APFS, Windows NTFS), where 'Foo' and 'foo' collide. Normalizing here turns
		// that collision into a LOGGED shadow instead of a swallowed EEXIST at symlink time.
		const project = new Set(projectSkillNames.map(n => n.toLowerCase()));
		const claimed = new Set();
		const link = [], shadowed = [];
		for (const s of discovered) {
		const key = s.name.toLowerCase();
		if (project.has(key)) { shadowed.push({ ...s, reason: 'shadowed by project .claude/skills' }); continue; }
		if (claimed.has(key)) { shadowed.push({ ...s, reason: 'shadowed by earlier root (duplicate name)' }); continue; }
		claimed.add(key);
		link.push(s);
		}
		return { link, shadowed };
		}

		/**
		* Materialize the bridge: a temp dir with `.claude/skills/<name>` symlinks → skill dirs.
		* Unlinkable entries are recorded in `skipped` (with the errno) rather than silently
		* dropped — so callers can surface e.g. Windows-without-symlink-privilege. Returns null
		* if there was nothing to link; if links were planned but ALL failed, the temp dir is
		* removed and `bridgeDir` is null (no orphan). I/O (fs + tmpBase injectable).
		* @returns {{bridgeDir: string\|null, linked: string[], skipped: {name,code}[]}\|null}
		*/
		export function materializeSkillBridge(plan, { tmpBase = tmpdir(), fs = realFs } = {}) {
		if (!plan \|\| !plan.link \|\| !plan.link.length) return null;
		const bridgeDir = fs.mkdtempSync(join(tmpBase, 'anymodel-skills-'));
		const skillsDir = join(bridgeDir, '.claude', 'skills');
		fs.mkdirSync(skillsDir, { recursive: true });
		const linked = [], skipped = [];
		for (const s of plan.link) {
		try {
		fs.symlinkSync(s.dir, join(skillsDir, s.name), LINK_TYPE);
		linked.push(s.name);
		} catch (e) {
		skipped.push({ name: s.name, code: (e && e.code) \|\| 'ELINK' });
		}
		}
		if (!linked.length) {
		// Everything failed — don't leak the temp dir we just created.
		try { fs.rmSync(bridgeDir, { recursive: true, force: true }); } catch { /* best-effort */ }
		return { bridgeDir: null, linked, skipped };
		}
		return { bridgeDir, linked, skipped };
		}

		/**
		* Read the names of project-local skills (`<cwd>/.claude/skills/<name>/SKILL.md`) so
		* the bridge can let them win on a name collision. Best-effort. fs injectable.
		*/
		export function readProjectSkillNames(cwd, fs = realFs) {
		const dir = join(cwd, '.claude', 'skills');
		if (!fs.existsSync(dir)) return [];
		let names;
		try { names = fs.readdirSync(dir); } catch { return []; }
		return names.filter(name => {
		try {
		return fs.statSync(join(dir, name)).isDirectory() && fs.existsSync(join(dir, name, 'SKILL.md'));
		} catch { return false; }
		});
		}

		/**
		* End-to-end: resolve roots → discover → plan (with project precedence) → materialize.
		* Returns the full picture so callers can log discovery + shadowing. fs injectable.
		*/
		export function buildSkillBridge({ cwd, homeDir = '', env = {} } = {}, fs = realFs, { tmpBase } = {}) {
		const roots = resolveForeignSkillRoots({ cwd, homeDir, env });
		const discovered = discoverForeignSkills(roots, fs);
		const projectSkillNames = cwd ? readProjectSkillNames(cwd, fs) : [];
		const plan = planSkillBridge(discovered, projectSkillNames);
		const bridge = materializeSkillBridge(plan, { tmpBase, fs });
		return { roots, discovered, plan, bridge };
		}

+198

providers/skill-catalog.mjs

		// providers/skill-catalog.mjs — increment 0010 (local skill-fidelity).
		//
		// Claude Code injects its skill catalog as a <system-reminder> in the first user
		// message. On local providers AnyModel strips that block for latency, which kills
		// skill auto-trigger. These pure, dependency-free helpers let proxy.mjs re-inject a
		// compact, budgeted, DETERMINISTIC (name-sorted, date-free) skill index + a curated
		// behavioral core into the system prefix — so a tool-capable local model knows which
		// skills exist and that matching one is a blocking precondition, while the block stays
		// byte-stable for prefix-cache (KV) reuse.

		const CATALOG_HEADER = 'The following skills are available for use with the Skill tool:';

		// "- name: rest" where name may be namespaced with a colon (e.g. sw:do). The name is
		// non-greedy so it stops at the first colon FOLLOWED by whitespace, not the inner one.
		const SKILL_LINE = /^-\s+([A-Za-z0-9_][\w:.-]*?):\s+(.+)$/;

		const HEADER_LINE =
		'Available skills (call the Skill tool when a request matches — matching is a BLOCKING REQUIREMENT, call Skill FIRST):';

		function flattenText(messages) {
		if (!Array.isArray(messages)) return '';
		const parts = [];
		for (const msg of messages) {
		if (!msg) continue;
		if (typeof msg.content === 'string') {
		parts.push(msg.content);
		} else if (Array.isArray(msg.content)) {
		for (const block of msg.content) {
		if (block && block.type === 'text' && typeof block.text === 'string') parts.push(block.text);
		}
		}
		}
		return parts.join('\n');
		}

		// Drop the " - whenToUse" tail that Claude Code appends to skill descriptions, then
		// clamp. Em-dashes (—) inside descriptions are not hyphens, so they don't match " - ".
		function cleanDesc(rest, descChars, keepWhenToUse) {
		let desc = rest.trim();
		if (!keepWhenToUse) {
		const cut = desc.search(/\s-\s/);
		if (cut > 0) desc = desc.slice(0, cut).trim();
		}
		if (desc.length > descChars) desc = desc.slice(0, descChars).trimEnd();
		return desc;
		}

		/**
		* Harvest the skill catalog from a messages array (string content + text blocks).
		* @returns {{ skills: {name:string, desc:string}[], rawCount:number }}
		*/
		export function harvestSkillCatalog(messages, { descChars = 140, keepWhenToUse = false } = {}) {
		const text = flattenText(messages);
		const headerIdx = text.indexOf(CATALOG_HEADER);
		if (headerIdx === -1) return { skills: [], rawCount: 0 };

		const after = text.slice(headerIdx + CATALOG_HEADER.length);
		const skills = [];
		let started = false;
		for (const line of after.split('\n')) {
		if (line.includes('</system-reminder>')) break;
		const m = line.match(SKILL_LINE);
		if (m) {
		started = true;
		skills.push({ name: m[1], desc: cleanDesc(m[2], descChars, keepWhenToUse) });
		} else if (started && line.trim() === '') {
		break; // blank line after the bullet list ends the catalog
		}
		}
		return { skills, rawCount: skills.length };
		}

		function priority(name, projectSkills) {
		if (/^sw:/.test(name)) return 0; // SpecWeave core workflow
		if (projectSkills.includes(name)) return 0; // project-local skills
		return 1;
		}

		function scoreRelevance(skill, queryWords) {
		if (!queryWords.length) return 0;
		const hay = (skill.name + ' ' + skill.desc).toLowerCase();
		let score = 0;
		for (const w of queryWords) if (w.length > 2 && hay.includes(w)) score++;
		return score;
		}

		/**
		* Compress a harvested catalog to a budgeted, name-sorted index block.
		* Keeps sw:* + project skills first, ranks the rest by query relevance, fills within
		* budgetChars, degrades to names-only when a full line won't fit, drops the overflow.
		* @returns {{ block:string, kept:number, dropped:number }}
		*/
		export function selectSkills(skills, { budgetChars = 4000, query = '', fidelity = 'balanced', projectSkills = [] } = {}) {
		if (!skills \|\| skills.length === 0) return { block: '', kept: 0, dropped: 0 };

		const queryWords = (query \|\| '').toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);

		// Rank: priority tier, then relevance, then name (stable deterministic tiebreak).
		const ranked = [...skills].sort((a, b) => {
		const pa = priority(a.name, projectSkills), pb = priority(b.name, projectSkills);
		if (pa !== pb) return pa - pb;
		const ra = scoreRelevance(a, queryWords), rb = scoreRelevance(b, queryWords);
		if (ra !== rb) return rb - ra;
		return a.name.localeCompare(b.name);
		});

		let used = HEADER_LINE.length + 1;
		const chosen = [];
		for (const s of ranked) {
		const full = `- ${s.name}: ${s.desc}`;
		const nameOnly = `- ${s.name}`;
		if (used + full.length + 1 <= budgetChars) {
		chosen.push({ name: s.name, line: full });
		used += full.length + 1;
		} else if (used + nameOnly.length + 1 <= budgetChars) {
		chosen.push({ name: s.name, line: nameOnly });
		used += nameOnly.length + 1;
		}
		}

		if (chosen.length === 0) return { block: '', kept: 0, dropped: skills.length };

		// Display order is name-sorted (selection was relevance-based) → byte-stable prefix.
		chosen.sort((a, b) => a.name.localeCompare(b.name));
		const block = [HEADER_LINE, ...chosen.map(c => c.line)].join('\n');
		return { block, kept: chosen.length, dropped: skills.length - chosen.length };
		}

		function latestUserText(messages) {
		if (!Array.isArray(messages)) return '';
		for (let i = messages.length - 1; i >= 0; i--) {
		const m = messages[i];
		if (!m \|\| m.role !== 'user') continue;
		const t = typeof m.content === 'string' ? m.content
		: Array.isArray(m.content) ? m.content.filter(b => b && b.type === 'text').map(b => b.text).join(' ') : '';
		if (t) return t;
		}
		return '';
		}

		/**
		* The full decision: given a request's messages, produce the deterministic string to
		* append to the system prefix (behavioral core + budgeted skill index), or '' for lean.
		* Pure — the proxy passes resolved env values in. Returns rawCount so the caller can
		* warn when the Skill tool is present but the catalog header drifted (harvest empty).
		* @returns {{ addition:string, injected:number, rawCount:number }}
		*/
		export function buildFidelityAddition(messages, {
		fidelity = 'balanced',
		skillIndexMode = 'auto',
		descChars = 140,
		numCtx = 32768,
		systemPct = 0.08,
		} = {}) {
		if (fidelity === 'lean') return { addition: '', injected: 0, rawCount: 0 };

		const parts = [];
		const core = buildBehavioralCore(fidelity);
		if (core) parts.push(core);

		let injected = 0;
		let rawCount = 0;
		if (skillIndexMode !== 'off' && Array.isArray(messages)) {
		const dc = descChars * (fidelity === 'full' ? 2 : 1);
		const harvested = harvestSkillCatalog(messages, { descChars: dc, keepWhenToUse: fidelity === 'full' });
		rawCount = harvested.rawCount;
		if (harvested.skills.length) {
		const ctxBudgetChars = Math.floor(numCtx * systemPct) * 4;
		const budgetChars = fidelity === 'full'
		? Math.min(Math.max(ctxBudgetChars, 4000), 16000)
		: Math.min(4000, Math.max(ctxBudgetChars, 2000));
		const { block, kept } = selectSkills(harvested.skills, {
		budgetChars,
		query: latestUserText(messages),
		fidelity,
		});
		if (block) { parts.push(block); injected = kept; }
		}
		}
		return { addition: parts.join('\n\n'), injected, rawCount };
		}

		/**
		* Curated, date-free Claude Code behavioral core. ~600-900 tokens; lean → ''.
		*/
		export function buildBehavioralCore(fidelity = 'balanced') {
		if (fidelity === 'lean') return '';
		const core = [
		'You are an agentic coding assistant operating through the Claude Code tool protocol.',
		'Be terse and direct. Lead with the answer. Use the tools available to you rather than describing what you would do.',
		'Plan before acting on multi-step work; satisfy dependencies before dependent steps; verify changes before claiming success.',
		'SKILLS: When a user request matches one of the available skills listed below, calling the Skill tool with that skill name is a BLOCKING REQUIREMENT — call Skill FIRST, before any other response or tool use. "simple", "quick", and "basic" are NOT opt-out phrases.',
		];
		if (fidelity === 'full') {
		core.push('Prefer reusing existing functions and patterns over writing new code. Match the surrounding code style. Never invent file paths or APIs — verify they exist before referencing them.');
		}
		return core.join('\n');
		}

+64

-8

cli.mjs

		@@ -14,3 +14,3 @@ #!/usr/bin/env node
		import { spawn, execSync } from 'child_process';
		import { existsSync, writeFileSync, mkdirSync } from 'fs';
		import { existsSync, writeFileSync, mkdirSync, rmSync } from 'fs';
		import { fileURLToPath } from 'url';
		@@ -20,2 +20,3 @@ import { join, dirname } from 'path';
		import { createProxy, loadEnv } from './proxy.mjs';
		import { buildSkillBridge } from './providers/skill-bridge.mjs';

		@@ -61,3 +62,3 @@ const PROVIDERS = ['openrouter', 'ollama', 'openai', 'lmstudio', 'llamacpp'];
		export function parseArgs(argv) {
		const opts = { provider: 'auto', port: 9090, host: null, model: null, help: false, freeOnly: false, token: null, rpm: 60, passthrough: [], fullMcp: false };
		const opts = { provider: 'auto', port: 9090, host: null, model: null, help: false, freeOnly: false, token: null, rpm: 60, passthrough: [], fullMcp: false, localFidelity: null };

		@@ -100,2 +101,6 @@ for (let i = 0; i < argv.length; i++) {
		opts.fullMcp = true;
		} else if (arg === '--local-fidelity' \|\| arg.startsWith('--local-fidelity=')) {
		// Local skill-fidelity tier: lean \| balanced \| full (default balanced). 0010.
		const v = arg.includes('=') ? arg.split('=')[1] : argv[++i];
		opts.localFidelity = (v \|\| '').toLowerCase();
		} else if (!arg.startsWith('-') && MODEL_PRESETS[arg] && !opts.model) {
		@@ -260,2 +265,4 @@ opts.model = MODEL_PRESETS[arg];
		--full-mcp Keep all globally-configured MCP servers (default on local: suppress global MCP)
		--local-fidelity <tier> Local skill-fidelity: lean \| balanced \| full (default balanced).
		balanced re-injects a compact skill catalog so skills auto-trigger on local models.
		--help, -h Show this help
		@@ -366,2 +373,32 @@
		// ── Mode 1: Launch Claude Code directly (no proxy) ──
		// ── Universal skill bridge ───────────────────────────
		// Discover foreign-ecosystem skills (.agents / .codex / .gemini / .agent) — all of
		// which use the same open SKILL.md standard — and symlink them into a per-session temp
		// `.claude/skills` shadow passed to the client via `--add-dir`, so the client's native
		// SKILL.md loader picks them up. Best-effort: never blocks launch. Returns { args, cleanup }.
		export function setupSkillBridge() {
		try {
		const cwd = process.cwd();
		const homeDir = process.env.HOME \|\| process.env.USERPROFILE \|\| '';
		const { discovered, plan, bridge } = buildSkillBridge({ cwd, homeDir, env: process.env });
		if (bridge && bridge.bridgeDir && bridge.linked.length) {
		console.log(`${C.green('[anymodel]')} Bridged ${bridge.linked.length} skill(s) from .agents/.codex/.gemini: ${C.cyan(bridge.linked.join(', '))}`);
		if (plan.shadowed.length) {
		console.log(`${C.yellow('[anymodel]')} ${plan.shadowed.length} foreign skill(s) shadowed by name conflict`);
		}
		if (bridge.skipped && bridge.skipped.length) {
		console.log(`${C.yellow('[anymodel]')} ${bridge.skipped.length} skill(s) could not be linked (${bridge.skipped.map(s => s.code).join(', ')})`);
		}
		const dir = bridge.bridgeDir;
		return { args: ['--add-dir', dir], cleanup: () => { try { rmSync(dir, { recursive: true, force: true }); } catch {} } };
		}
		// Found skills but linked none (e.g. Windows without symlink privilege) — say so,
		// don't pretend nothing was there.
		if (discovered.length && bridge && bridge.skipped && bridge.skipped.length) {
		console.log(`${C.yellow('[anymodel]')} ${discovered.length} foreign skill(s) found but none could be linked (${bridge.skipped.map(s => s.code).join(', ')}) — skills not loaded.`);
		}
		} catch { /* best-effort: skill discovery never blocks the client */ }
		return { args: [], cleanup: () => {} };
		}

		function launchClaude() {
		@@ -384,3 +421,4 @@ const client = findClient();

		const clientChild = spawn(client.cmd, client.args, {
		const skillBridge = setupSkillBridge();
		const clientChild = spawn(client.cmd, [...client.args, ...skillBridge.args], {
		stdio: 'inherit',
		@@ -390,4 +428,6 @@ env: process.env,

		clientChild.on('exit', (code, signal) => process.exit(code ?? (signal ? 1 : 0)));
		process.on('SIGINT', () => { clientChild.kill('SIGTERM'); process.exit(0); });
		clientChild.on('exit', (code, signal) => { skillBridge.cleanup(); process.exit(code ?? (signal ? 1 : 0)); });
		clientChild.on('error', (e) => { skillBridge.cleanup(); console.error(`${C.red('Error:')} failed to launch client: ${e.message}`); process.exit(1); });
		process.on('SIGINT', () => { skillBridge.cleanup(); clientChild.kill('SIGTERM'); process.exit(0); });
		process.on('SIGTERM', () => { skillBridge.cleanup(); clientChild.kill('SIGTERM'); process.exit(0); });
		}
		@@ -460,6 +500,10 @@

		// Universal skill bridge — applies to every provider (cloud + local).
		const skillBridge = setupSkillBridge();
		autoArgs.push(...skillBridge.args);

		console.log(`${C.green('[anymodel]')} Starting...`);
		console.log('');

		// clientArgs: auto-injected strict-mcp-config (if local provider) + user passthrough
		// clientArgs: auto-injected strict-mcp-config + skill-bridge --add-dir + user passthrough
		const clientArgs = [...client.args, ...autoArgs, ...opts.passthrough];
		@@ -480,4 +524,6 @@ const clientChild = spawn(client.cmd, clientArgs, {

		clientChild.on('exit', (code, signal) => process.exit(code ?? (signal ? 1 : 0)));
		process.on('SIGINT', () => { clientChild.kill('SIGTERM'); process.exit(0); });
		clientChild.on('exit', (code, signal) => { skillBridge.cleanup(); process.exit(code ?? (signal ? 1 : 0)); });
		clientChild.on('error', (e) => { skillBridge.cleanup(); console.error(`${C.red('Error:')} failed to launch client: ${e.message}`); process.exit(1); });
		process.on('SIGINT', () => { skillBridge.cleanup(); clientChild.kill('SIGTERM'); process.exit(0); });
		process.on('SIGTERM', () => { skillBridge.cleanup(); clientChild.kill('SIGTERM'); process.exit(0); });
		}
		@@ -492,2 +538,12 @@

		// Local skill-fidelity tier (increment 0010) → drive the in-process proxy via env.
		const localFidelity = opts.localFidelity \|\| (process.env.ANYMODEL_LOCAL_FIDELITY \|\| '').toLowerCase();
		if (localFidelity) {
		if (!['lean', 'balanced', 'full'].includes(localFidelity)) {
		console.error(`${C.red('Error:')} --local-fidelity must be lean \| balanced \| full (got "${localFidelity}")`);
		process.exit(1);
		}
		process.env.LOCAL_FIDELITY = localFidelity;
		}

		let providerName = opts.provider;
		@@ -494,0 +550,0 @@ if (providerName === 'auto') {

+28

-0

LOCAL_SETUP.md

		@@ -221,3 +221,31 @@ # Running Claude Code locally through AnyModel → LMStudio
		\| `ANYMODEL_FULL_MCP` \| `0` \| Set to `1` to keep global MCP servers on local \|
		\| `LOCAL_FIDELITY` \| `balanced` \| Skill-fidelity tier — see below \|
		\| `LOCAL_SKILL_INDEX` \| `auto` \| `on`/`off`/`auto` — gate the skill index independently \|
		\| `LOCAL_MAX_SYSTEM_PCT` \| `0.08` \| Fraction of context budgeted for the curated system block \|
		\| `LOCAL_SKILL_DESC_CHARS` \| `140` \| Max chars per skill description in the index \|

		## Skill auto-trigger on local models (`--local-fidelity`)

		To stay fast, the proxy condenses Claude Code's 50-100 KB system prompt and strips the
		`<system-reminder>` that carries the skill catalog. Without that catalog a local
		model can't auto-trigger skills (it still has the `Skill` tool, but doesn't know which
		skills exist). The fidelity layer fixes this by re-injecting a compact, name-sorted skill
		index + a curated behavioral core into the system prefix — deterministic, so it's a
		one-time cost that the prefix cache reuses from turn 2 onward.

		```bash
		anymodel proxy lmstudio --local-fidelity balanced # default
		```

		\| Tier \| What it re-injects \| Cold turn-1 TTFT \| When to use \|
		\|---\|---\|---\|---\|
		\| `lean` \| nothing (current pre-0010 behavior) \| no change \| latency purists; you don't use skills locally \|
		\| `balanced` (default) \| curated behavioral core (~700 tok) + clipped skill index (≤~1000 tok, `whenToUse` dropped) \| +~0.7-1.3 s, then ~0 ms (KV reuse) \| the daily driver — skills auto-trigger \|
		\| `full` \| richer index (keeps `whenToUse`, higher clamp) + fuller rules \| +~1.7-3.3 s \| 131 K ctx or the 80B model \|

		Measured on M4 Max / qwen3-coder-30b MLX: `lean` triggers skills 0/12 of the time
		(catalog stripped); `balanced` triggers 9/12 (75%) with valid skill names. Run the
		harness yourself: `node test/skill-trigger-eval.mjs` (needs a running proxy). MCP
		suppression is unaffected by this flag — use `--full-mcp` for that.

		## The full three-command reference
		@@ -224,0 +252,0 @@

+1

-1

package.json

		{
		"name": "anymodel",
		"version": "1.14.1",
		"version": "1.15.0",
		"description": "Universal AI model proxy — route any coding tool through OpenRouter, Ollama, LMStudio, llama.cpp, or any LLM provider",
		@@ -5,0 +5,0 @@ "type": "module",

+46

-2

providers/ollama.mjs

		@@ -72,2 +72,44 @@ // Ollama provider for anymodel

		// Strip a `data:<mime>;base64,` prefix → raw base64. Ollama's native /api/chat
		// `images` field wants RAW base64 with NO data-URI prefix. Returns null when the
		// string is not a base64 data URI (e.g. an http(s) URL), which the native field
		// cannot carry.
		export function dataUriToBase64(url) {
		if (typeof url !== 'string') return null;
		const m = /^data:[^;,];base64,([\s\S])$/.exec(url);
		return m ? m[1] : null;
		}

		// Ollama's NATIVE /api/chat does NOT understand OpenAI content-part arrays
		// ({type:'image_url'\|'text'}). It wants `content` as a plain STRING plus a
		// top-level `images` array of RAW base64 on the message (see the official
		// docs/api.md "Chat request (with images)"). translateRequest() (shared with the
		// OpenAI provider) emits image_url parts, so we post-process here: collapse text
		// parts into the string content and hoist each base64 image into message.images.
		// A URL-sourced image can't be sent to the native field, so it becomes a visible
		// text marker — never a silent drop.
		export function toOllamaNativeMessages(messages) {
		return messages.map(msg => {
		if (!Array.isArray(msg.content)) return msg;
		const textParts = [];
		const images = [];
		for (const part of msg.content) {
		if (typeof part === 'string') { textParts.push(part); continue; }
		if (!part \|\| typeof part !== 'object') continue;
		if (part.type === 'text') { textParts.push(part.text \|\| ''); continue; }
		if (part.type === 'image_url') {
		const url = part.image_url?.url \|\| '';
		const b64 = dataUriToBase64(url);
		if (b64) images.push(b64);
		else if (url) textParts.push(`[image: ${url} — Ollama native API accepts base64 only]`);
		continue;
		}
		if (typeof part.text === 'string') textParts.push(part.text);
		}
		const out = { ...msg, content: textParts.join('') };
		if (images.length) out.images = images;
		return out;
		});
		}

		// SSE formatting helper
		@@ -246,6 +288,8 @@ function formatSSE(event, data) {

		// Build Ollama native request
		// Build Ollama native request. toOllamaNativeMessages converts any OpenAI
		// image_url content parts into the native message.images base64 array so
		// attached images survive to a multimodal model (llava, llama3.2-vision, …).
		const ollamaBody = {
		model: openaiBody.model,
		messages: openaiBody.messages,
		messages: toOllamaNativeMessages(openaiBody.messages),
		stream: openaiBody.stream \|\| false,
		@@ -252,0 +296,0 @@ think: false, // Disable thinking — this is why we use native API

+38

-2

providers/openai.mjs

		@@ -244,2 +244,24 @@ // OpenAI provider for anymodel

		// Parse the parenthesized args of a Qwen paren-style call, e.g.
		// `url="https://x", count=3, dry=true` → { url:'https://x', count:3, dry:true }.
		// Quote-aware: a comma inside a quoted value does NOT split the pair.
		function parseParenArgs(argStr) {
		const input = {};
		if (typeof argStr !== 'string' \|\| !argStr.trim()) return input;
		const pair = /([a-zA-Z_][\w.\-])\s=\s("(?:[^"\\]\|\\.)"\|'(?:[^'\\]\|\\.)'\|[^,]+?)(?=\s,\|\s*$)/g;
		let m;
		while ((m = pair.exec(argStr)) !== null) {
		const key = m[1];
		const raw = m[2].trim();
		let val;
		if ((raw.startsWith('"') && raw.endsWith('"')) \|\| (raw.startsWith("'") && raw.endsWith("'"))) {
		val = raw.slice(1, -1).replace(/\\(["'\\])/g, '$1');
		} else {
		try { val = JSON.parse(raw); } catch { val = raw; }
		}
		input[key] = val;
		}
		return input;
		}

		// Parse a text blob for tool-call syntax. Returns { calls: [{name,input}], cleanedText }.
		@@ -274,4 +296,18 @@ // Conservative: requires a strict whole-pattern match + valid structure so prose

		// 2) Qwen XML: <function=name><parameter=key>value</parameter>...</function>
		const qwenFn = /<function=([^>\s]+)\s>([\s\S]?)<\/function>/g;
		// 2) Qwen paren-style (qwen3-coder-next 80B hybrid):
		// <function=name(key="v", ...)> — optionally with a trailing </function> and
		// optionally wrapped in <tool_call>...</tool_call> (the wrapper's close tag may be
		// absent; the Hermes branch above leaves it because the body is not JSON). The
		// canonical branch below would mis-capture `name(key="v")` as the tool NAME, so
		// recover this form first and strip the whole span (wrapper included).
		const qwenParen = /(?:<tool_call>\s)?<function=\s([a-zA-Z_][\w.\-])\s$([\s\S]?)$\s>(?:\s<\/function>)?(?:\s<\/tool_call>)?/g;
		cleaned = cleaned.replace(qwenParen, (_match, name, argStr) => {
		calls.push({ name, input: parseParenArgs(argStr) });
		return '';
		});

		// 3) Qwen XML: <function=name><parameter=key>value</parameter>...</function>
		// Name excludes '(' so a paren-style call that slipped past branch 2 is never
		// mis-captured here.
		const qwenFn = /<function=([^>\s(]+)\s>([\s\S]?)<\/function>/g;
		cleaned = cleaned.replace(qwenFn, (match, name, inner) => {
		@@ -278,0 +314,0 @@ const input = {};

+64

-4

providers/responses.mjs

		@@ -43,2 +43,50 @@ // OpenAI Responses API ↔ Chat Completions bridge for LOCAL providers.

		// Chat Completions allows only these `detail` values; Responses additionally
		// permits 'original', which a Chat endpoint would reject.
		const CHAT_DETAILS = new Set(['low', 'high', 'auto']);

		// Pull a URL string out of a Responses input_image block. Responses uses
		// image_url as a flat STRING (data URI or https URL); be lenient and also accept
		// the Chat-style {url} object in case a client mislabels it.
		function imageUrlFromBlock(c) {
		if (typeof c.image_url === 'string') return c.image_url;
		if (c.image_url && typeof c.image_url === 'object' && typeof c.image_url.url === 'string') return c.image_url.url;
		return null;
		}

		// Translate a Responses content array → Chat Completions message content.
		// Text-only → a plain STRING (byte-stable; keeps every existing text turn
		// identical). When any input_image is present → an ARRAY of Chat Completions
		// parts: {type:'text'} and {type:'image_url', image_url:{url[, detail]}}. The key
		// translation is type input_image (image_url STRING) → type image_url (image_url
		// OBJECT {url}). A file_id-only image can't be inlined into a local Chat request,
		// so it becomes a visible marker — never a silent drop.
		export function responsesContentToChatParts(content) {
		if (typeof content === 'string') return content;
		if (!Array.isArray(content)) return '';
		const parts = [];
		let hasImage = false;
		for (const c of content) {
		if (typeof c === 'string') { parts.push({ type: 'text', text: c }); continue; }
		if (!c \|\| typeof c !== 'object') continue;
		if (c.type === 'input_image' \|\| c.type === 'image_url') {
		const url = imageUrlFromBlock(c);
		if (url) {
		const image_url = { url };
		if (CHAT_DETAILS.has(c.detail)) image_url.detail = c.detail;
		parts.push({ type: 'image_url', image_url });
		hasImage = true;
		} else if (c.file_id) {
		parts.push({ type: 'text', text: `[image file ${c.file_id} omitted — local server can't fetch uploaded files]` });
		} else {
		parts.push({ type: 'text', text: '[image omitted]' });
		}
		continue;
		}
		if (typeof c.text === 'string') parts.push({ type: 'text', text: c.text });
		}
		if (!hasImage) return parts.map(p => (p.type === 'text' ? p.text : '')).filter(Boolean).join('\n');
		return parts;
		}

		// Translate a Responses request body → a Chat Completions request body.
		@@ -69,3 +117,6 @@ // Returns the chat body (always non-streaming; the proxy synthesizes the SSE).
		const role = item.role === 'developer' ? 'system' : (item.role \|\| 'user');
		messages.push({ role, content: contentToText(item.content) });
		// Only user turns carry attached images; preserve them as Chat image parts.
		// system/assistant content stays a flat string (Chat system content is text).
		const content = role === 'user' ? responsesContentToChatParts(item.content) : contentToText(item.content);
		messages.push({ role, content });
		} else if (type === 'function_call') {
		@@ -83,5 +134,14 @@ // Assistant turn that issued a tool call.
		} else if (type === 'function_call_output') {
		// Tool result feeding back in.
		const out = typeof item.output === 'string' ? item.output : contentToText(item.output);
		messages.push({ role: 'tool', tool_call_id: item.call_id \|\| item.id, content: out });
		// Tool result feeding back in. The Chat `tool` role is text-only, so when a
		// tool returns image(s) we send the text in the tool message and hoist the
		// images into a following user turn (mirrors openai.mjs:116-128).
		const parts = responsesContentToChatParts(item.output);
		if (typeof parts === 'string') {
		messages.push({ role: 'tool', tool_call_id: item.call_id \|\| item.id, content: parts });
		} else {
		const text = parts.filter(p => p.type === 'text').map(p => p.text).join('\n');
		messages.push({ role: 'tool', tool_call_id: item.call_id \|\| item.id, content: text });
		const imgs = parts.filter(p => p.type === 'image_url');
		if (imgs.length) messages.push({ role: 'user', content: imgs });
		}
		}
		@@ -88,0 +148,0 @@ // Unknown item types (reasoning, etc.) are skipped — local models can't use them.

+15

-0

providers/tool-compressor.mjs

		@@ -12,2 +12,5 @@ // Smart tool optimization for local models (Ollama).
		const IMPORTANT = new Set(['Agent', 'TodoWrite', 'WebFetch', 'WebSearch', 'Skill', 'ToolSearch', 'NotebookEdit']);
		// Tools that must never be dropped by the budget loop (increment 0010): the re-injected
		// local skill catalog references Skill, and ToolSearch loads deferred tool schemas.
		const NEVER_EVICT = new Set(['Skill', 'ToolSearch']);

		@@ -137,3 +140,15 @@ // Fields worth keeping in JSON Schema for tool calling.

		// Never-evict guard (increment 0010): Skill + ToolSearch must always survive budget
		// pressure — the re-injected local skill catalog points the model at the Skill tool,
		// and ToolSearch loads deferred tool schemas. Force-include them first so a tiny
		// budget can't drop them (they are IMPORTANT-tier and would otherwise break the loop).
		for (const tool of sorted) {
		if (NEVER_EVICT.has(tool.name)) {
		selected.push(tool);
		usedTokens += estimateTokens(tool);
		}
		}

		for (const tool of sorted) {
		if (NEVER_EVICT.has(tool.name)) continue; // already force-included
		if (selected.length >= cap) break;
		@@ -140,0 +155,0 @@ const tokens = estimateTokens(tool);

cli.js

Sorry, the diff of this file is too big to display

proxy.mjs

Sorry, the diff of this file is too big to display

		@@ -221,3 +221,31 @@ # Running Claude Code locally through AnyModel → LMStudio
		\| `ANYMODEL_FULL_MCP` \| `0` \| Set to `1` to keep global MCP servers on local \|
		\| `LOCAL_FIDELITY` \| `balanced` \| Skill-fidelity tier — see below \|
		\| `LOCAL_SKILL_INDEX` \| `auto` \| `on`/`off`/`auto` — gate the skill index independently \|
		\| `LOCAL_MAX_SYSTEM_PCT` \| `0.08` \| Fraction of context budgeted for the curated system block \|
		\| `LOCAL_SKILL_DESC_CHARS` \| `140` \| Max chars per skill description in the index \|

		## Skill auto-trigger on local models (`--local-fidelity`)

		To stay fast, the proxy condenses Claude Code's 50-100 KB system prompt and strips the
		`<system-reminder>` that carries the skill catalog. Without that catalog a local
		model can't auto-trigger skills (it still has the `Skill` tool, but doesn't know which
		skills exist). The fidelity layer fixes this by re-injecting a compact, name-sorted skill
		index + a curated behavioral core into the system prefix — deterministic, so it's a
		one-time cost that the prefix cache reuses from turn 2 onward.

		```bash
		anymodel proxy lmstudio --local-fidelity balanced # default
		```

		\| Tier \| What it re-injects \| Cold turn-1 TTFT \| When to use \|
		\|---\|---\|---\|---\|
		\| `lean` \| nothing (current pre-0010 behavior) \| no change \| latency purists; you don't use skills locally \|
		\| `balanced` (default) \| curated behavioral core (~700 tok) + clipped skill index (≤~1000 tok, `whenToUse` dropped) \| +~0.7-1.3 s, then ~0 ms (KV reuse) \| the daily driver — skills auto-trigger \|
		\| `full` \| richer index (keeps `whenToUse`, higher clamp) + fuller rules \| +~1.7-3.3 s \| 131 K ctx or the 80B model \|

		Measured on M4 Max / qwen3-coder-30b MLX: `lean` triggers skills 0/12 of the time
		(catalog stripped); `balanced` triggers 9/12 (75%) with valid skill names. Run the
		harness yourself: `node test/skill-trigger-eval.mjs` (needs a running proxy). MCP
		suppression is unaffected by this flag — use `--full-mcp` for that.

		## The full three-command reference
		@@ -224,0 +252,0 @@

anymodel - npm Package Compare versions

New alerts

Fixed alerts

Improved metrics

Worsened metrics