kosha-discovery — कोश
AI Model & Provider Discovery Registry
Kosha (कोश, "treasury") discovers AI models across providers, resolves credentials, enriches them with pricing, and exposes the catalog through a library, CLI, HTTP API, and a built-in OpenAI-compatible proxy. It is one source of truth for model identity, pricing, and routing, so your app doesn't break when providers ship new SKUs or change rates.
Install
npm install @sriinnu/kosha-discovery
npm install -g @sriinnu/kosha-discovery
Quick start
Library
import { createKosha } from "@sriinnu/kosha-discovery";
const kosha = await createKosha();
const models = kosha.models();
const cheapest = kosha.cheapestModels({ role: "image" });
const sonnet = kosha.model("sonnet");
console.log(sonnet.pricing);
CLI
kosha discover
kosha list --provider anthropic
kosha model sonnet
kosha cheapest --role embeddings
kosha update
kosha serve --port 3000
After each discovery, a stable v1 manifest lands at ~/.kosha/registry.json — any tool that reads JSON can consume it:
jq '.models[] | select(.pricing.inputPerMillion < 0.1) | .modelId' ~/.kosha/registry.json
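The same filter can be applied from code. The sketch below assumes only the fields the jq query above touches (`models`, `modelId`, `pricing.inputPerMillion`); the `Manifest` and `ManifestModel` types and the `modelsCheaperThan` helper are hypothetical names for illustration, and the real manifest likely carries more fields.

```typescript
// Hypothetical shapes: only the fields used by the jq query are assumed here.
interface ManifestModel {
  modelId: string;
  pricing?: { inputPerMillion?: number };
}

interface Manifest {
  models: ManifestModel[];
}

// Return the ids of models whose input price per million tokens is below `limit`.
// Models without a known price are skipped rather than treated as free.
function modelsCheaperThan(manifest: Manifest, limit: number): string[] {
  return manifest.models
    .filter((m) => {
      const price = m.pricing?.inputPerMillion;
      return typeof price === "number" && price < limit;
    })
    .map((m) => m.modelId);
}
```

Pass it the parsed contents of ~/.kosha/registry.json. One difference from the jq one-liner is worth knowing: in jq a missing price evaluates to null, which sorts below every number and therefore passes the `< 0.1` test, whereas this sketch drops unpriced models.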
HTTP API
GET /api/models[?provider=…&role=…]
GET /api/models/:idOrAlias
GET /api/models/:idOrAlias/routes
GET /api/models/cheapest?role=…
GET /api/providers
GET /api/roles
POST /api/refresh
GET /health
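As a small illustration of the optional `provider` and `role` query parameters on `/api/models`, here is a sketch of a URL builder. The `modelsUrl` helper and `ModelQuery` type are hypothetical, and the base URL assumes a server started with `kosha serve --port 3000`.

```typescript
// Hypothetical helper: builds the URL for GET /api/models with optional filters.
interface ModelQuery {
  provider?: string;
  role?: string;
}

function modelsUrl(base: string, query: ModelQuery = {}): string {
  const params: string[] = [];
  if (query.provider) params.push(`provider=${encodeURIComponent(query.provider)}`);
  if (query.role) params.push(`role=${encodeURIComponent(query.role)}`);
  // Drop trailing slashes so the path is joined with exactly one "/".
  const path = `${base.replace(/\/+$/, "")}/api/models`;
  return params.length > 0 ? `${path}?${params.join("&")}` : path;
}
```

The result can be handed to `fetch`. The response schema is not described in this section, so treat the body as opaque JSON until you have checked the HTTP API docs.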
Proxy
Kosha runs as an OpenAI-compatible proxy. Point your SDK at http://localhost:3000/proxy/v1 and it resolves the model, picks the right provider, injects credentials, and forwards — streaming included.
kosha serve
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:3000/proxy/v1",
  apiKey: "not-used", // kosha injects the provider credentials itself
});

// Resolve an alias to a concrete model and provider.
const res = await client.chat.completions.create({
  model: "sonnet",
  messages: [{ role: "user", content: "hello" }],
});

// Let kosha pick the cheapest available model.
const cheap = await client.chat.completions.create({
  model: "kosha:cheapest",
  messages: [{ role: "user", content: "hello" }],
});

// Cheapest model tagged tool_use with at least a 128k context window.
const routed = await client.chat.completions.create({
  model: "kosha:cheapest[tool_use,128k]",
  messages: [{ role: "user", content: "hello" }],
});
kosha:cheapest filter syntax (comma-separated, combinable):
| capability | tool_use, vision | model must have this tag |
| <N>k | 128k, 200k | minimum context window |
| provider:<id> | provider:groq | pin to a specific provider |
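To make the filter grammar concrete, here is a minimal parser sketch for these hints. It is not kosha's implementation: `parseCheapestHint` and `CheapestHint` are hypothetical names, and the sketch assumes that `k` means 1,000 tokens (the proxy may use 1,024) and that any filter not matching the other two forms is a capability tag.

```typescript
// Parsed form of a "kosha:cheapest[...]" hint. All names are illustrative.
interface CheapestHint {
  capabilities: string[];
  minContext?: number; // tokens
  provider?: string;
}

// Returns null when the model string is not a kosha:cheapest hint.
function parseCheapestHint(model: string): CheapestHint | null {
  const match = /^kosha:cheapest(?:\[([^\]]*)\])?$/.exec(model);
  if (!match) return null;

  const hint: CheapestHint = { capabilities: [] };
  const filters = (match[1] ?? "")
    .split(",")
    .map((f) => f.trim())
    .filter((f) => f.length > 0);

  for (const filter of filters) {
    const ctx = /^(\d+)k$/.exec(filter);
    if (ctx) {
      hint.minContext = Number(ctx[1]) * 1000; // assumption: "k" = 1,000 tokens
    } else if (filter.startsWith("provider:")) {
      hint.provider = filter.slice("provider:".length);
    } else {
      hint.capabilities.push(filter); // anything else is treated as a capability tag
    }
  }
  return hint;
}
```

Under these assumptions, `kosha:cheapest[tool_use,128k]` parses to one capability (`tool_use`) and a minimum context of 128000 tokens, while a plain alias such as `sonnet` returns `null`.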
The response always includes x-kosha-model, x-kosha-provider, and x-kosha-requested headers so the caller knows exactly what ran.
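A caller that talks to the proxy with plain `fetch` can collect those three headers in one place. This is a sketch only: `readKoshaRouting`, `KoshaRouting`, and `HeaderReader` are hypothetical names, and `HeaderReader` is a structural stand-in for anything with a `get` method, such as the `headers` object of a fetch `Response`.

```typescript
// Minimal structural type: satisfied by the Headers object of a fetch Response.
interface HeaderReader {
  get(name: string): string | null;
}

// What the proxy reports about the routing decision (null if a header is absent).
interface KoshaRouting {
  model: string | null;
  provider: string | null;
  requested: string | null;
}

function readKoshaRouting(headers: HeaderReader): KoshaRouting {
  return {
    model: headers.get("x-kosha-model"),
    provider: headers.get("x-kosha-provider"),
    requested: headers.get("x-kosha-requested"),
  };
}
```

Logging `requested` next to `model` is a cheap way to audit what a `kosha:cheapest` hint resolved to.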
Supported transports: openai, openai-compatible-http, ollama. Anthropic, Google, Bedrock, and Vertex require wire-format translation — not yet proxied.
Supported providers
| Anthropic | /v1/models | ANTHROPIC_API_KEY, Claude CLI, Codex CLI |
| OpenAI | /v1/models | OPENAI_API_KEY, GitHub Copilot tokens |
| Google | /v1beta/models | GOOGLE_API_KEY, GEMINI_API_KEY, Gemini CLI, gcloud |
| AWS Bedrock | SDK → CLI → static | AWS_ACCESS_KEY_ID, ~/.aws/credentials, SSO, IAM |
| Vertex AI | API + gcloud | GOOGLE_APPLICATION_CREDENTIALS, ADC |
| Ollama | local API | — (local) |
| OpenRouter | API | OPENROUTER_API_KEY (optional) |
| Vercel AI Gateway | /v1/models | AI_GATEWAY_API_KEY, VERCEL_OIDC_TOKEN (discovery is public; a credential is required for execution) |
| NVIDIA / Together / Fireworks / Groq / Cerebras / Cohere / DeepInfra / Perplexity | API | provider key env var |
| DeepSeek / Mistral / Moonshot (Kimi) / GLM (Zhipu) / Z.AI / MiniMax | API | provider key env var |
Full credential setup: docs/credentials.md.
Architecture
The discovery layer talks to provider APIs and local catalogs. The enrichment layer fills in pricing and context windows from the LiteLLM catalog and models.dev. The resilience layer (circuit breaker, stale-cache fallback, and health tracker) turns a flaky provider into a degraded read rather than a crash. The manifest layer writes a v1-stable JSON snapshot so that downstream consumers (tokmeter, chitragupta, ayuh) read prices from one source instead of inventing their own. The proxy layer exposes an OpenAI-compatible endpoint that resolves kosha:cheapest[…] hints at request time, injects credentials, and forwards to the winning provider.
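The resilience idea (circuit breaker plus stale-cache fallback) can be sketched generically as follows. This is the textbook pattern, not kosha's actual module: `ResilientSource`, its thresholds, and the injectable clock are all assumptions made for illustration.

```typescript
type Fetcher<T> = () => Promise<T>;

interface ReadResult<T> {
  value: T;
  stale: boolean; // true when served from the last good response
}

// Generic sketch: after `maxFailures` consecutive errors the breaker opens and
// the fetcher is not called again until `cooldownMs` has elapsed.
class ResilientSource<T> {
  private failures = 0;
  private openedAt: number | null = null;
  private cached: T | undefined;
  private hasCache = false;
  private fetcher: Fetcher<T>;
  private maxFailures: number;
  private cooldownMs: number;
  private now: () => number;

  constructor(fetcher: Fetcher<T>, maxFailures = 3, cooldownMs = 30000, now: () => number = () => Date.now()) {
    this.fetcher = fetcher;
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  async read(): Promise<ReadResult<T>> {
    const open = this.openedAt !== null && this.now() - this.openedAt < this.cooldownMs;
    if (!open) {
      try {
        const value = await this.fetcher();
        // Success closes the breaker and refreshes the cache.
        this.failures = 0;
        this.openedAt = null;
        this.cached = value;
        this.hasCache = true;
        return { value, stale: false };
      } catch {
        this.failures += 1;
        if (this.failures >= this.maxFailures) this.openedAt = this.now();
      }
    }
    // Degraded read: serve the last good value if there is one.
    if (this.hasCache) return { value: this.cached as T, stale: true };
    throw new Error("provider unavailable and no cached data");
  }
}
```

The `stale` flag is the design choice that matters: callers still get a catalog when a provider is down, but they can tell that the prices may be out of date.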
Docs
| Credentials | Env vars, CLI tools, and config files for every provider |
| CLI | Commands, flags, examples |
| HTTP API | Endpoints, parameters, response schemas |
| Configuration | Aliases, routing, enrichment, programmatic config |
| Architecture | Discovery flow, module map, adding providers |
| Resilience | Circuit breakers, stale cache, health |
| Security | Threat catalogue, runtime scanning, pre-commit hook |
| Discovery Plane v1 | Stable daemon contract (deltas, SSE watch, binding hints) |
Release
Tag-driven via GitHub Actions:
git tag -s vX.Y.Z -m "vX.Y.Z" && git push origin vX.Y.Z
The workflow checks that the tag matches the version in package.json, then builds, tests, publishes to npm, and creates the GitHub Release. It requires the NPM_TOKEN secret.
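The tag check can be pictured as a one-line comparison. The sketch below is an assumption about what such a check looks like, not the workflow's actual script; `tagMatchesVersion` is a hypothetical name, and the `vX.Y.Z` shape is taken from the tag command above.

```typescript
// Hypothetical check: the tag must be "v" followed by the exact package.json version,
// and must have the plain vX.Y.Z shape (no pre-release suffixes in this sketch).
function tagMatchesVersion(tag: string, packageVersion: string): boolean {
  return /^v\d+\.\d+\.\d+$/.test(tag) && tag === `v${packageVersion}`;
}
```

Running the same comparison locally before pushing a tag avoids a failed release run.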
Credits
litellm (pricing data) · openrouter · ollama · chitragupta (registry patterns) · takumi (routing needs that drove kosha's creation).
License
MIT