
Research
Shai-Hulud Descends to Hades: Miasma Worm Campaign Spreads with New PyPI Wave
Socket found 37 malicious PyPI wheels that abuse Python startup hooks to launch a Bun-powered credential stealer tied to Mini Shai-Hulud/Miasma.
@askalf/brio
Advanced tools
The capability layer for AI workloads — semantic cache, cost-aware tiering, structured cost reporting, policy enforcement. Sits in front of any Anthropic-compat endpoint (dario, api.anthropic.com, OpenRouter, vLLM, Ollama).
The capability layer for AI workloads. Cache, tier, cost-report, govern.
A local middleware that sits in front of any Anthropic-compatible endpoint — your own dario proxy, api.anthropic.com direct, OpenRouter, Groq, vLLM, Ollama, anything that speaks the Messages API. Adds semantic response caching, cost-aware model tiering, structured per-conversation cost reporting, and policy enforcement on top of whichever backend you're already running. Zero hosted dependencies, MIT, runs on your laptop.
Independent, unofficial, third-party — see DISCLAIMER.md.
You're paying Anthropic per token (or via subscription routed through dario) and watching half the spend go to questions you've already answered. A coding agent reads the same package.json thirty times in a session. A research agent re-fetches the same source on the second turn. A team of five engineers asks the same dependency-version question across the week. Every one of those costs you the full prompt tokens on every hit, even when the answer was identical the last time.
That's the wedge. brio caches the prompt-response pair under a semantic key and serves the cached answer when the same shape recurs, saving tokens, latency, and rate-limit headroom. Cache miss → request flows through to your backend untouched. Cache hit → response comes back in single-digit milliseconds with the cached answer marked so the calling agent knows it's a replay.
That's v0.1. Around it: cost-aware model tiering (route easy prompts to Haiku, hard ones to Opus, by length + complexity heuristics), structured cost reports (per-conversation, per-user, per-day), and a policy layer (model allowlists, cost caps, PII redaction). Eventually team mode (multi-user auth + per-user quotas + audit log) for org adoption.
client (Cursor, Aider, Continue, client (Claude Code,
custom code, etc.) OpenClaw, Hermes, etc.)
│ │
└─────► http://localhost:8765 ◄──────────────┘
│
brio ─── cache (semantic key)
│ ─── cost report
│ ─── tier (haiku / sonnet / opus)
│ ─── policy (allowlists, caps, DLP)
│
▼
┌─────────────────────────────────────┐
│ ANY Anthropic-compatible endpoint │
│ │
│ - http://localhost:3456 (dario) │ ← Claude Max via OAuth
│ - https://api.anthropic.com │ ← per-token API
│ - https://openrouter.ai/v1 │ ← OpenRouter
│ - http://localhost:11434 │ ← Ollama, etc.
└─────────────────────────────────────┘
brio doesn't replace dario. dario solves "speak Anthropic's wire shape exactly so my Claude Max subscription works outside Claude Code." brio solves "make every backend smarter about cost, latency, and policy." Composing them: clients hit brio, brio caches what it can, the rest flows to dario, dario routes to your subscription. Either layer can run alone; neither requires the other.
# 1. Install.
npm install -g @askalf/brio
# 2. Point brio at whichever backend you want to wrap. Default is dario at :3456.
brio start # wraps dario on localhost:3456
brio start --upstream=https://api.anthropic.com --api-key=$ANTHROPIC_API_KEY
brio start --upstream=https://openrouter.ai/v1 --api-key=$OPENROUTER_API_KEY
# 3. Point your client at brio instead of the backend directly.
ANTHROPIC_BASE_URL=http://localhost:8765
ANTHROPIC_API_KEY=brio # any value when running through dario
# 4. Use whatever client you already use. Everything routes through brio.
claude # Claude Code
cursor # Cursor
aider --model=claude-opus-4-7 # Aider
Run brio cost after a session. You'll see the cache hit rate, the dollar value of replay traffic, and the per-conversation breakdown.
{model, system_prompt, messages, tools}. TTL configurable (default 1 hour). Hits return in single-digit ms with x-brio-cache: hit header. Disk-backed at ~/.brio/cache/<sha>.json. Verify with brio cache stats.{timestamp, model, inputTokens, outputTokens, cacheHit, latencyMs, conversationId}. brio cost summarizes per-day, per-conversation, per-model. brio cost --json for piping into your own dashboards./v1/files or other side-channels — all forwarded byte-for-byte to the upstream. brio is additive; it shouldn't change what works.brio doctor — health check across upstream reachability, cache directory writability, and a smoke probe to verify the upstream is what you said it was.--tier=auto routes prompts under N tokens to Haiku, complex ones to Opus, with explainable decisions surfaced via brio explain <request-id>.brio policy validate lints the file.| Flag | Default | Why |
|---|---|---|
--upstream <url> | http://localhost:3456 | Where requests go on cache miss. Anthropic-compat endpoint. |
--port <n> | 8765 | brio's listen port. |
--api-key <k> | — | API key brio sends to upstream when upstream isn't dario. |
--cache-ttl <ms> | 3600000 (1h) | TTL on cache entries. 0 disables caching. |
--cache-dir <path> | ~/.brio/cache | Where cache files live. |
--no-cache | off | Bypass cache for this run. |
--no-cost | off | Suppress per-request cost line on stderr. |
--verbose, -v | — | Stream cache hits / misses / forward decisions to stderr. |
--upstream-format <anthropic|openai> | auto | Wire format the upstream expects. Auto-detected from URL. |
Every flag mirrors a BRIO_* env var. CLI wins over env.
| Signal | Status |
|---|---|
| Runtime dependencies | Two — one HTTP framework, one schema validator. Pinned, audited. No hosted services, no telemetry. |
| Credentials | API keys live in env vars or CLI flags; brio never persists them. Cache files store request + response payloads only. |
| Network scope | Whatever upstream you point at, plus the cache TTL clock (no external time service). No other outbound traffic. Verify with lsof -i during a run. |
| Telemetry | None. Zero analytics, tracking, or data collection. Deliberately, not aspirationally. |
| License | MIT |
See DISCLAIMER.md for the full AS IS / no-affiliation / user-responsibility terms.
dario — wire-fidelity LLM router. brio's default upstream. Stable maintenance mode (drift watch only); brio is where active feature work lives.hands — computer-use agent. Routes through brio (or dario, or anything Anthropic-compat) like any other client.arnie — IT troubleshooting companion. Same — client of brio.deepdive — local research agent. Same — client of brio.askalf (the org) is the umbrella; the future commercial chat/agent product called askalf is something else and not what brio is.
PRs welcome. Code style matches dario — small TypeScript, pure decision functions, node --test assertions on anything with logic in it. Run npm run build && npm test before submitting.
MIT — see LICENSE and DISCLAIMER.md.
FAQs
The capability layer for AI workloads — semantic cache, cost-aware tiering, structured cost reporting, policy enforcement. Sits in front of any Anthropic-compat endpoint (dario, api.anthropic.com, OpenRouter, vLLM, Ollama).
We found that @askalf/brio demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Research
Socket found 37 malicious PyPI wheels that abuse Python startup hooks to launch a Bun-powered credential stealer tied to Mini Shai-Hulud/Miasma.

Security News
RubyGems and Bundler 4.0.13 introduced an opt-in cooldown feature that delays newly published gems during dependency resolution.

Security News
pnpm 11.5 now recognizes npm staged publish approvals in release metadata, preventing those releases from being mistaken for lower-trust package publishes.