@qvac/opencode-plugin

OpenCode plugin that runs a local, managed QVAC serve so `opencode` works against on-device models with no second terminal

Latest version: 0.1.0 (source: npm)

Run OpenCode against a local, on-device QVAC model with no second terminal and no manual server. Add the plugin to a project's opencode.json and opencode brings up a managed qvac serve by itself, points OpenCode at it, and tears it down on exit.

{
  "$schema": "https://opencode.ai/config.json",
  "plugin": ["@qvac/opencode-plugin"]
}

opencode          # interactive: uses qvac/qwen3.5-9b by default
opencode run "…"  # one-shot: works too (no startup race)

That's it: no provider block, no second terminal, no QVAC_MODEL= prefix.

How it works

  • On startup the plugin spawns a host child process in a real node/bun runtime. OpenCode runs plugins inside its own compiled binary, whose process.execPath points at that binary rather than at a JS runtime, so managed mode cannot spawn its detached supervisor from there. The host provides a real runtime and ensures the serve is reaped even if OpenCode is killed hard.
  • The host starts a small local proxy and immediately reports it is listening — before the model downloads. The plugin injects an OpenAI-compatible qvac provider pointed at the proxy and returns, so opencode run never trips OpenCode's startup timeout. The model loads in the background; the first turn waits on it (a slow cold turn, not a failure).
  • The host runs createQvac({ mode: 'managed' }) from @qvac/ai-sdk-provider, which brings up a shared, idle-reaped serve on an auto-allocated port.
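
The "listen first, load later" behaviour in the second bullet can be pictured as a readiness gate: requests are accepted immediately and held until the background load finishes. The following is a minimal sketch of that idea only. The names createReadyGate, whenReady, and markReady are hypothetical and are not part of @qvac/opencode-plugin or @qvac/ai-sdk-provider.

```typescript
// Hypothetical sketch: none of these names come from the plugin itself.
type Waiter = () => void;

interface ReadyGate {
  isReady(): boolean;
  whenReady(run: Waiter): void; // runs now if ready, otherwise queues
  markReady(): void;            // flushes queued waiters in arrival order
  pending(): number;            // number of waiters still held
}

function createReadyGate(): ReadyGate {
  let ready = false;
  const queue: Waiter[] = [];
  return {
    isReady: () => ready,
    whenReady(run) {
      if (ready) run();
      else queue.push(run);
    },
    markReady() {
      ready = true;
      // Drain with shift() so waiters added during the flush still run.
      while (queue.length > 0) queue.shift()!();
    },
    pending: () => queue.length,
  };
}

// Usage: the proxy accepts turns right away; the first one waits on the gate.
const gate = createReadyGate();
const served: string[] = [];
const handle = (id: string) => gate.whenReady(() => served.push(id));

handle("turn-1");    // arrives while the model is still loading
gate.markReady();    // background load finished
handle("turn-2");    // served straight away
console.log(served); // turn-1 first, then turn-2
```

In this picture a slow first turn is just a waiter sitting in the queue, which matches the "slow cold turn, not a failure" description above.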

Multiple OpenCode windows share one serve (the provider's reuse default): the detached runner owns the loaded model and reaps it a few minutes after the last session leaves, so a second window doesn't reload the model.

Model ids

You pick a friendly, models.dev-style id (qwen3.5-9b) and that exact id flows through the whole stack — OpenCode's model picker (qvac/qwen3.5-9b) and the request model field. The verbose QVAC constant (QWEN3_5_9B_MULTIMODAL_Q4_K_M) stays an internal detail of the serve; the friendly-id → constant mapping lives in @qvac/ai-sdk-provider's qvacCatalog, so every AI-SDK tool resolves the same ids.

| models.dev id | QVAC constant |
| --- | --- |
| qwen3.5-0.8b | QWEN3_5_0_8B_MULTIMODAL_Q4_K_M |
| qwen3.5-2b | QWEN3_5_2B_MULTIMODAL_Q4_K_M |
| qwen3.5-4b | QWEN3_5_4B_MULTIMODAL_Q4_K_M |
| qwen3.5-9b | QWEN3_5_9B_MULTIMODAL_Q4_K_M |

Passing a raw constant also works (it normalizes back to the friendly id for display).
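
As an illustration of the two-way lookup described above, here is a small sketch that accepts either form and returns both. The id and constant pairs are the ones listed in this README; the resolveModel function and the Map-based layout are assumptions made for the example, and the real qvacCatalog in @qvac/ai-sdk-provider may be shaped differently.

```typescript
// Illustrative only: not the actual qvacCatalog export.
const friendlyToConstant = new Map<string, string>([
  ["qwen3.5-0.8b", "QWEN3_5_0_8B_MULTIMODAL_Q4_K_M"],
  ["qwen3.5-2b", "QWEN3_5_2B_MULTIMODAL_Q4_K_M"],
  ["qwen3.5-4b", "QWEN3_5_4B_MULTIMODAL_Q4_K_M"],
  ["qwen3.5-9b", "QWEN3_5_9B_MULTIMODAL_Q4_K_M"],
]);

// Reverse index so a raw constant normalizes back to its friendly id.
const constantToFriendly = new Map<string, string>();
friendlyToConstant.forEach((constant, id) => constantToFriendly.set(constant, id));

interface ResolvedModel {
  id: string;       // friendly id, shown as qvac/<id> in the model picker
  constant: string; // verbose constant, internal to the serve
}

// Hypothetical helper: accepts a friendly id or a raw constant.
function resolveModel(input: string): ResolvedModel | undefined {
  const constant = friendlyToConstant.get(input);
  if (constant !== undefined) return { id: input, constant };
  const id = constantToFriendly.get(input);
  if (id !== undefined) return { id, constant: input };
  return undefined; // unknown model
}

console.log(resolveModel("qwen3.5-2b"));
console.log(resolveModel("QWEN3_5_9B_MULTIMODAL_Q4_K_M"));
```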

Options

Options can be set from any of these sources, listed from lowest to highest precedence: built-in defaults, a qvac.json in the project dir, the opencode.json plugin-tuple options, and QVAC_* environment variables.

| Option (qvac.json / plugin tuple) | Env | Default | Meaning |
| --- | --- | --- | --- |
| model | QVAC_MODEL | qwen3.5-9b | friendly id or a raw QVAC constant |
| ctxSize | QVAC_CTX_SIZE | 32768 | serve context window (an agent's prompt + tool schemas need ≥ 32768) |
| reasoningBudget | QVAC_REASONING_BUDGET | -1 | -1 = reasoning on, 0 = off |
| tools | QVAC_TOOLS | true | enable the tool-calling chat template |
| shim | QVAC_SHIM | true | apply the OpenAI-compat transforms (see below) |
| runtime | QVAC_RUNTIME | auto | path to the node/bun runtime that hosts the serve |
| readyTimeoutMs | QVAC_READY_TIMEOUT_MS | 1800000 | budget for the serve to become healthy, incl. a cold model download |
| setDefaultModel | QVAC_SET_DEFAULT_MODEL | true | force qvac/<model> as the project default + small model |
| debug | QVAC_DEBUG | false | mirror host milestones + per-request traces to stderr |

Via the plugin tuple in opencode.json:

{
  "$schema": "https://opencode.ai/config.json",
  "plugin": [["@qvac/opencode-plugin", { "model": "qwen3.5-2b" }]]
}

Or a qvac.json next to it:

{ "model": "qwen3.5-2b", "ctxSize": 32768 }
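
The precedence order can be sketched as a layered merge in which each later source overrides the earlier ones. This is a simplified illustration, not the plugin's actual code: resolveOptions and fromEnv are hypothetical names, and the boolean parsing rule ("0" or "false" means off) is an assumption extrapolated from the QVAC_SHIM=0 example in this README.

```typescript
interface QvacOptions {
  model: string;
  ctxSize: number;
  reasoningBudget: number;
  tools: boolean;
  shim: boolean;
  runtime: string;
  readyTimeoutMs: number;
  setDefaultModel: boolean;
  debug: boolean;
}

// Defaults as listed in the options table.
const DEFAULTS: QvacOptions = {
  model: "qwen3.5-9b", ctxSize: 32768, reasoningBudget: -1, tools: true, shim: true,
  runtime: "auto", readyTimeoutMs: 1800000, setDefaultModel: true, debug: false,
};

type Env = Record<string, string | undefined>;

// Only copy a value when it is actually set, so unset sources never clobber earlier layers.
function assign<K extends keyof QvacOptions>(
  out: Partial<QvacOptions>, key: K, value: QvacOptions[K] | undefined,
): void {
  if (value !== undefined) out[key] = value;
}

// Hypothetical env parser; the parsing rules here are assumptions.
function fromEnv(env: Env): Partial<QvacOptions> {
  const str = (name: string): string | undefined => {
    const raw = env[name];
    return raw === undefined || raw === "" ? undefined : raw;
  };
  const num = (name: string): number | undefined => {
    const raw = str(name);
    if (raw === undefined) return undefined;
    const parsed = Number(raw);
    return isNaN(parsed) ? undefined : parsed;
  };
  const bool = (name: string): boolean | undefined => {
    const raw = str(name);
    if (raw === undefined) return undefined;
    return !(raw === "0" || raw.toLowerCase() === "false");
  };
  const out: Partial<QvacOptions> = {};
  assign(out, "model", str("QVAC_MODEL"));
  assign(out, "ctxSize", num("QVAC_CTX_SIZE"));
  assign(out, "reasoningBudget", num("QVAC_REASONING_BUDGET"));
  assign(out, "tools", bool("QVAC_TOOLS"));
  assign(out, "shim", bool("QVAC_SHIM"));
  assign(out, "runtime", str("QVAC_RUNTIME"));
  assign(out, "readyTimeoutMs", num("QVAC_READY_TIMEOUT_MS"));
  assign(out, "setDefaultModel", bool("QVAC_SET_DEFAULT_MODEL"));
  assign(out, "debug", bool("QVAC_DEBUG"));
  return out;
}

// Lowest to highest precedence: defaults, qvac.json, plugin tuple, environment.
function resolveOptions(
  fileOptions: Partial<QvacOptions>, tupleOptions: Partial<QvacOptions>, env: Env,
): QvacOptions {
  return { ...DEFAULTS, ...fileOptions, ...tupleOptions, ...fromEnv(env) };
}

const resolved = resolveOptions(
  { model: "qwen3.5-2b", ctxSize: 65536 }, // qvac.json
  { model: "qwen3.5-4b" },                 // plugin tuple
  { QVAC_SHIM: "0" },                      // environment
);
console.log(resolved.model, resolved.ctxSize, resolved.shim); // qwen3.5-4b 65536 false
```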

The shim option

@ai-sdk/openai-compatible (which OpenCode speaks) and QVAC serve disagree on two points today, so the host runs a small in-process proxy that bridges them:

  • array content — the AI SDK sends content as an array of typed parts; serve currently accepts only a string, so the proxy flattens text parts.
  • reasoning — with reasoning on, the model emits <think>…</think> inline on the content channel; the proxy re-routes that to reasoning_content so OpenCode shows a collapsed "Thought" block instead of raw tags.

Both are stopgaps for serve gaps. Set shim: false (or QVAC_SHIM=0) to turn the transforms off once serve closes those gaps; the proxy itself stays (it is what lets startup return before the model finishes loading).
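
To make the two transforms concrete, here is a rough sketch of each as a pure function over a complete (non-streaming) message. The function names, the part shape, and the choice to join text parts with no separator are assumptions for illustration. The real proxy also has to handle streamed chunks, which this sketch does not attempt.

```typescript
// Assumed shape of an OpenAI-style content part; only text parts are used here.
type ContentPart = { type: string; text?: string };
type Content = string | ContentPart[];

// Request side: collapse an array of typed parts into the plain string serve accepts.
function flattenContent(content: Content): string {
  if (typeof content === "string") return content;
  return content
    .filter((part) => part.type === "text" && typeof part.text === "string")
    .map((part) => part.text as string)
    .join(""); // separator is an assumption
}

// Response side: move inline <think>…</think> text into reasoning_content.
function splitReasoning(text: string): { content: string; reasoning_content: string } {
  const reasoning: string[] = [];
  const content = text.replace(
    /<think>([\s\S]*?)<\/think>/g,
    (_match: string, inner: string): string => {
      reasoning.push(inner.trim());
      return "";
    },
  );
  return { content: content.trim(), reasoning_content: reasoning.join("\n") };
}

console.log(flattenContent([
  { type: "text", text: "Hello, " },
  { type: "image_url" },
  { type: "text", text: "world" },
]));
console.log(splitReasoning("<think>plan the edit</think>Done."));
```

With shim set to false, transforms like these would be skipped and messages would pass through the proxy unchanged.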

Performance expectations

With the 9B model the agent's build prompt (~26k tokens with tool schemas) is re-prefilled each turn on a single local worker, so a tool-using turn is roughly 20–30s. A smaller model (qwen3.5-2b) is snappier but less capable for agentic work. Only one QVAC worker runs machine-wide; if the OpenCode desktop app is running it can hold locks the CLI needs — quit it (or isolate XDG_* dirs) when running opencode from the terminal.

Requirements

Keywords

qvac


Package last updated on 19 Jun 2026
