
Generate an access token with the `read:packages` scope.
Add the token to your environment variables. For example, on macOS/Linux:

```shell
export NPM_TOKEN=<token_generated_in_previous_step>
```

Then create a `.npmrc` file in your project root with the following content:

```
@qvac:registry=https://registry.npmjs.org
//registry.npmjs.org/:_authToken=${NPM_TOKEN}
```
Alternatively, you can create the `.npmrc` file in your home directory for user-level (global) configuration.
```shell
npm whoami --registry=https://npm.pkg.github.com
```

It should return your GitHub username.
```shell
npm i @qvac/sdk
```
> [!TIP]
> If you can't get the package via Tether's private GitHub package registry, see the Build from source section.
> [!IMPORTANT]
> If you're on Linux, ensure that the Vulkan SDK is installed (e.g. `apt install vulkan-sdk` on Ubuntu).
```shell
npm i expo-file-system react-native-bare-kit
```
If you are on Android, bump your `minSdkVersion` to 29. You can do so by adding `ext { minSdkVersion = 29 }` to your `android/build.gradle` or by using `expo-build-properties`.
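If you prefer the `expo-build-properties` route, the equivalent configuration is a plugin entry in your app config. A sketch (double-check the plugin's documentation for your Expo SDK version):

```javascript
// app.config.js — sketch: raising minSdkVersion via the
// expo-build-properties config plugin instead of editing
// android/build.gradle by hand.
export default {
  expo: {
    plugins: [
      [
        "expo-build-properties",
        {
          android: {
            minSdkVersion: 29,
          },
        },
      ],
    ],
  },
};
```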
Add the QVAC Expo plugin to your app config (`app.json` or, as shown here, `app.config.js`):

```javascript
export default {
  expo: {
    plugins: ["@qvac/sdk/expo-plugin"],
  },
};
```
```shell
npx expo prebuild
npx expo run:ios --device
# or
npx expo run:android --device
```
> [!IMPORTANT]
> Due to limitations with `llama.cpp`, QVAC currently does not run on emulators. You must use a physical device.
You can run the self-contained example below:
```typescript
import { loadModel, completion, unloadModel } from "@qvac/sdk";

// Load model into memory
const modelId = await loadModel(
  "pear://afa79ee07c0a138bb9f11bfaee771fb1bdfca8c82d961cff0474e49827bd1de3/Llama-3.2-1B-Instruct-Q4_0.gguf",
  {
    modelType: "llm",
  },
);

// Use the loaded model
const response = completion({
  modelId,
  history: [{ role: "user", content: "What is the capital of France?" }],
  stream: false,
});
const text = await response.text;
console.log(text);

// Enable KV cache for faster subsequent completions
const cachedResponse = completion({
  modelId,
  history: [
    { role: "user", content: "What is the capital of France?" },
    { role: "assistant", content: "The capital of France is Paris." },
    { role: "user", content: "What about Germany?" },
  ],
  stream: true,
  kvCache: true, // enables caching for faster inference
});

// You can use the loaded model multiple times.
// Unload model to free up system resources
await unloadModel({ modelId });
process.exit();
```
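The `kvCache` option speeds up follow-up completions because the conversation history shares a prefix with the previous request, and the attention state for that prefix need not be recomputed. The SDK handles this internally; purely as an illustration of the idea (not the SDK's actual implementation), a sketch of how much of a prompt a prefix cache could reuse:

```typescript
// Illustration only: counts how many leading tokens two prompts share,
// i.e. the portion whose KV entries a cache could reuse instead of
// recomputing on the follow-up request.
function reusablePrefixLength(prevTokens: string[], nextTokens: string[]): number {
  let i = 0;
  while (i < prevTokens.length && i < nextTokens.length && prevTokens[i] === nextTokens[i]) {
    i++;
  }
  return i;
}

const firstTurn = ["What", "is", "the", "capital", "of", "France", "?"];
const secondTurn = ["What", "is", "the", "capital", "of", "Germany", "?"];
// The first five tokens match, so only the tail needs fresh computation.
console.log(reusablePrefixLength(firstTurn, secondTurn)); // 5
```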
For more on how to use the QVAC SDK, see the SDK section of the QVAC documentation.
Use the Bun package manager:

```shell
bun i
bun run build # or `watch` for hot reload
bun run build:pack
```

This outputs a tarball under `dist/qvac-sdk-{version}.tgz` that you can install in your project, e.g.:

```shell
npm i path/to/qvac-sdk-0.3.0.tgz
```
In the `./examples` subdirectory, you will find scripts demonstrating SDK usage. To try any of them:

```shell
# With Bare
bun run bare:example dist/examples/path/to/example.js

# With Node
node dist/examples/path/to/example.js

# With Bun, straight from source
bun run examples/path/to/example.ts
```
QVAC SDK includes built-in log propagation that streams model logs from the worker process to your client application in real time. This gives you visibility into what's happening inside your models during loading, inference, and other operations.
Simply pass a logger when loading your model:
```typescript
import { getLogger, loadModel, completion, unloadModel } from "@qvac/sdk";

// Create a logger for your application
const logger = getLogger("my-app", {
  level: "debug", // Set log level
  transports: [
    // Optional: add custom log handlers
    (level, namespace, message) => {
      console.log(`[${level.toUpperCase()}] ${namespace}: ${message}`);
    },
  ],
});

// Load model with logging - the RPC stream starts automatically
const modelId = await loadModel(
  "pear://afa79ee07c0a138bb9f11bfaee771fb1bdfca8c82d961cff0474e49827bd1de3/Llama-3.2-1B-Instruct-Q4_0.gguf",
  {
    modelType: "llm",
    logger, // Enable log streaming
  },
);

// Now you'll see model logs in your console as operations happen
for await (const token of completion({
  modelId,
  history: [{ role: "user", content: "Hello!" }],
}).tokenStream) {
  process.stdout.write(token);
}

await unloadModel({ modelId }); // Logging stream closes automatically
```
When logging is enabled, you'll see real-time logs from the underlying model libraries:
```
[DEBUG] llamacpp:llm: Loading model weights...
[INFO] llamacpp:llm: Model loaded successfully, vocab_size=32000
[DEBUG] llamacpp:llm: Starting inference...
[DEBUG] llamacpp:llm: Inference completed, tokens=12
```
This works for all model types (LLM, Whisper, NMT, Embeddings) and provides valuable insight into model performance and behavior.
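A custom transport is just a function receiving `(level, namespace, message)`, so anything with that shape can be plugged in. As a self-contained sketch (independent of the SDK), here is a transport that collects formatted lines, e.g. for later writing to a file or asserting in tests:

```typescript
// Standalone sketch of a log transport matching the
// (level, namespace, message) callback shape used above.
// It buffers formatted lines instead of printing them.
type Transport = (level: string, namespace: string, message: string) => void;

function makeCollector(): { transport: Transport; lines: string[] } {
  const lines: string[] = [];
  const transport: Transport = (level, namespace, message) => {
    lines.push(`[${level.toUpperCase()}] ${namespace}: ${message}`);
  };
  return { transport, lines };
}

const { transport, lines } = makeCollector();
transport("debug", "llamacpp:llm", "Starting inference...");
console.log(lines[0]); // [DEBUG] llamacpp:llm: Starting inference...
```

The collected `lines` array could then be flushed to disk or a remote sink of your choice.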
- llama.cpp with local files: `examples/llamacpp-filesystem.ts`
- llama.cpp with Hyperdrive: `examples/llamacpp-hyperdrive.ts`
- llama.cpp with HTTP: `examples/llamacpp-http.ts`
- llama.cpp with file logging: `examples/llamacpp-file-logging.ts`
- llama.cpp with tools/function calls: `examples/llamacpp-tools.ts`
- llama.cpp with tools and routing: `examples/llamacpp-tools-routing.ts`
- llama.cpp with multimodal inference: `examples/llamacpp-multimodal.ts`
- llama.cpp with KV cache: `examples/kv-cache-example.ts`
- whisper.cpp transcription: `examples/whispercpp-transcription.ts`
- `examples/microphone-record-transcription.ts`
- `examples/embed-hyperdrive.ts`
- `examples/rag-hyperdb.ts`
- `examples/rag-hyperdb-workspaces.ts` (demonstrates workspace isolation and chunking)
- `examples/rag-lancedb.ts`
- `examples/rag-chromadb.ts` (note: requires a running ChromaDB server)
- `examples/rag-sqlite.ts` (note: uses SQLite-Vector WASM)
- `examples/translation-opus.ts`
- `examples/translation-indic.ts`
- `examples/translation-llm.ts`
- `examples/text-to-speech.ts`

Note: To run the text-to-speech example, espeak data is required, which can be obtained from here.
- `examples/multi-model-demo.ts`
- `examples/delegated-inference/provider.ts`
- `examples/delegated-inference/consumer.ts`

Note: Set the `QVAC_HYPERSWARM_SEED` env var to ensure that the provider uses the same keypair (i.e. the public key doesn't change on every run).

Note: ⚠️ The consumer does not handle reconnection yet.
- `examples/download-with-cancel.ts`
- `examples/llamacpp-sharded.ts`
- `examples/llamacpp-filesystem.ts`

Note: Sharded models automatically download all parts sequentially, with detailed progress tracking per shard. For local sharded models, pass the path to the first shard file (e.g. `model-00001-of-00005.gguf`); the remaining shards will be loaded automatically.
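The `model-00001-of-00005.gguf` naming encodes both the shard index and the total shard count, which is how all remaining shard paths can be derived from the first one. A hypothetical helper (not part of the SDK) illustrating the convention:

```typescript
// Hypothetical helper: derives every shard path from the first shard's
// filename, based on the NNNNN-of-MMMMM naming convention.
function listShards(firstShard: string): string[] {
  const match = firstShard.match(/^(.*)-(\d+)-of-(\d+)(\.gguf)$/);
  if (!match) throw new Error("not a sharded model path");
  const [, base, first, total, ext] = match;
  const width = first.length; // preserve zero-padding width
  const count = parseInt(total, 10);
  return Array.from(
    { length: count },
    (_, i) => `${base}-${String(i + 1).padStart(width, "0")}-of-${total}${ext}`,
  );
}

const shards = listShards("model-00001-of-00005.gguf");
console.log(shards[0]); // model-00001-of-00005.gguf
console.log(shards[4]); // model-00005-of-00005.gguf
```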
Blind relays help establish peer connections through NAT/firewalls by routing traffic through relay nodes.
- `examples/download-with-blind-relays.ts`

Call `setConfig({ swarmRelays: [...relayPublicKeys] })` before starting your provider/consumer.

Note: The example uses mock relay keys. In real deployments, you must use your own relay servers or trusted public relays.
FAQs

**One-off scripts don't terminate / hang**

Add a `process.exit()` call at the end; we're working on it.

**I'm building my Expo app on iOS but I don't see logs from QVAC**

Instead of launching via `npx expo start`, open the `ios` folder in Xcode and run the app from there.
**QVAC SDK** is the canonical entry point to develop AI applications with QVAC.
