
@cantoo/capacitor-onnx
Capacitor plugin for ONNX Runtime inference on Android, iOS and Web.
2.0.0 removes the plugin-side model cache. The plugin no longer downloads, validates, or stores model files — it is now a thin wrapper around ONNX Runtime sessions. The host app owns model storage and provides bytes (web) or a filesystem path (native).
- LoadModelInput no longer accepts url, sha256, forceRedownload, or timeoutMs. Pass either filePath (iOS/Android) or modelBuffer: Uint8Array (web).
- LoadModelResult no longer includes status (cache_hit / downloaded).
- clearModel and clearAllCache have been removed. release(modelId, version) still releases the in-memory ORT session.
- CapacitorOnnxWeb.setWebConfig has been removed; configure wasmPath (and multithread) via sessionOptions.web on loadModel instead.
- NETWORK_ERROR, INTEGRITY_ERROR, and MODEL_INTEGRITY_ERROR are no longer reachable.

Before:
await CapacitorOnnx.loadModel({
  modelId: 'demo-model',
  version: '1.0.0',
  url: 'https://example.com/model.onnx',
  sha256: 'abc...',
});
After (native, iOS/Android):
// Download/cache the model in your app code, e.g. via @capacitor/filesystem.
// Then pass the absolute or file:// path to the plugin.
await CapacitorOnnx.loadModel({
  modelId: 'demo-model',
  version: '1.0.0',
  filePath: '/data/user/0/com.app/files/models/demo-model-1.0.0.onnx',
});
After (web):
const response = await fetch('https://example.com/model.onnx');
const modelBuffer = new Uint8Array(await response.arrayBuffer());
await CapacitorOnnx.loadModel({
  modelId: 'demo-model',
  version: '1.0.0',
  modelBuffer,
});
Passing modelBuffer on iOS/Android or filePath on web rejects with MODEL_INVALID — the Capacitor bridge serializes Uint8Array inefficiently (base64 / number array), so native callers must always use filesystem paths.
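The platform rule above can be enforced in app code before the plugin is ever called. The following is a minimal sketch, not part of the plugin: buildLoadInput, ModelSource, and LoadInputSketch are hypothetical names, and LoadInputSketch only mirrors the two LoadModelInput shapes shown in the examples above.

```typescript
// Hypothetical helper types; none of these names come from @cantoo/capacitor-onnx.
type Platform = 'web' | 'ios' | 'android';

// Where the host app keeps the model: bytes in memory (web) or a file on disk (native).
type ModelSource =
  | { kind: 'bytes'; modelBuffer: Uint8Array }
  | { kind: 'file'; filePath: string };

// Simplified local mirror of the two input shapes shown above (an assumption).
type LoadInputSketch =
  | { modelId: string; version: string; modelBuffer: Uint8Array }
  | { modelId: string; version: string; filePath: string };

function buildLoadInput(
  platform: Platform,
  modelId: string,
  version: string,
  source: ModelSource,
): LoadInputSketch {
  if (platform === 'web') {
    if (source.kind !== 'bytes') {
      // The plugin would reject this combination with MODEL_INVALID; fail earlier.
      throw new Error('web requires modelBuffer, not filePath');
    }
    return { modelId, version, modelBuffer: source.modelBuffer };
  }
  if (source.kind !== 'file') {
    throw new Error(`${platform} requires filePath, not modelBuffer`);
  }
  return { modelId, version, filePath: source.filePath };
}
```

In an app, the platform argument would typically come from Capacitor.getPlatform(), and the returned object would be passed to CapacitorOnnx.loadModel.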
pnpm add @cantoo/capacitor-onnx
pnpm cap sync android
pnpm cap sync ios
pnpm cap sync android registers the plugin automatically; no manual MainActivity edits are required. The host app must satisfy:
- minSdk ≥ 24 (Android 7.0).
- compileSdk ≥ 34.
- JDK 17 or newer: the build targets Java 17 (sourceCompatibility / targetCompatibility / kotlinOptions.jvmTarget = '17'), so any newer JDK (e.g. 21) also works; 17 is just the floor.

The com.microsoft.onnxruntime:onnxruntime-android dependency is bundled by the plugin's build.gradle; you do not need to add it yourself. Tune execution providers and threading through sessionOptions (see docs/android-optimization.md).
iOS supports both CocoaPods (default for Capacitor apps) and Swift Package Manager.
CocoaPods (recommended for Capacitor apps). pnpm cap sync ios registers the plugin automatically: the generated Podfile picks up CantooCapacitorOnnx.podspec from node_modules/@cantoo/capacitor-onnx, and pod install resolves onnxruntime-objc transitively. No manual Xcode steps are required.
Swift Package Manager (alternative). If the host app prefers SPM, skip the Podfile entry and add the plugin as a local package in Xcode (Package Dependencies → +, pointing to node_modules/@cantoo/capacitor-onnx). Xcode resolves onnxruntime-swift-package-manager transitively. Add the CapacitorOnnx product to the App target.
Requirements either way:
CapacitorOnnxPlugin.m; no additional Swift code is required.

onnxruntime-web requires the page to be served as a cross-origin isolated context. Without it, the multi-threaded WASM backend falls back (or fails) and SharedArrayBuffer is unavailable. The host page must be served with the following response headers:
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
Plus, any cross-origin asset the page loads (model files, WASM artifacts, fonts, images) needs Cross-Origin-Resource-Policy: cross-origin (or same-site) on its response, otherwise it will be blocked under COEP. CDN/Storage hosting your .onnx artifacts must also send permissive CORS headers (Access-Control-Allow-Origin).
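A deployment check can catch a missing header before users hit it. This is a rough sketch under the assumption that you can get the page's response headers as a plain object (for example in a smoke test); missingIsolationHeaders is a hypothetical helper, and it checks only the two values listed above.

```typescript
// The two response headers the host page must send, as listed above.
const REQUIRED_ISOLATION_HEADERS: Record<string, string> = {
  'cross-origin-opener-policy': 'same-origin',
  'cross-origin-embedder-policy': 'require-corp',
};

// Returns the names of required headers that are absent or carry another value.
// Header names are compared case-insensitively.
function missingIsolationHeaders(headers: Record<string, string>): string[] {
  const normalized = new Map<string, string>();
  for (const [name, value] of Object.entries(headers)) {
    normalized.set(name.toLowerCase(), value.trim().toLowerCase());
  }
  return Object.entries(REQUIRED_ISOLATION_HEADERS)
    .filter(([name, expected]) => normalized.get(name) !== expected)
    .map(([name]) => name);
}
```

An empty result means both headers are present with the expected values; it says nothing about Cross-Origin-Resource-Policy or CORS on the assets themselves.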
The WASM path (and other web-only runtime tuning) is configured per loadModel call via sessionOptions.web:
await CapacitorOnnx.loadModel({
  modelId: 'demo-model',
  version: '1.0.0',
  modelBuffer,
  sessionOptions: {
    web: {
      wasmPath: '/ort-wasm/', // base path/URL for the ORT .wasm artifacts
      multithread: false, // optional: single-thread + SIMD instead of auto-threading
    },
  },
});
Symptoms of missing isolation/CORS: SharedArrayBuffer is not defined, NetworkError when fetching .wasm, or models silently downgrading to single-threaded execution.
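Rather than relying on a silent downgrade, the app can ask for single-threaded execution itself when the page is not isolated. A minimal sketch, assuming only the wasmPath and multithread options shown above; webSessionOptions, isCrossOriginIsolated, and WebSessionOptionsSketch are hypothetical names.

```typescript
// Local mirror of the sessionOptions.web shape used above (an assumption).
interface WebSessionOptionsSketch {
  web: { wasmPath: string; multithread?: boolean };
}

// Reads the standard crossOriginIsolated global; false where it is not defined.
function isCrossOriginIsolated(): boolean {
  const scope = globalThis as unknown as { crossOriginIsolated?: boolean };
  return scope.crossOriginIsolated === true;
}

function webSessionOptions(wasmPath: string, isolated: boolean): WebSessionOptionsSketch {
  // Without isolation SharedArrayBuffer is unavailable, so request the
  // single-threaded path explicitly; otherwise keep the plugin's default.
  return isolated ? { web: { wasmPath } } : { web: { wasmPath, multithread: false } };
}
```

A call such as webSessionOptions('/ort-wasm/', isCrossOriginIsolated()) yields an object that can be passed as sessionOptions to loadModel.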
The package exports:
- CapacitorOnnx
- CapacitorOnnxWeb (from @cantoo/capacitor-onnx/web for non-Capacitor hosts)
- definitions

| Method | Signature | Purpose | Notes |
|---|---|---|---|
| loadModel | (input: LoadModelInput) => Promise<LoadModelResult> | Creates an ONNX Runtime session from the model bytes (web) or file path (native), and optionally warms it up. Must be called once per modelId+version before run. | Native: pass filePath (absolute path or file:// URI). Web: pass modelBuffer: Uint8Array. Pass warmupInputs (a Record<string, RawTensor> keyed by model input name) to pay first-inference cost upfront, and sessionOptions to pick the execution provider / thread counts. The result includes executionProviderUsed. |
| run | (input: RunInput) => Promise<RunResult> | Runs inference on a previously loaded session. | Pass inputs as a Record<string, RawTensor> keyed by the model's ONNX input names. Calls to the same modelId+version are serialized by a per-session lock; different models run in parallel. Returns { outputs, latencyMs }, where outputs is keyed by the model's output names. Pre/post-processing is the consumer's responsibility. |
| release | (input: ReleaseModelInput) => Promise<void> | Releases the in-memory ONNX session for the given modelId+version. | Use to free RAM/GPU memory when you are done with a model. The host app is responsible for managing model files on disk. |
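Because release is what frees the session's memory, it can help to tie it to a try/finally so it runs even when inference throws. The sketch below is one possible pattern, not something the plugin ships: withLoadedModel, ModelKey, and Releasable are hypothetical names, and the plugin is passed in structurally so the helper only depends on release.

```typescript
interface ModelKey {
  modelId: string;
  version: string;
}

// Minimal structural view of the plugin: this sketch only needs release.
interface Releasable {
  release(input: ModelKey): Promise<void>;
}

// Loads a model, runs the caller's work, and always releases the session afterwards.
// If load itself fails there is no session yet, so release is not called.
async function withLoadedModel<T>(
  plugin: Releasable,
  key: ModelKey,
  load: () => Promise<unknown>,
  use: () => Promise<T>,
): Promise<T> {
  await load();
  try {
    return await use();
  } finally {
    await plugin.release(key);
  }
}
```

With the real plugin, load would wrap CapacitorOnnx.loadModel and use would wrap one or more CapacitorOnnx.run calls for the same modelId and version.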
Type definitions for every input/result (e.g. LoadModelInput, RawTensor, SessionOptionsInput, PluginError) live in src/definitions.ts.
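For error handling, a small type guard keeps retry logic away from untyped catch values. This is a sketch under assumptions: the field names come from this README (code, message, retryable, correlationId, details), but their exact types and the way the rejection is delivered are not confirmed here, so check src/definitions.ts before relying on it. PluginErrorSketch, isPluginError, and shouldRetry are hypothetical names.

```typescript
// Assumed shape; the authoritative PluginError type lives in src/definitions.ts.
interface PluginErrorSketch {
  code: string;
  message: string;
  retryable: boolean;
  correlationId?: string;
  details?: unknown;
}

// Narrows an unknown catch value to the assumed structured error shape.
function isPluginError(value: unknown): value is PluginErrorSketch {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.code === 'string' &&
    typeof candidate.message === 'string' &&
    typeof candidate.retryable === 'boolean'
  );
}

// Retry only when the error says so; anything unrecognized is treated as final.
function shouldRetry(error: unknown): boolean {
  return isPluginError(error) && error.retryable;
}
```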
import { Capacitor } from '@capacitor/core';
import { CapacitorOnnx } from '@cantoo/capacitor-onnx';

async function loadDemoModel() {
  if (Capacitor.getPlatform() === 'web') {
    const response = await fetch('https://example.com/model.onnx');
    const modelBuffer = new Uint8Array(await response.arrayBuffer());
    await CapacitorOnnx.loadModel({
      modelId: 'demo-model',
      version: '1.0.0',
      modelBuffer,
    });
    return;
  }
  // On iOS/Android, the host app is responsible for downloading
  // the model to the filesystem (e.g. via @capacitor/filesystem).
  await CapacitorOnnx.loadModel({
    modelId: 'demo-model',
    version: '1.0.0',
    filePath: '/absolute/path/to/model.onnx',
  });
}

await loadDemoModel();

const { outputs } = await CapacitorOnnx.run({
  modelId: 'demo-model',
  version: '1.0.0',
  inputs: {
    input_values: {
      type: 'float32',
      dims: [1, 16000],
      data: [/* normalized audio samples */],
    },
    attention_mask: {
      type: 'int64',
      dims: [1, 16000],
      data: [/* 1s for real samples, 0s for padding */],
    },
  },
});

const logits = outputs.logits;
console.log(logits.dims, logits.data.length);

await CapacitorOnnx.release({ modelId: 'demo-model', version: '1.0.0' });
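The two input tensors in the example above have to agree on length, which is easy to get wrong by hand. The following sketch builds both from one list of samples; padAudioInputs and RawTensorSketch are hypothetical names, the input names are taken from the example, and passing int64 values as plain numbers is an assumption based on the array literal shown there. It pads only; normalization stays the caller's job.

```typescript
// Local mirror of the tensor shape used in the example above (an assumption).
interface RawTensorSketch {
  type: 'float32' | 'int64';
  dims: number[];
  data: number[];
}

// Pads samples with zeros up to `length` and builds a matching attention mask:
// 1 for real samples, 0 for padding.
function padAudioInputs(
  samples: number[],
  length: number,
): { input_values: RawTensorSketch; attention_mask: RawTensorSketch } {
  if (samples.length > length) {
    throw new RangeError(`got ${samples.length} samples, expected at most ${length}`);
  }
  const padding = length - samples.length;
  const values = [...samples, ...new Array<number>(padding).fill(0)];
  const mask = [
    ...new Array<number>(samples.length).fill(1),
    ...new Array<number>(padding).fill(0),
  ];
  return {
    input_values: { type: 'float32', dims: [1, length], data: values },
    attention_mask: { type: 'int64', dims: [1, length], data: mask },
  };
}
```

The result can be passed as inputs to run for a model with these two input names, for example padAudioInputs(samples, 16000).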
- loadModel supports optional warmupInputs: Record<string, RawTensor> to pre-run the session with sample tensors keyed by model input name (e.g. { input_values: { type: 'float32', dims: [1, 16000], data: [...] } }). Warmup is skipped when warmupInputs is omitted.
- loadModel returns executionProviderUsed with the provider that was actually initialized.
- Web supports sessionOptions.executionProvider with auto, wasm, webgpu, webnn plus native aliases (cpu/nnapi/coreml mapped to wasm in Web).
- Web: in auto mode, provider resolution tries accelerated providers first (webgpu, webnn) and falls back to wasm.
- iOS: cpu → CPU, nnapi/coreml → CoreML, auto → CoreML with CPU fallback, web providers (wasm/webgpu/webnn) → CPU.
- run takes inputs keyed by ONNX input name and returns every model output in outputs keyed by ONNX output name. Android accepts float32, int64, int32, bool, uint8; iOS accepts the same set except bool (the ONNX Runtime Obj-C API exposes no bool tensor type). float16/uint32 are web-only. Unsupported types are rejected on native with a structured error.
- Each RunResult.outputs tensor carries the shape and dtype ORT materialized: Web reads ort.Tensor.dims/.type, Android reads OnnxTensor.info.shape/.type, iOS reads tensorTypeAndShapeInfo().shape/.elementType. No heuristic, no symbolic dims (-1) in the result.
- Errors are structured (code, message, retryable, correlationId, details).

.onnx vs reduced .ort (Android)

There are two ways to ship a model, and you pick per app:
.onnx (Android, iOS, Web)

The standard path used in all the examples above: the plugin bundles the full
onnxruntime-android AAR (and onnxruntime-objc/onnxruntime-web on the other
platforms), and you load a plain .onnx model. Works everywhere, no extra tooling
or setup, just loadModel. This is the default; if you do nothing, you get this.
.ort (Android only, opt-in)

For Android you can shrink the native runtime by compiling a libonnxruntime.so with
only your model's operators (~57–59% smaller on arm64) and loading a pre-optimized
.ort model instead of the .onnx. It is opt-in and Android-only; iOS/Web keep
the default path.
# in your app package (depends on @cantoo/capacitor-onnx)
pnpm exec cantoo-onnx-reduce # or: npx cantoo-onnx-reduce
The first build (the "generator") needs a toolchain (Python ≥3.10 + onnxruntime/onnx,
Android NDK, CMake/Ninja, JDK 21; bash — on Windows use WSL2) and a small amount of app-side
wiring (load the .ort, whose filename must end in .ort). Once the generator publishes the
op-config, .ort and AAR (via the onnx*UploadUrl keys), other devs/CI need none of that
toolchain — with onnxConfigUrl + onnxCacheUrl set, their build just downloads the
op-config and the prebuilt AAR (no Python, no NDK, no model download). It stays fully opt-out —
clearing onnxModel reverts to the full AAR + .onnx.
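On the app side, the main consequence is that Android loads a different file when the reduced runtime is enabled. A small sketch of that choice, assuming the app tracks the opt-in with its own flag; modelFileName and reducedAndroidRuntime are hypothetical names and not plugin options.

```typescript
type Platform = 'web' | 'ios' | 'android';

// Picks the model file for the current platform. Only Android with the reduced
// runtime enabled loads the .ort file (the filename must end in .ort);
// every other case keeps the default .onnx path.
function modelFileName(
  baseName: string,
  platform: Platform,
  reducedAndroidRuntime: boolean,
): string {
  const extension = platform === 'android' && reducedAndroidRuntime ? '.ort' : '.onnx';
  return baseName + extension;
}
```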
Full guide: docs/reduced-onnx.md.
MIT