@qvac/onnx

Bare addon providing ONNX Runtime session management for QVAC inference addons. Links ONNX Runtime (via vcpkg) and exposes both a C++ header-only library and a JavaScript API. On desktop platforms, @qvac/onnx.bare is the single shared library containing ORT — consumer addons dynamically link against it so ORT is loaded exactly once per process. On mobile (Android/iOS), consumer addons can statically link ORT via the bundled static libraries, or dynamically link (controlled by the MOBILE_DYNAMIC_LINK CMake option). This package has no dependency on qvac-lib-inference-addon-cpp.

Architecture

┌─────────────────────────────────────────────────────┐
│  JS API  (binding.js → addon/binding.cpp)           │
│  configureEnvironment · getAvailableProviders        │
│  createSession · run · getInputInfo · getOutputInfo  │
│  destroySession                                      │
└──────────────────────┬──────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────┐
│  C++ Headers  (header-only, namespace onnx_addon)   │
│                                                     │
│  OnnxSession.hpp          Concrete session           │
│  IOnnxSession.hpp         Abstract interface (no ORT)│
│  OnnxRuntime.hpp          Process-wide Ort::Env      │
│  OnnxSessionOptionsBuilder.hpp  EP / thread config   │
│  OnnxConfig.hpp           SessionConfig, enums       │
│  OnnxTensor.hpp           TensorInfo, I/O tensors    │
│  OnnxTypeConversions.hpp  ORT ↔ addon type mapping   │
│  Logger.hpp               Logging (stdout or JS)     │
│  AndroidLog.hpp           Android logcat logging     │
└──────────────────────┬──────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────┐
│  ONNX Runtime  (via vcpkg)                           │
│  Desktop: exported from qvac__onnx.bare (shared)    │
│  Mobile:  dynamic (default) or static linking       │
│  + XNNPack EP · CoreML · NNAPI · DirectML           │
└─────────────────────────────────────────────────────┘

Key design points:

  • Single ORT load — On desktop, @qvac/onnx.bare exports OrtGetApiBase and EP registration symbols. Consumer addons dynamically link against it (DT_NEEDED: qvac__onnx@0.bare), so ORT is loaded exactly once per process via ELF SONAME deduplication.
  • Single Ort::Env — OnnxRuntime is a Meyers singleton. All sessions across all consumer addons in the same process share one environment.
  • Header-only C++ — Consumer addons include <qvac-onnx/OnnxSession.hpp> and link qvac-onnx::headers (compile-time headers only). ORT symbols are resolved at runtime from the shared .bare.
  • Abstract interface — IOnnxSession lets consumers decouple pipeline code from ONNX Runtime headers.
  • Mobile linking — Controlled by MOBILE_DYNAMIC_LINK CMake option (default ON). When ON, mobile builds use the same dynamic linking as desktop. When OFF, consumer addons statically link via qvac-onnx::qvac-onnx-static (which transitively provides onnxruntime::onnxruntime_static).
  • Automatic fallback chain — Session construction retries on failure: requested config → without XNNPACK → CPU-only. This ensures models load even when an EP is unavailable.
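The fallback chain above can be sketched in JavaScript. This is only an illustration of the documented retry order (the real logic lives in C++); `tryCreate` is a hypothetical stand-in for the native session constructor:

```javascript
// Sketch of the automatic fallback chain: requested config → without
// XNNPACK → CPU-only. `tryCreate` is a hypothetical stand-in for the
// native session constructor and is expected to throw on failure.
function createWithFallback (tryCreate, config) {
  const attempts = [config]
  if (config.enableXnnpack) {
    attempts.push({ ...config, enableXnnpack: false })
  }
  if (config.provider !== 'cpu') {
    attempts.push({ ...config, provider: 'cpu', enableXnnpack: false })
  }
  let lastError
  for (const attempt of attempts) {
    try {
      return { session: tryCreate(attempt), config: attempt }
    } catch (err) {
      lastError = err // remember the failure and try the next config
    }
  }
  throw lastError
}
```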

JS API

configureEnvironment(config?)

Configures the process-wide ONNX Runtime environment. Must be called before the first createSession(). Calling after initialization throws.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| loggingLevel | string | "error" | "verbose", "info", "warning", "error", "fatal" |
| loggingId | string | "qvac-onnx" | Identifier used in ORT log messages |

getAvailableProviders() → string[]

Returns the list of execution providers compiled into this build (for example ["CPUExecutionProvider"] or ["CPUExecutionProvider", "XnnpackExecutionProvider"], depending on how the addon was built).
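The provider list can be used to tailor the session config to the current build. A sketch, using the provider strings from the example above:

```javascript
// Choose session options based on which execution providers this
// build exposes. A sketch only — provider strings follow the
// getAvailableProviders() examples above.
function configForProviders (providers) {
  return {
    provider: 'auto_gpu', // the documented default; the fallback chain handles missing EPs
    enableXnnpack: providers.includes('XnnpackExecutionProvider'),
    intraOpThreads: 0 // 0 = let ORT pick
  }
}
```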

createSession(modelPath, config?) → handle

Creates an ONNX Runtime inference session for the given model file.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| provider | string | "auto_gpu" | "cpu", "auto_gpu", "nnapi", "coreml", "directml" |
| optimization | string | "extended" | "disable", "basic", "extended", "all" |
| intraOpThreads | number | 0 | Intra-op parallelism thread count (0 = auto) |
| interOpThreads | number | 0 | Inter-op parallelism thread count (0 = auto) |
| enableXnnpack | boolean | false | Enable XNNPack execution provider for CPU |
| enableMemoryPattern | boolean | true | Enable memory pattern optimization |
| enableCpuMemArena | boolean | true | Enable CPU memory arena |
| executionMode | string | "sequential" | "sequential" or "parallel" |
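For instance, a config object built from the fields above (values illustrative, field names and defaults from the table):

```javascript
// Illustrative createSession config — field names and defaults come
// from the options table above; the values here are just one choice.
const config = {
  provider: 'cpu',          // skip GPU EPs for a deterministic CPU run
  optimization: 'extended', // the documented default
  intraOpThreads: 4,
  interOpThreads: 1,
  enableXnnpack: false,     // off by default; enable explicitly on supported builds
  executionMode: 'sequential'
}
```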

getInputInfo(handle) → Array<{name, shape, type}>

Returns input tensor metadata for the session.

getOutputInfo(handle) → Array<{name, shape, type}>

Returns output tensor metadata for the session.

run(handle, inputs) → Array<{name, shape, type, data}>

Runs inference. Each input: {name: string, shape: number[], type: string, data: TypedArray}.

Supported tensor types and corresponding TypedArrays:

  • float32 → Float32Array
  • float16 → Special handling (no native JS TypedArray)
  • int64 → BigInt64Array
  • int32 → Int32Array
  • int8 → Int8Array
  • uint8 → Uint8Array

Returns array of outputs: {name: string, shape: number[], type: string, data: TypedArray} where data contains the inference results as the appropriate TypedArray type.
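Because float16 has no native JS TypedArray, outputs of that type need manual decoding. A minimal sketch, assuming the raw half-precision bits are available as a Uint16Array:

```javascript
// Decode IEEE 754 half-precision values into a Float32Array.
// Assumes float16 output data is viewed as a Uint16Array of raw half bits.
function float16ToFloat32 (halfBits) {
  const out = new Float32Array(halfBits.length)
  for (let i = 0; i < halfBits.length; i++) {
    const h = halfBits[i]
    const sign = (h & 0x8000) ? -1 : 1
    const exp = (h >> 10) & 0x1f
    const frac = h & 0x3ff
    if (exp === 0) {
      out[i] = sign * Math.pow(2, -14) * (frac / 1024)  // subnormal (and zero)
    } else if (exp === 0x1f) {
      out[i] = frac ? NaN : sign * Infinity             // NaN / infinity
    } else {
      out[i] = sign * Math.pow(2, exp - 15) * (1 + frac / 1024)
    }
  }
  return out
}
```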

destroySession(handle)

Destroys the session and frees resources.
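Putting the calls together, a full session lifecycle looks like the following sketch. `onnx` stands for the loaded @qvac/onnx module and is passed in as a parameter; the error handling is illustrative:

```javascript
// End-to-end sketch of the JS API lifecycle: create a session, inspect
// its first input, run inference, and always free the native handle.
// `onnx` is the loaded @qvac/onnx module, injected for easy testing.
function classifyOnce (onnx, modelPath, inputData) {
  const handle = onnx.createSession(modelPath, { provider: 'cpu' })
  try {
    const [info] = onnx.getInputInfo(handle)
    const outputs = onnx.run(handle, [{
      name: info.name,
      shape: info.shape,
      type: info.type,
      data: inputData
    }])
    return outputs[0].data
  } finally {
    onnx.destroySession(handle) // free native resources even on error
  }
}
```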

C++ API

All headers are under the qvac-onnx/ include prefix (source in src/qvac-onnx/).

Headers

| Header | Description |
| --- | --- |
| OnnxSession.hpp | Concrete session — load model, run inference, inspect I/O, zero-copy runRaw() |
| IOnnxSession.hpp | Abstract interface (no ORT dependency) for virtual dispatch |
| OnnxRuntime.hpp | Singleton Ort::Env; configure() and getAvailableProviders() |
| OnnxConfig.hpp | SessionConfig, EnvironmentConfig, enums, providerToString(), optimizationToString() |
| OnnxTensor.hpp | TensorInfo, InputTensor, OutputTensor, TensorType, tensorTypeSize() |
| OnnxSessionOptionsBuilder.hpp | Builds Ort::SessionOptions from SessionConfig |
| OnnxTypeConversions.hpp | Maps ORT element types to TensorType |
| Logger.hpp | Logging via stdout or JS (controlled by JS_LOGGER define) |
| AndroidLog.hpp | Android logcat logging (controlled by QVAC_ONNX_ENABLE_ANDROID_LOG define) |

Configuration types

// Environment (process-wide, one-time)
onnx_addon::EnvironmentConfig envCfg;
envCfg.loggingLevel = onnx_addon::LoggingLevel::ERROR;   // VERBOSE, INFO, WARNING, ERROR, FATAL
envCfg.loggingId    = "my-addon";
onnx_addon::OnnxRuntime::configure(envCfg);               // before first session

// Session (per-model)
onnx_addon::SessionConfig config;
config.provider          = onnx_addon::ExecutionProvider::CPU;       // CPU, AUTO_GPU, NNAPI, CoreML, DirectML
config.optimization      = onnx_addon::GraphOptimizationLevel::ALL;  // DISABLE, BASIC, EXTENDED, ALL
config.intraOpThreads    = 4;
config.interOpThreads    = 2;
config.enableMemoryPattern = true;
config.enableCpuMemArena   = true;
config.enableXnnpack       = false;  // Default: false (must be explicitly enabled)
config.executionMode       = onnx_addon::ExecutionMode::SEQUENTIAL;  // SEQUENTIAL, PARALLEL

OnnxSession

Concrete session class inheriting from IOnnxSession. Header-only, requires ONNX Runtime linked by the consuming target. Non-copyable, movable.

#include <qvac-onnx/OnnxSession.hpp>

// Construction — loads model with automatic fallback chain:
//   1. Try with requested config (may include GPU EP + XNNPACK)
//   2. If XNNPACK enabled and init fails, retry without XNNPACK
//   3. If a non-CPU provider was requested and init fails, retry CPU-only
onnx_addon::OnnxSession session("model.onnx", config);

// Introspection
auto inputs  = session.getInputInfo();    // std::vector<TensorInfo>
auto outputs = session.getOutputInfo();   // std::vector<TensorInfo>
const std::string& name = session.inputName(0);   // cached, no ORT API call
const std::string& out  = session.outputName(0);  // cached, no ORT API call
bool valid = session.isValid();           // true if session loaded successfully
const std::string& path = session.modelPath();

// Run inference — returns deep-copied OutputTensors
auto results     = session.run(input);                 // single input, all outputs
auto resultsMany = session.run(inputs);                // multiple inputs, all outputs
auto resultsSel  = session.run(inputs, outputNames);   // multiple inputs, specific outputs

// Run inference — returns raw Ort::Values (zero-copy, OnnxSession-only, not in IOnnxSession)
auto ortValues     = session.runRaw(input);                // single input, all outputs
auto ortValuesMany = session.runRaw(inputs);               // multiple inputs, all outputs
auto ortValuesSel  = session.runRaw(inputs, outputNames);  // multiple inputs, specific outputs

The runRaw() methods return std::vector<Ort::Value> directly from ORT, avoiding the deep copy that run() performs. Use runRaw() in performance-critical pipelines where you can work with ORT values directly.

IOnnxSession (abstract interface)

For consumers that want to decouple pipeline code from ONNX Runtime headers:

#include <qvac-onnx/IOnnxSession.hpp>  // No ORT dependency

class IOnnxSession {
 public:
  virtual ~IOnnxSession() = default;
  virtual std::vector<TensorInfo> getInputInfo() const = 0;
  virtual std::vector<TensorInfo> getOutputInfo() const = 0;
  virtual const std::string& inputName(size_t index) const = 0;
  virtual const std::string& outputName(size_t index) const = 0;
  virtual std::vector<OutputTensor> run(const InputTensor& input) = 0;
  virtual std::vector<OutputTensor> run(const std::vector<InputTensor>& inputs) = 0;
  virtual std::vector<OutputTensor> run(const std::vector<InputTensor>& inputs,
                                        const std::vector<std::string>& outputNames) = 0;
  virtual bool isValid() const = 0;
  virtual const std::string& modelPath() const = 0;
};

Tensor types

// TensorType enum: FLOAT32, FLOAT16, INT64, INT32, INT8, UINT8

struct TensorInfo { std::string name; std::vector<int64_t> shape; TensorType type; };

struct InputTensor {
  std::string name;
  std::vector<int64_t> shape;
  TensorType type = TensorType::FLOAT32;
  const void* data = nullptr;  // Caller owns memory
  size_t dataSize = 0;         // Size in bytes
};

struct OutputTensor {
  std::string name;
  std::vector<int64_t> shape;
  TensorType type;
  std::vector<uint8_t> data;   // Addon owns copy
  size_t elementCount() const;
  template<typename T> const T* as() const;      // Const typed access
  template<typename T> T* asMutable();            // Mutable typed access
};

size_t tensorTypeSize(TensorType type);  // FLOAT32=4, FLOAT16=2, INT64=8, INT32=4, INT8=1, UINT8=1

Quick example

#include <qvac-onnx/OnnxSession.hpp>

onnx_addon::SessionConfig config;
config.provider = onnx_addon::ExecutionProvider::CPU;

onnx_addon::OnnxSession session("model.onnx", config);

auto inputs  = session.getInputInfo();
auto outputs = session.getOutputInfo();

std::vector<float> floatData(1 * 3 * 224 * 224);  // Fill with preprocessed input data

onnx_addon::InputTensor input;
input.name     = inputs[0].name;
input.shape    = {1, 3, 224, 224};
input.type     = onnx_addon::TensorType::FLOAT32;
input.data     = floatData.data();
input.dataSize = floatData.size() * sizeof(float);

auto results = session.run(input);
const float* out = results[0].as<float>();

Consumer Addon Integration

ONNX-based consumer addons (e.g. ocr-onnx, tts) get @qvac/onnx via npm. This single dependency provides the C++ headers, ONNX Runtime headers, CMake targets, and — on mobile (when static linking) — static libraries. On desktop, ORT symbols are resolved at runtime from the shared @qvac/onnx.bare (installed as a companion library). Consumer addons do not need onnxruntime in their own vcpkg.json.

See INTEGRATION.md for a step-by-step guide covering package.json, vcpkg.json, CMakeLists.txt, symbol visibility, and platform-specific setup — including a concrete walkthrough of how @qvac/ocr-onnx integrates with this package.

Building

npm run build          # bare-make generate && bare-make build && bare-make install

Running C++ tests

npm run test:cpp       # build with -D BUILD_TESTING=ON, then ctest

Two test binaries are produced:

  • unit_tests — pure C++ tests for tensor types, config, and the abstract interface (no ORT dependency)
  • ort_tests — tests that link ORT: runtime singleton, type conversions, session options, session lifecycle, and the shared-runtime addon scenario

Platform Support

| Platform | Execution Providers | Triplet |
| --- | --- | --- |
| Linux | XNNPack, CPU | x64-linux |
| macOS | CoreML, XNNPack, CPU | arm64-osx |
| Windows | DirectML, XNNPack, CPU | (default MSVC) |
| Android | NNAPI, XNNPack, CPU | arm64-android |
| iOS | CoreML, XNNPack, CPU | arm64-ios |
| iOS Sim | CoreML, XNNPack, CPU | arm64-ios-simulator, x64-ios-simulator |

License

Apache-2.0
