
tauri-plugin-stt-api


Speech-to-text recognition API for Tauri with multi-language support

Version: 0.2.0 (latest) · Source: npm · Maintainers: 1

Tauri Plugin STT (Speech-to-Text)

Cross-platform speech recognition plugin for Tauri 2.x. Desktop targets (Windows, macOS, Linux) use whisper.cpp via whisper-rs; mobile targets delegate to the native OS engines (SFSpeechRecognizer on iOS, SpeechRecognizer on Android).

Highlights

  • One model, 99 languages — Whisper is multilingual; users download a single GGML model and it works for English, Portuguese, Mandarin, …
  • No native runtime to ship: whisper-rs builds whisper.cpp statically; there is no libvosk.so / .dylib to install separately.
  • Explicit model lifecycle — the host app controls when (and whether) a model is downloaded. start_listening returns ModelNotInstalled instead of silently pulling hundreds of MB.
  • Hardware acceleration — opt-in metal / cuda / vulkan features map straight to the matching whisper.cpp backend.

Platform Matrix

| Platform | Engine | Model |
| --- | --- | --- |
| iOS | SFSpeechRecognizer (Speech.framework) | OS |
| Android | SpeechRecognizer | OS |
| macOS | whisper.cpp via whisper-rs (Metal opt.) | GGML |
| Windows | whisper.cpp via whisper-rs (CUDA opt.) | GGML |
| Linux | whisper.cpp via whisper-rs (Vulkan opt.) | GGML |

Installation

[dependencies]
tauri-plugin-stt = { version = "0.2", features = ["metal"] } # macOS
# or "cuda" / "vulkan" — omit for plain CPU inference

Register the plugin and the four model-management commands:

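fn main() {
    tauri::Builder::default()
        // Register the speech-to-text plugin with the Tauri app.
        .plugin(tauri_plugin_stt::init())
        .run(tauri::generate_context!())
        // Fail loudly with a readable message if the app cannot start.
        .expect("error while running Tauri application");
}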

Capability:

{ "permissions": ["stt:default"] }

Model Catalogue

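| id | display | size | tier |
| --- | --- | --- | --- |
| tiny | Tiny | 75 MB | fastest |
| base | Base | 142 MB | balanced ⭐ |
| small | Small | 466 MB | accurate |
| medium | Medium | 1.5 GB | very accurate |
| large-v3 | Large v3 | 3.0 GB | most accurate |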

Files are fetched from https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-<id>.bin and stored under <app_data_dir>/whisper-models/. The active selection is persisted to whisper-models/active.txt.
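The URL pattern and storage layout above can be sketched in a few lines of Rust. This is an illustration rather than the plugin's actual code: the function names are hypothetical, and the local file name is assumed to reuse the remote `ggml-<id>.bin` name, which the README does not state.

```rust
use std::path::{Path, PathBuf};

/// Base URL quoted in the README.
const MODEL_BASE_URL: &str = "https://huggingface.co/ggerganov/whisper.cpp/resolve/main";

/// Catalogue ids listed in the README's model table.
const MODEL_IDS: [&str; 5] = ["tiny", "base", "small", "medium", "large-v3"];

/// File name for a catalogue id (`ggml-<id>.bin`), or `None` for an unknown id.
fn model_file_name(id: &str) -> Option<String> {
    MODEL_IDS.contains(&id).then(|| format!("ggml-{}.bin", id))
}

/// Download URL for a catalogue id.
fn model_url(id: &str) -> Option<String> {
    model_file_name(id).map(|file| format!("{}/{}", MODEL_BASE_URL, file))
}

/// Assumed on-disk location: `<app_data_dir>/whisper-models/ggml-<id>.bin`.
fn model_path(app_data_dir: &Path, id: &str) -> Option<PathBuf> {
    model_file_name(id).map(|file| app_data_dir.join("whisper-models").join(file))
}

/// Location of the marker file that persists the active selection.
fn active_marker_path(app_data_dir: &Path) -> PathBuf {
    app_data_dir.join("whisper-models").join("active.txt")
}
```

Rejecting unknown ids up front keeps a typo from turning into a request for a file that does not exist.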

Commands

  • list_models() → { models, active, total_disk_bytes }
  • install_model(id) — downloads the model, emits stt://download-progress
  • remove_model(id) — deletes the file and clears the active marker if needed
  • set_active_model(id) — picks which installed model start_listening loads
  • start_listening({ language?, max_duration? }) — push-to-talk session
  • stop_listening() — runs Whisper over the captured audio and emits one final result
  • is_available() — reports available: true only when a model is installed
  • get_supported_languages() — curated list of UI-facing locales
  • check_permission() / request_permission() — microphone permission helpers
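The lifecycle these commands imply can be modelled in memory as below. This is a hypothetical sketch, not the plugin's implementation: `ModelStore` and its fields are invented for illustration, and it assumes that installing a model does not activate it and that `set_active_model` rejects ids that are not installed. Only the `ModelNotInstalled` name comes from the README.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq)]
enum SttError {
    /// Error name taken from the README.
    ModelNotInstalled,
}

/// Hypothetical in-memory stand-in for the plugin's model state.
#[derive(Default)]
struct ModelStore {
    installed: BTreeSet<String>,
    active: Option<String>,
}

impl ModelStore {
    /// Records a model as installed (the real command downloads a file).
    fn install_model(&mut self, id: &str) {
        self.installed.insert(id.to_string());
    }

    /// Removes a model and clears the active marker if it pointed at it.
    fn remove_model(&mut self, id: &str) {
        self.installed.remove(id);
        if self.active.as_deref() == Some(id) {
            self.active = None;
        }
    }

    /// Assumption: only an installed model can become active.
    fn set_active_model(&mut self, id: &str) -> Result<(), SttError> {
        if !self.installed.contains(id) {
            return Err(SttError::ModelNotInstalled);
        }
        self.active = Some(id.to_string());
        Ok(())
    }

    /// Mirrors `is_available`: true only when a model is installed.
    fn is_available(&self) -> bool {
        !self.installed.is_empty()
    }

    /// Returns the model that would be loaded, never downloading implicitly.
    fn start_listening(&self) -> Result<&str, SttError> {
        self.active.as_deref().ok_or(SttError::ModelNotInstalled)
    }
}
```

The point of the sketch is the failure mode: with nothing active, `start_listening` returns an error the host app can act on instead of starting a download.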

Events

  • stt://download-progress → { status, modelId, model, progress, downloaded?, total? }
  • stt://result → { transcript, isFinal, confidence }
  • stt://error → { code, message }
  • plugin:stt:result — same payload as stt://result (legacy listener channel)
  • plugin:stt:stateChange → { state, isAvailable, language }

Behaviour Notes

  • Whisper is not a streaming recogniser. The plugin buffers audio while recording and runs a single pass on stop_listening. UX is push-to-talk.
  • Audio is captured at the device default rate, downmixed to mono, then decimated to 16 kHz with nearest-neighbour resampling. Whisper is robust enough that a high-quality resampler buys nothing measurable.
  • Inference uses min(available_parallelism(), 4) threads — beyond that whisper.cpp shows diminishing returns and we want headroom for the UI.
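The audio path and thread cap described above could look roughly like the following. It is a minimal sketch under stated assumptions, not the plugin's source: the function names are hypothetical, the input is assumed to be interleaved `f32` samples, and the downmix is assumed to average the channels (the README does not say how it downmixes).

```rust
use std::num::NonZeroUsize;
use std::thread::available_parallelism;

/// Sample rate Whisper expects, per the README.
const TARGET_RATE: u32 = 16_000;

/// Downmixes interleaved frames to mono by averaging the channels (assumed strategy).
fn downmix_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    assert!(channels > 0, "channel count must be non-zero");
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}

/// Nearest-neighbour resampling to 16 kHz: each output sample copies the
/// closest input sample, with no interpolation or low-pass filtering.
fn resample_nearest(mono: &[f32], source_rate: u32) -> Vec<f32> {
    if mono.is_empty() || source_rate == 0 {
        return Vec::new();
    }
    let (src_rate, dst_rate) = (source_rate as u64, TARGET_RATE as u64);
    let out_len = (mono.len() as u64 * dst_rate / src_rate) as usize;
    (0..out_len)
        .map(|i| {
            // Source position of output sample `i`, rounded to the nearest index.
            let src = (i as u64 * src_rate + dst_rate / 2) / dst_rate;
            mono[(src as usize).min(mono.len() - 1)]
        })
        .collect()
}

/// Thread count from the README: min(available_parallelism(), 4).
/// Falls back to 1 if the parallelism query fails (an assumption).
fn inference_threads() -> usize {
    available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1)
        .min(4)
}
```

For a 48 kHz stereo capture, `resample_nearest(&downmix_to_mono(&samples, 2), 48_000)` keeps every third mono sample.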

Mobile

The mobile bridges expose the same JS API surface, but list_models returns an empty list and install_model / remove_model / set_active_model are no-ops: the OS engine has no downloadable model concept. Use is_available to gate UI; on iOS / Android it reflects actual recognizer availability.

License

MIT.

Keywords

tauri


Package last updated on 11 May 2026
