whisper-models

Simple package to download and/or use Whisper models in your project, whether for transcription, translation, or any other purpose.

Source: npm
Latest version: 1.0.22
Maintainers: 0

Model       Disk      RAM
tiny        75 MB     ~390 MB
tiny.en     75 MB     ~390 MB
base        142 MB    ~500 MB
base.en     142 MB    ~500 MB
small       466 MB    ~1.0 GB
small.en    466 MB    ~1.0 GB
medium      1.5 GB    ~2.6 GB
medium.en   1.5 GB    ~2.6 GB
large-v1    2.9 GB    ~4.7 GB
large-v2    2.9 GB    ~4.7 GB
large-v3    2.9 GB    ~4.7 GB
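
The RAM column can guide which model to install. As a rough sketch, the helper below picks the largest model whose approximate RAM figure fits a given budget. `pickModel` and `MODELS` are hypothetical names and are not part of whisper-models. The figures are copied from the table, with 1 GB taken as 1000 MB, and only one of the three large variants is listed because they share the same sizes.

```javascript
// Sizes copied from the table above, in megabytes (1 GB taken as 1000 MB).
// Listed from smallest to largest; large-v3 stands in for all large variants.
const MODELS = [
  { name: 'tiny', diskMB: 75, ramMB: 390 },
  { name: 'base', diskMB: 142, ramMB: 500 },
  { name: 'small', diskMB: 466, ramMB: 1000 },
  { name: 'medium', diskMB: 1500, ramMB: 2600 },
  { name: 'large-v3', diskMB: 2900, ramMB: 4700 },
];

// Hypothetical helper (not part of whisper-models): returns the name of the
// largest model whose approximate RAM figure fits the budget, or null if
// even the smallest model does not fit.
function pickModel(ramBudgetMB, { englishOnly = false } = {}) {
  const fitting = MODELS.filter((m) => m.ramMB <= ramBudgetMB);
  if (fitting.length === 0) return null;
  const best = fitting[fitting.length - 1];
  // The table lists ".en" variants for every size except the large models.
  const hasEnglishVariant = !best.name.startsWith('large');
  return englishOnly && hasEnglishVariant ? `${best.name}.en` : best.name;
}
```

The returned name can then be used as the model name in the install script or the constructor. The RAM figures are approximate, so leaving some headroom in the budget is sensible.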

Usage

Install the package using your package manager of choice:

npm install whisper-models
yarn add whisper-models
pnpm add whisper-models

Then add a postinstall entry to the scripts object of your package.json so the model is downloaded at install time. Adjust the command to the package manager you use and the model you want to download:

{
  "scripts": {
    "postinstall": "pnpm whisper-models -m small"
  }
}

Transcription

// import Whisper from 'whisper-models';
const Whisper = require('whisper-models');

(async () => {
  // Create an instance for the tiny model and start it before sending audio.
  const whisper = new Whisper('tiny');
  await whisper.run();

  const transcription = await whisper.sendData('path/to/audio/file.wav');
  console.log(transcription);

  // Or, if you already know the spoken language:
  const transcriptionEn = await whisper.sendData('path/to/audio/file.wav', { spokenLanguage: 'en' });
  console.log(transcriptionEn);
})();
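
To transcribe several files with one instance, the call shown above can be wrapped in a loop. This is a minimal sketch: `transcribeAll` is a hypothetical helper, not part of whisper-models, and it assumes only what the example above shows, namely that `sendData(path, options)` returns a promise. It processes files sequentially because the package does not document whether concurrent calls are supported.

```javascript
// Hypothetical helper (not part of whisper-models). `whisper` is any object
// exposing sendData(path, options), such as the instance created above
// once run() has resolved.
async function transcribeAll(whisper, files, options = {}) {
  const results = [];
  for (const file of files) {
    // One file at a time: concurrent sendData calls are not documented,
    // so this sketch avoids them.
    const text = await whisper.sendData(file, options);
    results.push({ file, text });
  }
  return results;
}
```

Inside the async function above, it could be called as `await transcribeAll(whisper, ['a.wav', 'b.wav'], { spokenLanguage: 'en' })`, where the file names are placeholders.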

Translation

// import Whisper from 'whisper-models';
const Whisper = require('whisper-models');

(async () => {
  const whisper = new Whisper('tiny');
  await whisper.run();

  const translation = await whisper.sendData('path/to/audio/file.wav', { task: 'translate' });
  console.log(translation);
})();

Options

  • task: The task to perform, for example translate. Default is transcribe.
  • spokenLanguage: The language spoken in the audio file. Default is en.
  • beamSize: The beam size. Default is 5.
  • temperature: The sampling temperature (between 0 and 1). Default is 0.
  • patience: The patience for early stopping.
  • maxSegmentLength: The maximum segment length. Default is 0.
  • compressionRatioThreshold: The compression ratio threshold.
  • cuda: The Nvidia CUDA device to use. Default is false.

Package last updated on 03 Aug 2024
