# fluester – [ˈflʏstɐ]
Node.js bindings for OpenAI's Whisper. Hard-fork of whisper-node.
## Features
- Output transcripts to JSON (also .txt, .srt, .vtt)
- Optimized for CPU (including Apple Silicon ARM)
- Timestamp precision down to a single word
## Installation

### Requirements

- `make` and everything else listed as required to compile whisper.cpp
- Node.js >= 20
- Add the dependency to your project:
  ```sh
  npm install @pr0gramm/fluester
  ```
- Download the whisper model of your choice:
  ```sh
  npx --package @pr0gramm/fluester download-model
  ```
- Compile whisper.cpp if you don't want to provide your own version:
  ```sh
  npx --package @pr0gramm/fluester compile-whisper
  ```
## Usage
**Important**: The API only supports WAV files (just like the original whisper.cpp), so you need to convert your audio to a supported format beforehand.
You can do this using ffmpeg (example taken from the whisper project):
```sh
ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le output.wav
```
Or use the provided helper to convert the audio file:
```js
import { convertFileToProcessableFile } from "@pr0gramm/fluester";

const inputFile = "input.mp3";
const outputFile = "output.wav";
await convertFileToProcessableFile(inputFile, outputFile);
```
### Translation
```js
import { createWhisperClient } from "@pr0gramm/fluester";

const client = createWhisperClient({
  modelName: "base",
});

const transcript = await client.translate("example/sample.wav");
console.log(transcript);
```
Output (JSON):
```json
[
  {
    "start": "00:00:14.310",
    "end": "00:00:16.480",
    "speech": "howdy"
  }
]
```
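The library can also write .txt, .srt and .vtt files itself. If you want to post-process the JSON output in your own code instead, a minimal sketch like the one below works with the `{ start, end, speech }` shape shown above. It is only an illustration of the output format, not part of the fluester API; `toSrt` is a hypothetical helper name.

```js
// Hypothetical helper: formats transcript entries like the ones above as SRT.
// Assumes each entry has { start, end, speech } as in the sample output.
function toSrt(transcript) {
  return transcript
    .map((entry, index) => {
      // SRT uses a comma instead of a dot as the millisecond separator.
      const start = entry.start.replace(".", ",");
      const end = entry.end.replace(".", ",");
      return `${index + 1}\n${start} --> ${end}\n${entry.speech}\n`;
    })
    .join("\n");
}

console.log(toSrt(transcript));
```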
### Language Detection
```js
import { createWhisperClient } from "@pr0gramm/fluester";

const client = createWhisperClient({
  modelName: "base",
});

const result = await client.detectLanguage("example/sample.wav");
if (result) {
  console.log(`Detected: ${result.language} with probability ${result.probability}`);
} else {
  console.log("Did not detect anything :(");
}
```
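Both calls live on the same client, so you can combine them, for example to translate only when detection looks confident enough. The sketch below reuses only the calls shown above; the 0.5 probability threshold is an arbitrary example value, not something the library prescribes.

```js
import { createWhisperClient } from "@pr0gramm/fluester";

const client = createWhisperClient({
  modelName: "base",
});

const file = "example/sample.wav";
const detection = await client.detectLanguage(file);

// Translate only when detection succeeded and looks reasonably confident.
// The 0.5 threshold is an arbitrary example value.
if (detection && detection.probability > 0.5) {
  console.log(`Detected ${detection.language}, translating...`);
  console.log(await client.translate(file));
} else {
  console.log("Language detection was inconclusive, skipping translation.");
}
```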
## Tricks
This library is designed to work well in dockerized environments.
We took the time to make the individual steps independent of each other, so they can be used in a multi-stage Docker build:
```dockerfile
FROM node:latest as dependencies
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
RUN npx --package @pr0gramm/fluester compile-whisper
RUN npx --package @pr0gramm/fluester download-model tiny

FROM node:latest
WORKDIR /app
COPY --from=dependencies /app/node_modules /app/node_modules
COPY ./ ./
```
This includes the model in the image. If you want to keep your image small, you can also download the model in your entrypoint using the commands above.
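As a sketch of that approach, a Node.js entrypoint could run the documented `download-model` command before starting the application. The file names `entrypoint.mjs` and `main.js` below are placeholders for your own setup.

```js
// entrypoint.mjs – hypothetical startup script that fetches the model at container start.
import { execFileSync } from "node:child_process";

// Same command as in the installation steps; "tiny" is just an example model.
execFileSync("npx", ["--package", "@pr0gramm/fluester", "download-model", "tiny"], {
  stdio: "inherit",
});

// Hand over to the actual application (placeholder module).
await import("./main.js");
```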
## Made with

## Roadmap