Version 2.0.1 (latest) · Source: npm · Weekly downloads: 35 (-83.09%) · Maintainers: 1

🎙️ audio-transcripter

npm version license downloads

A lightweight TypeScript library for transcribing audio files using Google Gemini 2.0 models.

Supports local files, remote URLs, and in-memory buffers/blobs.

Ideal for meetings, interviews, podcasts, technical content, and more.

🚀 Installation

npm install audio-transcripter

🌟 Features

  • 🎧 Supports local files (.wav, .mp3, .aac, .flac, .ogg, .webm, etc.)

  • 🌐 Supports remote URLs (HTTP/HTTPS)

  • 📦 Supports Blobs / Buffers

  • ✨ Multiple transcription styles:

    • accurate
    • clean
    • structured
    • technical
    • conversational
  • 🔍 Verbose logging (optional)

  • ⚙️ Written in TypeScript with full type safety

🧑‍💻 Usage

1️⃣ Transcribe Local File

import { runTranscription } from "audio-transcripter";

const result = await runTranscription({
	audioFile: "./assets/audio.webm",
	style: "structured", // optional, default: 'conversational'
	language: "english", // optional
});

if (result.success) {
	console.log("Transcription:", result.transcription);
} else {
	console.error("Error:", result.error);
}

2️⃣ Transcribe Remote URL

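import { runTranscription } from "audio-transcripter";

// A remote HTTP/HTTPS URL goes in the same audioFile option as a local path.
const result = await runTranscription({
	audioFile: "https://example.com/audio.mp3",
	style: "clean",
	language: "english",
});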
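
Because audioFile accepts either a local path or a remote URL, calling code sometimes needs to know which of the two it is holding, for example to decide whether the timeout option is relevant. The helper below is a minimal sketch of such a check. isRemoteUrl is a hypothetical name, it is not exported by audio-transcripter, and the library's own detection logic may differ.

```typescript
// Hypothetical helper, not part of audio-transcripter.
// Treats any string that starts with http:// or https:// as a remote URL.
function isRemoteUrl(audioFile: string): boolean {
	return /^https?:\/\//i.test(audioFile.trim());
}

const inputs = ["./assets/audio.webm", "https://example.com/audio.mp3"];
for (const input of inputs) {
	console.log(input, isRemoteUrl(input) ? "remote URL" : "local path");
}
```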

3️⃣ Transcribe Blob / Buffer (for browser or Node.js)

import { runTranscriptionWithBlob } from "audio-transcripter";

// Example with a Node.js Buffer
const fs = await import("fs/promises");
const audioBuffer = await fs.readFile("./assets/audio.wav");

const result = await runTranscriptionWithBlob(audioBuffer, {
	style: "technical",
	language: "english",
});

if (result.success) {
	console.log("Transcription:", result.transcription);
} else {
	console.error("Error:", result.error);
}

📄 Configuration Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| audioFile | string | required | Local file path or remote URL |
| style | string | 'conversational' | Transcription style (see below) |
| language | string | 'english' | Language of the audio |
| verbose | boolean | true | Enable verbose console logs |
| timeout | number | 5000 (ms) | Timeout for remote URL HEAD check (if applicable) |
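
The defaults in the table above can be made explicit in calling code, which helps when logging or testing the exact configuration in use. The following is a sketch only: withDocumentedDefaults is a hypothetical helper, the types are copied locally so the snippet stands alone, and the library applies its own defaults internally regardless.

```typescript
type TranscriptionStyle =
	| "accurate"
	| "clean"
	| "structured"
	| "technical"
	| "conversational";

interface TranscriptionConfig {
	audioFile: string;
	style?: TranscriptionStyle;
	language?: string | null;
	verbose?: boolean;
	timeout?: number;
}

// Hypothetical helper: fills in the defaults listed in the options table.
function withDocumentedDefaults(
	config: TranscriptionConfig,
): Required<TranscriptionConfig> {
	return {
		audioFile: config.audioFile,
		style: config.style ?? "conversational",
		// An explicit null is passed through unchanged: the type allows it,
		// and its meaning is not described in this README.
		language: config.language === undefined ? "english" : config.language,
		verbose: config.verbose ?? true,
		timeout: config.timeout ?? 5000, // milliseconds
	};
}

const resolved = withDocumentedDefaults({
	audioFile: "./assets/audio.webm",
	style: "clean",
});
console.log(resolved);
```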

🎨 Supported Transcription Styles

| Style | Description |
| --- | --- |
| accurate | High accuracy, raw transcription including filler words |
| clean | Edited for readability (filler words removed, grammar fixed) |
| structured | Meeting/interview format with speakers and structure |
| technical | Technical content with jargon preserved |
| conversational | Casual, creative, natural conversation transcription |
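
When the style comes from user input (a CLI flag or a form field, say), it is a plain string and needs validating before it is passed on. The type guard below is a sketch under that assumption. isTranscriptionStyle and parseStyle are hypothetical helpers, and falling back to 'conversational' simply mirrors the documented default; rejecting unknown values outright may suit some applications better.

```typescript
type TranscriptionStyle =
	| "accurate"
	| "clean"
	| "structured"
	| "technical"
	| "conversational";

const TRANSCRIPTION_STYLES: readonly TranscriptionStyle[] = [
	"accurate",
	"clean",
	"structured",
	"technical",
	"conversational",
];

// Hypothetical type guard: narrows an arbitrary string to a known style.
function isTranscriptionStyle(value: string): value is TranscriptionStyle {
	return (TRANSCRIPTION_STYLES as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical parser: returns the documented default for missing or unknown input.
function parseStyle(value: string | undefined): TranscriptionStyle {
	if (value !== undefined && isTranscriptionStyle(value)) {
		return value;
	}
	return "conversational";
}

console.log(parseStyle("technical"));
console.log(parseStyle("poetic"));
```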

🗂️ Supported File Formats

  • .mp3
  • .wav
  • .aac
  • .flac
  • .ogg
  • .webm / .weba

Unknown formats fall back to the MIME type audio/octet-stream.
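
The extension-based fallback can be pictured as a lookup table with a default. The sketch below illustrates that idea and is not the library's actual code: guessAudioMimeType is a hypothetical name, and the MIME strings in the table are common conventions assumed for illustration, since this README does not list the exact values the package sends.

```typescript
// Assumed extension-to-MIME table for the formats listed above (illustrative values).
const AUDIO_MIME_TYPES: Record<string, string> = {
	mp3: "audio/mpeg",
	wav: "audio/wav",
	aac: "audio/aac",
	flac: "audio/flac",
	ogg: "audio/ogg",
	webm: "audio/webm",
	weba: "audio/webm",
};

// Hypothetical helper: maps a file name or URL to a MIME type by extension.
function guessAudioMimeType(fileName: string): string {
	// Drop any query string or fragment so remote URLs work too.
	const clean = fileName.replace(/[?#].*$/, "");
	const dot = clean.lastIndexOf(".");
	const ext = dot === -1 ? "" : clean.slice(dot + 1).toLowerCase();
	// Own-property check avoids matching inherited keys such as "constructor".
	if (Object.prototype.hasOwnProperty.call(AUDIO_MIME_TYPES, ext)) {
		return AUDIO_MIME_TYPES[ext] as string;
	}
	return "audio/octet-stream"; // fallback described in this README
}

console.log(guessAudioMimeType("./assets/audio.webm"));
console.log(guessAudioMimeType("recording.xyz"));
```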

📚 API Reference

runTranscription(config: TranscriptionConfig)

Runs transcription on a local file path or a remote URL.

Returns: Promise<RunTranscriptionResult>

type RunTranscriptionResult = {
	success: boolean;
	transcription?: string;
	error?: string;
};

runTranscriptionWithBlob(audioBlob: Blob | Buffer, options?)

Runs transcription on an in-memory Blob or Node.js Buffer.

Returns: Promise<RunTranscriptionResult>

🗂️ Type Definitions

export type TranscriptionStyle =
	| "accurate"
	| "clean"
	| "structured"
	| "technical"
	| "conversational";

export interface TranscriptionConfig {
	audioFile: string;
	style?: TranscriptionStyle;
	language?: string | null;
	verbose?: boolean;
	timeout?: number;
}

export interface RunTranscriptionResult {
	success: boolean;
	transcription?: string;
	error?: string;
}
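
Since transcription and error are both optional on RunTranscriptionResult, callers have to check success before reading either field. One way to centralize that check is a small unwrapping helper, sketched below. unwrapTranscription is a hypothetical name, the interface is copied locally so the snippet stands alone, and the fallback error message is an assumption for the case where error is absent.

```typescript
interface RunTranscriptionResult {
	success: boolean;
	transcription?: string;
	error?: string;
}

// Hypothetical helper: turns the result object into a string or a thrown Error.
function unwrapTranscription(result: RunTranscriptionResult): string {
	if (result.success && typeof result.transcription === "string") {
		return result.transcription;
	}
	throw new Error(result.error ?? "Transcription failed without an error message");
}

// Stubbed results stand in for real runTranscription() output.
console.log(unwrapTranscription({ success: true, transcription: "Hello, world." }));

try {
	unwrapTranscription({ success: false, error: "Audio file not found" });
} catch (err) {
	console.error((err as Error).message);
}
```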

🔐 Authentication

This package requires a Gemini API Key.

1️⃣ Set TRANSCRIBER_KEY in your environment:

export TRANSCRIBER_KEY=your-gemini-api-key-here

or

2️⃣ Create a .env file:

TRANSCRIBER_KEY=your-gemini-api-key-here

Get your API key from Google AI Studio (formerly Google MakerSuite).
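
A missing key is easier to diagnose at startup than in the middle of a transcription run. The sketch below shows one way an application might fail fast. requireTranscriberKey is a hypothetical helper and not part of the package; it takes the environment as an argument so it can be exercised without touching the real process environment.

```typescript
// Hypothetical helper: returns the trimmed key or throws if it is missing.
// In Node.js the env argument would normally be process.env.
function requireTranscriberKey(env: Record<string, string | undefined>): string {
	const key = (env["TRANSCRIBER_KEY"] ?? "").trim();
	if (key === "") {
		throw new Error(
			"TRANSCRIBER_KEY is not set; export it or add it to your .env file",
		);
	}
	return key;
}

// Example with a stand-in environment object.
const exampleKey = requireTranscriberKey({ TRANSCRIBER_KEY: "your-gemini-api-key-here" });
console.log("Key length:", exampleKey.length);
```

In a Node.js application this would typically be called once, as requireTranscriberKey(process.env), before the first transcription request.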

🛠️ Tech Stack

📄 License

MIT License © 2025 Shriansh Agarwal

🙋 FAQ

Q: Does this upload my file to third-party storage?

A: No. Files are uploaded only to Gemini's File API endpoint.

Q: Can I use this in the browser?

A: runTranscriptionWithBlob accepts either a browser Blob or a Node.js Buffer.

Q: What models are used?

A: The gemini-2.0-flash model, accessed through the Google GenAI SDK.

Summary

✅ Lightweight
✅ Flexible API
✅ Multiple transcription styles
✅ Works with files, URLs, and Blobs/Buffers
✅ Production-ready TypeScript types

Keywords

gemini

Package last updated on 09 Jun 2025
