@autocode2/speech-to-text
A Node.js library and CLI tool for converting speech to text using sox for audio recording and Google's Gemini API for transcription.
Requires the sox command line utility installed on your system:
brew install sox      # macOS (Homebrew)
apt-get install sox   # Debian/Ubuntu
The quickest way to use the tool is via npx:
npx @autocode2/speech-to-text --api-key YOUR_API_KEY
If you plan to use the tool frequently, you can install it globally:
npm install -g @autocode2/speech-to-text
Then use it directly:
speech-to-text --api-key YOUR_API_KEY
For use in a project:
npm install @autocode2/speech-to-text
npx @autocode2/speech-to-text --api-key YOUR_API_KEY [options]
-k, --api-key: Google API Key for Gemini (required)
-i, --input: Input audio file to transcribe (if not provided, will record from microphone)
-o, --output: Output file to save the recording (only applies when recording from microphone)
-r, --sample-rate: Sample rate for recording in Hz (default: 16000)
-c, --channels: Number of audio channels (default: 1)
-m, --model: Gemini model to use (default: "gemini-1.5-flash")
-p, --prompt: Custom prompt for transcription
-f, --format: Output format (text|json, defaults to text in terminal, json in pipe)
-h, --help: Show help
-v, --version: Show version number
# Record from microphone and transcribe (uses temporary file)
npx @autocode2/speech-to-text --api-key YOUR_API_KEY
# Record, save to file, and transcribe
npx @autocode2/speech-to-text --api-key YOUR_API_KEY -o recording.wav
# Transcribe existing file
npx @autocode2/speech-to-text --api-key YOUR_API_KEY -i existing.wav
# Record in high quality
npx @autocode2/speech-to-text --api-key YOUR_API_KEY -r 44100 -c 2 -o high-quality.wav
# Use custom transcription prompt
npx @autocode2/speech-to-text --api-key YOUR_API_KEY -p "Provide a detailed transcription with punctuation"
# Output in JSON format
npx @autocode2/speech-to-text --api-key YOUR_API_KEY --format json > output.json
# Pipe transcription to other tools
npx @autocode2/speech-to-text --api-key YOUR_API_KEY | jq .text
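When the CLI is launched from a Node.js script rather than typed by hand, the documented flags can be assembled programmatically. The following is a minimal sketch, not part of the package: buildCliArgs is a hypothetical helper, and the GOOGLE_API_KEY environment variable name is an assumption chosen for illustration. It only maps the value-taking flags listed in the options above.

```javascript
// Hypothetical helper: NOT part of @autocode2/speech-to-text.
// Maps an options object onto the CLI flags documented in this README.
function buildCliArgs(options) {
  const flagByOption = {
    apiKey: "--api-key",
    input: "--input",
    output: "--output",
    sampleRate: "--sample-rate",
    channels: "--channels",
    model: "--model",
    prompt: "--prompt",
    format: "--format",
  };
  if (!options.apiKey) {
    throw new Error("apiKey is required");
  }
  const args = ["@autocode2/speech-to-text"];
  for (const [option, flag] of Object.entries(flagByOption)) {
    const value = options[option];
    // Skip options the caller did not set so the CLI defaults apply.
    if (value !== undefined && value !== null) {
      args.push(flag, String(value));
    }
  }
  return args;
}

// GOOGLE_API_KEY is an assumed variable name, used here only for illustration.
const cliArgs = buildCliArgs({
  apiKey: process.env.GOOGLE_API_KEY || "YOUR_API_KEY",
  input: "existing.wav",
  format: "json",
});
console.log(["npx", ...cliArgs].join(" "));
```

The returned array can be passed as the argument list to a process launcher such as spawn from Node's child_process module, which avoids shell quoting problems with prompts that contain spaces.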
When using JSON output (either explicitly with --format json or implicitly when piping), the output will be a JSON object with the following structure:
{
  "text": "The transcribed text",
  "timestamp": "2024-01-20T12:34:56.789Z",
  "input": "input-file.wav", // If provided
  "output": "output-file.wav", // If provided
  "sampleRate": 16000, // If recording
  "channels": 1, // If recording
  "model": "gemini-1.5-flash" // If specified
}
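A script that consumes this output may want to validate it before use. The sketch below is a hypothetical helper (parseTranscription is not provided by the package). It assumes the real output is plain JSON without the explanatory // comments shown in the structure above, and that optional fields are simply absent when they do not apply.

```javascript
// Hypothetical helper: NOT part of @autocode2/speech-to-text.
// Parses one JSON result and normalizes the optional fields to null.
function parseTranscription(rawJson) {
  const result = JSON.parse(rawJson);
  if (typeof result.text !== "string") {
    throw new Error('Expected a string "text" field in the CLI output');
  }
  return {
    text: result.text,
    // Converted to a Date when present; null otherwise.
    timestamp: result.timestamp ? new Date(result.timestamp) : null,
    input: result.input ?? null,
    output: result.output ?? null,
    sampleRate: result.sampleRate ?? null,
    channels: result.channels ?? null,
    model: result.model ?? null,
  };
}

// Sample payload modeled on the structure documented above.
const sample = JSON.stringify({
  text: "The transcribed text",
  timestamp: "2024-01-20T12:34:56.789Z",
  model: "gemini-1.5-flash",
});
const parsed = parseTranscription(sample);
console.log(parsed.text);
```

Normalizing missing fields to null keeps downstream code from having to distinguish between "recorded from microphone" and "transcribed from file" results by checking for undefined keys.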
You can also use this as a library in your Node.js projects:
import { SpeechToText } from "@autocode2/speech-to-text";

const stt = new SpeechToText({
  apiKey: "your-google-api-key",
  recording: {
    sampleRate: 16000,
    channels: 1,
  },
  transcription: {
    model: "gemini-1.5-flash",
    prompt: "Custom transcription prompt",
  },
});

// Record to temporary file (automatically cleaned up)
const text1 = await stt.recordAndTranscribe();

// Record and save to file
const text2 = await stt.recordAndTranscribe("output.wav");

// Transcribe existing file
const text3 = await stt.transcribe("existing.wav");
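Because transcription depends on a remote API, callers may want to retry transient failures. The sketch below is one possible approach and is not a feature of the library: transcribeWithRetry is a hypothetical wrapper, and it assumes that transcribe rejects its promise when a request fails, which this README does not confirm. It accepts any object with an async transcribe(path) method, so it can be exercised with a stub.

```javascript
// Hypothetical wrapper: retries are NOT a feature of the library itself.
// `stt` is any object exposing an async transcribe(path) method, such as
// a SpeechToText instance. Assumes transcribe() rejects on failure.
async function transcribeWithRetry(stt, filePath, maxAttempts = 3, delayMs = 500) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await stt.transcribe(filePath);
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Linear backoff: wait longer after each failed attempt.
        await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
      }
    }
  }
  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}

// Stub standing in for a SpeechToText instance: fails once, then succeeds.
const stubStt = {
  calls: 0,
  async transcribe(path) {
    this.calls += 1;
    if (this.calls < 2) throw new Error("temporary failure");
    return `transcript of ${path}`;
  },
};

transcribeWithRetry(stubStt, "existing.wav", 3, 10).then((text) => console.log(text));
```

With a real instance, the call would look like transcribeWithRetry(stt, "existing.wav"). Retrying recordAndTranscribe this way is less appropriate, since each attempt would start a new microphone recording.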
ISC
Gareth Andrew