
youtube-transcript-js-api
A JavaScript/TypeScript library to fetch YouTube video transcripts
A comprehensive TypeScript/JavaScript library to fetch YouTube video transcripts with advanced rate limiting and multiple output formats. This is a complete port of the popular Python library youtube-transcript-api with significant enhancements including intelligent rate limiting, user agent rotation, and extensive TypeScript support.
npm install youtube-transcript-js-api
For CLI usage, install globally:
npm install -g youtube-transcript-js-api
Requirements: Node.js 18.0.0 or higher
import { getTranscript, YouTubeTranscriptApi } from 'youtube-transcript-js-api';
// Simple usage - get any available transcript
const transcript = await getTranscript('https://www.youtube.com/watch?v=dQw4w9WgXcQ');
console.log(transcript);
// Using the API class for language-specific requests
const api = new YouTubeTranscriptApi();
const englishTranscript = await api.getTranscript('https://www.youtube.com/watch?v=dQw4w9WgXcQ', ['en']);
CLI Usage:
# Get transcript from command line
youtube-transcript get "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
# Save as SRT file
youtube-transcript get "VIDEO_URL" --format srt --output subtitles.srt
getTranscript(videoUrl, config?)
Fetch transcript for a YouTube video.
import { getTranscript } from 'youtube-transcript-js-api';
// Get any available transcript
const transcript = await getTranscript('https://www.youtube.com/watch?v=dQw4w9WgXcQ');
// Get transcript with custom configuration
const configuredTranscript = await getTranscript(
  'https://www.youtube.com/watch?v=dQw4w9WgXcQ',
  {
    timeout: 15000,
    rateLimit: { enabled: true }
  }
);
Note: For language-specific requests, use the YouTubeTranscriptApi class, which provides more control over language selection.
listTranscripts(videoUrl)
List all available transcripts for a video.
import { listTranscripts } from 'youtube-transcript-js-api';
const transcriptList = await listTranscripts('https://www.youtube.com/watch?v=dQw4w9WgXcQ');
console.log(transcriptList.transcripts);
// [
// {
// language: 'en',
// languageName: 'English',
// isGenerated: false,
// isTranslatable: true,
// url: '...'
// },
// ...
// ]
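A list like this lets you choose a track by your own priorities. The sketch below is not part of the library: `TranscriptInfo` and `pickTranscript` are hypothetical names, and the interface assumes only the fields shown in the sample output above. It walks a list of preferred language codes in order and favours a manually created transcript over an auto-generated one.

```typescript
// Hypothetical shape, mirroring the fields in the sample output above.
interface TranscriptInfo {
  language: string;
  languageName: string;
  isGenerated: boolean;
  isTranslatable: boolean;
  url: string;
}

// Return the best match for the preferred languages (checked in order),
// favouring manual transcripts over auto-generated ones.
// Returns undefined when no preferred language is available.
function pickTranscript(
  transcripts: TranscriptInfo[],
  preferred: string[]
): TranscriptInfo | undefined {
  for (const code of preferred) {
    const matches = transcripts.filter((t) => t.language === code);
    const manual = matches.find((t) => !t.isGenerated);
    if (manual) return manual;
    if (matches.length > 0) return matches[0];
  }
  return undefined;
}
```

In practice you would pass `transcriptList.transcripts` as the first argument, provided its entries really do have the shape assumed here.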
import { YouTubeTranscriptApi } from 'youtube-transcript-js-api';
const api = new YouTubeTranscriptApi({
  userAgent: 'Custom User Agent',
  timeout: 15000,
  headers: {
    'Custom-Header': 'value'
  },
  rateLimit: {
    enabled: true,
    maxConcurrentRequests: 2
  }
});
// Get transcript (any available language)
const anyTranscript = await api.getTranscript('VIDEO_URL');
// Get transcript in specific language(s)
const multiLanguageTranscript = await api.getTranscript('VIDEO_URL', ['en', 'es', 'fr']);
// Get transcript in specific language
const spanishTranscript = await api.getTranscriptByLanguage('VIDEO_URL', 'es');
// Get translated transcript (target language first, then source language)
const translatedTranscript = await api.getTranslatedTranscript('VIDEO_URL', 'fr', 'en');
// List available transcripts
const transcriptList = await api.listTranscripts('VIDEO_URL');
import {
getTranscript,
TextFormatter,
SRTFormatter,
VTTFormatter,
JSONFormatter,
TimestampFormatter
} from 'youtube-transcript-js-api';
const transcript = await getTranscript('VIDEO_URL');
// Plain text
const textFormatter = new TextFormatter();
const plainText = textFormatter.format(transcript);
// SRT format
const srtFormatter = new SRTFormatter();
const srtContent = srtFormatter.format(transcript);
// VTT format
const vttFormatter = new VTTFormatter();
const vttContent = vttFormatter.format(transcript);
// JSON format
const jsonFormatter = new JSONFormatter(true); // pretty print
const jsonContent = jsonFormatter.format(transcript);
// With timestamps
const timestampFormatter = new TimestampFormatter(true); // include end time
const timestampContent = timestampFormatter.format(transcript);
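For context on what the SRT output contains: SubRip cue times are written as `HH:MM:SS,mmm`, with a comma before the milliseconds. `SRTFormatter` takes care of this for you. The sketch below only illustrates the conversion, and `toSrtTimestamp` is a hypothetical helper that assumes times are given in seconds.

```typescript
// Convert a time in seconds to the SubRip "HH:MM:SS,mmm" notation.
// Hypothetical helper for illustration; not part of the library.
function toSrtTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

console.log(toSrtTimestamp(3.5)); // 00:00:03,500
```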
import {
getTranscript,
VideoUnavailableError,
TranscriptsDisabledError,
NoTranscriptFoundError,
TranscriptRetrievalError,
TooManyRequestsError
} from 'youtube-transcript-js-api';
try {
  const transcript = await getTranscript('VIDEO_URL');
} catch (error) {
  if (error instanceof VideoUnavailableError) {
    console.log('Video is private, deleted, or does not exist');
  } else if (error instanceof TranscriptsDisabledError) {
    console.log('Transcripts are disabled for this video');
  } else if (error instanceof NoTranscriptFoundError) {
    console.log('No transcript found in requested language');
    console.log('Available languages:', error.availableLanguages);
  } else if (error instanceof TooManyRequestsError) {
    console.log('Rate limited - wait before making more requests');
  } else if (error instanceof TranscriptRetrievalError) {
    console.log('Failed to retrieve transcript:', error.message);
  }
}
The library supports various YouTube URL formats:
https://www.youtube.com/watch?v=VIDEO_ID
https://youtu.be/VIDEO_ID
https://www.youtube.com/embed/VIDEO_ID
https://www.youtube.com/v/VIDEO_ID
VIDEO_ID
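To show what handling these five input forms involves, here is a minimal sketch of pulling the video ID out of each one. It is not the library's parser: `extractVideoId` is a hypothetical name, and the check that an ID is 11 characters from `A-Z`, `a-z`, `0-9`, `_` and `-` reflects how YouTube IDs currently look rather than anything the library documents.

```typescript
// Assumed shape of a YouTube video ID (11 URL-safe characters).
const ID_PATTERN = /^[A-Za-z0-9_-]{11}$/;

// Hypothetical helper: returns the video ID for the URL forms listed above,
// or null when the input is not recognised.
function extractVideoId(input: string): string | null {
  const trimmed = input.trim();
  if (ID_PATTERN.test(trimmed)) return trimmed; // bare VIDEO_ID

  let url: URL;
  try {
    url = new URL(trimmed);
  } catch {
    return null; // not a parseable URL
  }

  const host = url.hostname.replace(/^www\./, "");
  let candidate: string | null = null;
  if (host === "youtu.be") {
    // https://youtu.be/VIDEO_ID
    candidate = url.pathname.split("/")[1] ?? null;
  } else if (host === "youtube.com") {
    if (url.pathname === "/watch") {
      // https://www.youtube.com/watch?v=VIDEO_ID
      candidate = url.searchParams.get("v");
    } else {
      // https://www.youtube.com/embed/VIDEO_ID and /v/VIDEO_ID
      const match = url.pathname.match(/^\/(?:embed|v)\/([^/]+)/);
      candidate = match?.[1] ?? null;
    }
  }
  return candidate !== null && ID_PATTERN.test(candidate) ? candidate : null;
}
```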
The library supports all languages that YouTube supports for transcripts. Common language codes include:
en - English
tr - Turkish
es - Spanish
fr - French
de - German
it - Italian
pt - Portuguese
ru - Russian
ja - Japanese
ko - Korean
zh-Hans - Chinese (Simplified)
zh-Hant - Chinese (Traditional)
And many more...
The library includes a command-line interface for easy transcript fetching from the terminal.
npm install -g youtube-transcript-js-api
# Get transcript in any available language
youtube-transcript get "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
# Get transcript in specific language
youtube-transcript get "https://www.youtube.com/watch?v=dQw4w9WgXcQ" --language en
# Save transcript to file
youtube-transcript get "VIDEO_URL" --output transcript.txt
# Get transcript in SRT format
youtube-transcript get "VIDEO_URL" --format srt --output subtitles.srt
# Get transcript in VTT format
youtube-transcript get "VIDEO_URL" --format vtt --output subtitles.vtt
# Get transcript with timestamps
youtube-transcript get "VIDEO_URL" --format timestamp
# Get transcript in JSON format
youtube-transcript get "VIDEO_URL" --format json --output transcript.json
# Translate transcript to another language
youtube-transcript get "VIDEO_URL" --translate es --output transcript_spanish.txt
# Translate from specific source language
youtube-transcript get "VIDEO_URL" --language en --translate fr --output transcript_french.txt
# List all available transcripts for a video
youtube-transcript list "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
# Output example:
# Video ID: dQw4w9WgXcQ
# Available transcripts:
# 1. English (en) (auto-generated) [translatable]
# 2. Spanish (es) (manual) [translatable]
# 3. French (fr) (manual)
| Option | Short | Description | Default |
|---|---|---|---|
| --language | -l | Specific language code (e.g., en, es, fr) | Any available |
| --format | -f | Output format (text, srt, vtt, json, timestamp) | text |
| --output | -o | Output file (defaults to stdout) | stdout |
| --translate | -t | Translate to target language | None |
# Basic usage - get any available transcript
youtube-transcript get "https://youtu.be/dQw4w9WgXcQ"
# Get English transcript and save as SRT
youtube-transcript get "dQw4w9WgXcQ" -l en -f srt -o subtitles.srt
# Translate English transcript to Spanish
youtube-transcript get "https://www.youtube.com/watch?v=dQw4w9WgXcQ" -l en -t es -o spanish.txt
# Get transcript with timestamps
youtube-transcript get "VIDEO_URL" -f timestamp
# List available languages
youtube-transcript list "VIDEO_URL"
# Pipe transcript to other tools
youtube-transcript get "VIDEO_URL" | grep "important keyword"
# Save JSON transcript for processing
youtube-transcript get "VIDEO_URL" -f json -o data.json
interface TranscriptConfig {
  userAgent?: string; // Custom user agent
  timeout?: number; // Request timeout in milliseconds
  cookies?: string; // Custom cookies
  headers?: Record<string, string>; // Custom headers
  rateLimit?: RateLimitConfig; // Rate limiting configuration
}
interface RateLimitConfig {
  enabled?: boolean; // Enable rate limiting (default: true)
  maxConcurrentRequests?: number; // Max concurrent requests (default: 1)
  baseDelay?: { min: number; max: number }; // Base delay range in ms
  maxRetries?: number; // Maximum retry attempts (default: 3)
  retryBaseDelay?: number; // Base delay for exponential backoff (default: 1000ms)
  maxBackoffDelay?: number; // Maximum backoff delay (default: 30000ms)
  adaptiveDelays?: boolean; // Enable adaptive delays (default: true)
  userAgentRotationInterval?: number; // User agent rotation interval (default: 10)
}
The library includes comprehensive protection against YouTube's rate limiting with intelligent request handling.
import { YouTubeTranscriptApi } from 'youtube-transcript-js-api';
// Uses default rate limiting settings
const api = new YouTubeTranscriptApi();
const transcript = await api.getTranscript('VIDEO_URL');
import { YouTubeTranscriptApi, RateLimitConfig } from 'youtube-transcript-js-api';
const rateLimitConfig: RateLimitConfig = {
  enabled: true,
  maxConcurrentRequests: 2, // Allow 2 concurrent requests
  baseDelay: { min: 500, max: 1500 }, // 500-1500ms random delays
  maxRetries: 5, // Retry up to 5 times
  retryBaseDelay: 2000, // Start with 2s backoff
  maxBackoffDelay: 60000, // Cap backoff at 60s
  adaptiveDelays: true, // Enable adaptive delays
  userAgentRotationInterval: 5 // Rotate user agent every 5 requests
};
const api = new YouTubeTranscriptApi({
  rateLimit: rateLimitConfig
});
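To get a feel for how `retryBaseDelay` and `maxBackoffDelay` interact, the sketch below computes the delays a common doubling scheme would produce for the values used above. The library's exact formula is not documented here (it may also add jitter or adaptive adjustments), so treat `backoffDelay` as a hypothetical illustration and not as the library's implementation.

```typescript
// Hypothetical helper: capped exponential backoff.
// `attempt` is 0 for the first retry, 1 for the second, and so on.
function backoffDelay(
  attempt: number,
  retryBaseDelay: number,
  maxBackoffDelay: number
): number {
  return Math.min(retryBaseDelay * 2 ** attempt, maxBackoffDelay);
}

// Delays for maxRetries: 5, retryBaseDelay: 2000, maxBackoffDelay: 60000
const delays = Array.from({ length: 5 }, (_, attempt) =>
  backoffDelay(attempt, 2000, 60000)
);
console.log(delays); // [ 2000, 4000, 8000, 16000, 32000 ]
```

Under this scheme the 60-second cap would only take effect from a sixth retry onward, since the fifth delay is 32 seconds.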
const batchApi = new YouTubeTranscriptApi({
  rateLimit: {
    maxConcurrentRequests: 1, // Process one at a time
    baseDelay: { min: 1000, max: 2000 }, // Longer delays for batch processing
    adaptiveDelays: true
  }
});
const videoIds = ['video1', 'video2', 'video3'];
for (const videoId of videoIds) {
  try {
    const transcript = await batchApi.getTranscript(videoId);
    console.log(`Processed ${videoId}: ${transcript.length} entries`);
  } catch (error) {
    // `error` is `unknown` under strict TypeScript, so narrow it before reading .message
    const message = error instanceof Error ? error.message : String(error);
    console.error(`Failed to process ${videoId}:`, message);
  }
}
const api = new YouTubeTranscriptApi();
// Get rate limiting statistics
const status = api.getRateLimitStatus();
console.log('Queue length:', status.queueStatus.queueLength);
console.log('Active requests:', status.queueStatus.activeRequests);
console.log('Average response time:', status.queueStatus.averageResponseTime);
console.log('Total requests made:', status.requestCount);
import { RateLimiter } from 'youtube-transcript-js-api';
const rateLimiter = new RateLimiter({
  maxConcurrentRequests: 3,
  baseDelay: { min: 200, max: 800 },
  adaptiveDelays: true
});
// Use with any async function
const result = await rateLimiter.execute(async () => {
  // Your API call here
  return await someApiCall();
});
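To make the idea behind `maxConcurrentRequests` concrete, here is a minimal sketch of a concurrency gate with an `execute` method of the same shape. It is not the library's `RateLimiter` (which also handles delays, retries and adaptation); `SimpleLimiter` is a hypothetical class that only caps how many tasks run at once.

```typescript
// Hypothetical concurrency gate: at most `maxConcurrent` tasks run at a time.
class SimpleLimiter {
  private active = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly maxConcurrent: number;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  async execute<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.maxConcurrent) {
      // All slots are busy: park until a finishing task hands over its slot.
      await new Promise<void>((resolve) => this.waiting.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiting.shift();
      if (next) {
        next(); // pass the slot straight to the next waiter
      } else {
        this.active--; // nobody waiting, free the slot
      }
    }
  }
}
```

Handing the slot directly to the next waiter, without decrementing first, keeps a newly arriving task from slipping in ahead of tasks that were already queued.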
The rate limiter intelligently detects various types of rate limiting:
The system automatically adapts to changing conditions:
For more advanced examples, see examples/advanced-rate-limiting.ts.
maxConcurrentRequests: 1
import { YouTubeTranscriptApi } from 'youtube-transcript-js-api';
const api = new YouTubeTranscriptApi({
  rateLimit: {
    enabled: true,
    maxConcurrentRequests: 1,
    baseDelay: { min: 1000, max: 2000 },
    adaptiveDelays: true
  }
});
const videoIds = ['video1', 'video2', 'video3'];
const results = [];
for (const videoId of videoIds) {
  try {
    const transcript = await api.getTranscript(videoId, ['en', 'es']);
    results.push({ videoId, transcript, success: true });
    // Optional: Add your own delay between requests
    await new Promise(resolve => setTimeout(resolve, 500));
  } catch (error) {
    // `error` is `unknown` under strict TypeScript, so narrow it before reading .message
    const message = error instanceof Error ? error.message : String(error);
    results.push({ videoId, error: message, success: false });
  }
}
import fs from 'fs';
import { getTranscript, SRTFormatter } from 'youtube-transcript-js-api';
const transcript = await getTranscript('VIDEO_URL');
const srtFormatter = new SRTFormatter();
const srtContent = srtFormatter.format(transcript);
fs.writeFileSync('transcript.srt', srtContent);
import { getTranscript, TimestampFormatter } from 'youtube-transcript-js-api';
const transcript = await getTranscript('VIDEO_URL');
const formatter = new TimestampFormatter(true); // include end time
const formattedTranscript = formatter.format(transcript);
console.log(formattedTranscript);
// [0:00 - 0:03] Hello and welcome to this video
// [0:03 - 0:06] Today we're going to learn about...
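The bracketed lines above can be reproduced by hand. The following sketch does so under one assumption that this README does not confirm: that each transcript entry carries `text`, `start` and `duration` fields with times in seconds, as in the Python library this package is ported from. `Entry`, `toClock` and `formatEntry` are hypothetical names.

```typescript
// Assumed entry shape (not confirmed by this README).
interface Entry {
  text: string;
  start: number; // seconds
  duration: number; // seconds
}

// Format whole seconds as m:ss, e.g. 63 -> "1:03".
function toClock(seconds: number): string {
  const total = Math.floor(seconds);
  const minutes = Math.floor(total / 60);
  const secs = total % 60;
  return `${minutes}:${String(secs).padStart(2, "0")}`;
}

// Build one "[start - end] text" line, including the end time.
function formatEntry(entry: Entry): string {
  const end = entry.start + entry.duration;
  return `[${toClock(entry.start)} - ${toClock(end)}] ${entry.text}`;
}

console.log(formatEntry({ text: "Hello and welcome to this video", start: 0, duration: 3 }));
// [0:00 - 0:03] Hello and welcome to this video
```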
import { YouTubeTranscriptApi } from 'youtube-transcript-js-api';
const api = new YouTubeTranscriptApi();
// Translate English transcript to Spanish
const transcript = await api.getTranslatedTranscript('VIDEO_URL', 'es', 'en');
The library includes a comprehensive test suite covering all functionality:
# Run all tests
npm test
# Run tests in watch mode
npm run test:watch
# Run tests with coverage
npm run test:coverage
The test suite includes:
YouTubeTranscriptError - Base error class
VideoUnavailableError - Video is private, deleted, or doesn't exist
TranscriptsDisabledError - Transcripts are disabled for the video
NoTranscriptFoundError - No transcript found in requested language
TranscriptRetrievalError - Failed to retrieve transcript data
TranslationError - Failed to translate transcript
TooManyRequestsError - Rate limited by YouTube
Contributions are welcome! Please feel free to submit a Pull Request.
MIT License - see the LICENSE file for details.
This library is for educational and research purposes only. Please respect YouTube's Terms of Service and use this library responsibly.