
ai-youtube-transcript
Fetch and process transcripts from YouTube videos with support for multiple languages, translation, and formatting
A Node.js library for retrieving and processing YouTube video transcripts. This package uses the unofficial YouTube API to fetch transcripts without requiring an API key or headless browser.
npm install ai-youtube-transcript
or
yarn add ai-youtube-transcript
import { YoutubeTranscript } from 'ai-youtube-transcript';

// Create a new instance
const ytTranscript = new YoutubeTranscript();

// Fetch transcript with default options (English)
ytTranscript.fetch('VIDEO_ID_OR_URL')
  .then(transcript => {
    console.log(`Video ID: ${transcript.videoId}`);
    console.log(`Language: ${transcript.language} (${transcript.languageCode})`);
    console.log(`Auto-generated: ${transcript.isGenerated ? 'Yes' : 'No'}`);
    console.log(`Number of segments: ${transcript.length}`);
    // Get the full text
    console.log(transcript.getText());
    // Get raw data
    console.log(transcript.toRawData());
  })
  .catch(error => {
    console.error('Error:', error.message);
  });
import { YoutubeTranscript } from 'ai-youtube-transcript';

// Using the static method (legacy approach)
YoutubeTranscript.fetchTranscript('VIDEO_ID_OR_URL')
  .then(console.log)
  .catch(console.error);
import { YoutubeTranscript } from 'ai-youtube-transcript';

const ytTranscript = new YoutubeTranscript();

// List all available transcripts
ytTranscript.list('VIDEO_ID_OR_URL')
  .then(transcriptList => {
    console.log('Available transcripts:');
    for (const transcript of transcriptList) {
      console.log(`- ${transcript.language} (${transcript.languageCode})`);
      console.log(` Auto-generated: ${transcript.isGenerated ? 'Yes' : 'No'}`);
      console.log(` Translatable: ${transcript.isTranslatable ? 'Yes' : 'No'}`);
      if (transcript.isTranslatable) {
        console.log(' Available translations:');
        for (const lang of transcript.translationLanguages) {
          console.log(` - ${lang.languageName} (${lang.languageCode})`);
        }
      }
    }
  })
  .catch(error => {
    console.error('Error:', error.message);
  });
import { YoutubeTranscript } from 'ai-youtube-transcript';

const ytTranscript = new YoutubeTranscript();

// Fetch transcript with language preferences
ytTranscript.fetch('VIDEO_ID_OR_URL', {
  languages: ['fr', 'en', 'es'], // Try French first, then English, then Spanish
  preserveFormatting: true // Keep HTML formatting
})
  .then(transcript => {
    console.log(`Selected language: ${transcript.language} (${transcript.languageCode})`);
    console.log(transcript.getText());
  })
  .catch(error => {
    console.error('Error:', error.message);
  });
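The ordered language preference above can be pictured as a first-match search over the available transcripts. The sketch below is only an illustration of that idea: `pickByPreference` is a hypothetical helper, not part of ai-youtube-transcript, and the sample objects only mimic the `languageCode` and `language` fields shown in the listing example.

```javascript
// Hypothetical helper (an assumption for illustration, not the library's code).
// `available` mimics transcript entries that expose a languageCode field.
function pickByPreference(available, preferredCodes) {
  for (const code of preferredCodes) {
    const match = available.find(t => t.languageCode === code);
    if (match) return match; // first preferred language that exists wins
  }
  return null; // no preferred language is available
}

const available = [
  { languageCode: 'en', language: 'English' },
  { languageCode: 'es', language: 'Spanish' },
];

// French is missing, so the search falls through to English.
const chosen = pickByPreference(available, ['fr', 'en', 'es']);
console.log(chosen.languageCode); // en
```

Returning `null` keeps the sketch simple; the library itself signals an unavailable language with an error rather than a null value.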
Translation is a two-step process:
import { YoutubeTranscript } from 'ai-youtube-transcript';

const ytTranscript = new YoutubeTranscript();

// Get the list of available transcripts
ytTranscript.list('VIDEO_ID_OR_URL')
  .then(async transcriptList => {
    // Step 1: Find a transcript in English
    const transcript = transcriptList.findTranscript(['en']);

    // Check if it can be translated
    if (transcript.isTranslatable) {
      console.log(`Found translatable transcript in ${transcript.language}`);
      console.log('Available translation languages:');
      transcript.translationLanguages.forEach(lang => {
        console.log(`- ${lang.languageName} (${lang.languageCode})`);
      });

      // Step 2: Translate to Spanish
      const translatedTranscript = transcript.translate('es');

      // Fetch the translated transcript
      const fetchedTranslation = await translatedTranscript.fetch();
      console.log(`Translated to Spanish: ${fetchedTranslation.getText().substring(0, 100)}...`);
    } else {
      console.log(`Transcript in ${transcript.language} is not translatable`);
    }
  })
  .catch(error => {
    console.error('Error:', error.message);
  });
You can also do this in a single chain:
ytTranscript.list('VIDEO_ID_OR_URL')
  .then(list => list.findTranscript(['en']))
  .then(transcript => transcript.isTranslatable ? transcript.translate('es') : null)
  .then(translatedTranscript => translatedTranscript ? translatedTranscript.fetch() : null)
  .then(result => {
    if (result) console.log(`Translated transcript: ${result.getText().substring(0, 100)}...`);
    else console.log('Translation not available');
  })
  .catch(error => console.error('Error:', error.message));
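The same two steps can also be written with async/await. The following is a minimal sketch: `fetchTranslated` is a hypothetical wrapper name, and it uses only the members shown in the examples above (`list`, `findTranscript`, `isTranslatable`, `translationLanguages`, `translate`, `fetch`). The extra check against `translationLanguages` is an added precaution, not something the library is known to require.

```javascript
// Hypothetical wrapper (an assumption for illustration).
// `client` is any object with the list() method shown above,
// for example a YoutubeTranscript instance.
async function fetchTranslated(client, videoId, sourceLangs, targetLang) {
  // Step 1: find a source transcript in one of the preferred languages
  const list = await client.list(videoId);
  const source = list.findTranscript(sourceLangs);
  if (!source.isTranslatable) return null;

  // Only request a translation the transcript advertises
  const offered = source.translationLanguages
    .some(lang => lang.languageCode === targetLang);
  if (!offered) return null;

  // Step 2: translate, then fetch the translated transcript
  return source.translate(targetLang).fetch();
}

// Usage with the real client (requires network access):
// const result = await fetchTranslated(new YoutubeTranscript(), 'VIDEO_ID_OR_URL', ['en'], 'es');
// console.log(result ? result.getText() : 'Translation not available');
```

Passing the client in as a parameter keeps the function easy to exercise with a stub object in tests.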
import { YoutubeTranscript, JSONFormatter, TextFormatter, SRTFormatter } from 'ai-youtube-transcript';

const ytTranscript = new YoutubeTranscript();

ytTranscript.fetch('VIDEO_ID_OR_URL')
  .then(transcript => {
    // Format as JSON
    const jsonFormatter = new JSONFormatter();
    const jsonOutput = jsonFormatter.formatTranscript(transcript, { indent: 2 });
    console.log(jsonOutput);

    // Format as plain text
    const textFormatter = new TextFormatter();
    const textOutput = textFormatter.formatTranscript(transcript);
    console.log(textOutput);

    // Format as SRT
    const srtFormatter = new SRTFormatter();
    const srtOutput = srtFormatter.formatTranscript(transcript);
    console.log(srtOutput);
  })
  .catch(error => {
    console.error('Error:', error.message);
  });
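To make the SRT output format concrete, here is a rough sketch of how timed snippets map to SRT cues. It is not the library's `SRTFormatter`: `toSrtTime` and `toSrt` are hypothetical names, and the snippet shape `{ text, start, duration }` with times in seconds is an assumption, since the raw data fields are not specified here.

```javascript
// Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTime(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n, width = 2) => String(n).padStart(width, '0');
  return `${pad(h)}:${pad(m)}:${pad(s)},${pad(ms, 3)}`;
}

// Assumed snippet shape: { text, start, duration }, times in seconds.
// Each cue is: index, time range, text, then a blank line between cues.
function toSrt(snippets) {
  return snippets
    .map((snip, i) => {
      const end = snip.start + snip.duration;
      return `${i + 1}\n${toSrtTime(snip.start)} --> ${toSrtTime(end)}\n${snip.text}\n`;
    })
    .join('\n');
}

console.log(toSrt([
  { text: 'Hello', start: 0, duration: 1.5 },
  { text: 'world', start: 1.5, duration: 2 },
]));
```

Working in integer milliseconds before splitting into hours, minutes, and seconds avoids rounding a value such as 59.9996 seconds into an invalid `60,000` field.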
import { YoutubeTranscript } from 'ai-youtube-transcript';

// Create an instance with cookie authentication
const ytTranscript = new YoutubeTranscript('/path/to/cookies.txt');

// Now you can access age-restricted videos
ytTranscript.fetch('AGE_RESTRICTED_VIDEO_ID')
  .then(transcript => {
    console.log(transcript.getText());
  })
  .catch(error => {
    console.error('Error:', error.message);
  });
import { YoutubeTranscript, GenericProxyConfig, WebshareProxyConfig } from 'ai-youtube-transcript';

// Using a generic proxy
const genericProxy = new GenericProxyConfig(
  'http://username:password@proxy-host:port',
  'https://username:password@proxy-host:port'
);
const ytTranscript1 = new YoutubeTranscript(null, genericProxy);

// Using Webshare proxy
const webshareProxy = new WebshareProxyConfig('username', 'password');
const ytTranscript2 = new YoutubeTranscript(null, webshareProxy);

// Now use ytTranscript1 or ytTranscript2 as usual
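Proxy URLs of the form shown above embed credentials in the URL, so a username or password containing reserved characters such as `@` or `:` generally needs percent-encoding first. The sketch below shows one way to build such a URL; `buildProxyUrl` is a hypothetical helper and the host, port, and credentials are placeholder values.

```javascript
// Hypothetical helper (not part of ai-youtube-transcript).
// Percent-encodes the credentials so reserved characters cannot be
// mistaken for URL delimiters.
function buildProxyUrl({ protocol = 'http', username, password, host, port }) {
  const user = encodeURIComponent(username);
  const pass = encodeURIComponent(password);
  return `${protocol}://${user}:${pass}@${host}:${port}`;
}

// Placeholder values for illustration only
console.log(buildProxyUrl({
  username: 'user',
  password: 'p@ss:word',
  host: 'proxy-host',
  port: 8080,
}));
// http://user:p%40ss%3Aword@proxy-host:8080
```

The resulting strings could then be passed as the two `GenericProxyConfig` arguments shown above.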
import { YoutubeTranscript } from 'ai-youtube-transcript';
import fs from 'fs';

async function batchProcessVideos(videoIds) {
  const ytTranscript = new YoutubeTranscript();
  const results = [];

  for (const videoId of videoIds) {
    try {
      console.log(`Processing video ${videoId}...`);
      const transcript = await ytTranscript.fetch(videoId);
      results.push({
        videoId,
        language: transcript.language,
        text: transcript.getText(),
        segments: transcript.length
      });
      console.log(`✅ Successfully processed ${videoId}`);
    } catch (error) {
      console.error(`❌ Error processing ${videoId}: ${error.message}`);
      results.push({
        videoId,
        error: error.message
      });
    }
  }

  return results;
}

// Example usage
const videoIds = [
  'dQw4w9WgXcQ', // Rick Astley - Never Gonna Give You Up
  'UF8uR6Z6KLc', // Steve Jobs' 2005 Stanford Commencement Address
  'YbJOTdZBX1g' // YouTube Rewind 2018
];

batchProcessVideos(videoIds)
  .then(results => {
    console.log(`Processed ${results.length} videos`);
    fs.writeFileSync('results.json', JSON.stringify(results, null, 2));
  });
import { YoutubeTranscript, JSONFormatter, TextFormatter, SRTFormatter } from 'ai-youtube-transcript';
import fs from 'fs';
import path from 'path';

async function saveTranscriptInMultipleFormats(videoId, outputDir) {
  const ytTranscript = new YoutubeTranscript();

  try {
    // Create output directory if it doesn't exist
    if (!fs.existsSync(outputDir)) {
      fs.mkdirSync(outputDir, { recursive: true });
    }

    // Fetch the transcript
    const transcript = await ytTranscript.fetch(videoId);

    // Save as JSON
    const jsonFormatter = new JSONFormatter();
    const jsonOutput = jsonFormatter.formatTranscript(transcript, { indent: 2 });
    fs.writeFileSync(
      path.join(outputDir, `${videoId}.json`),
      jsonOutput
    );

    // Save as plain text
    const textFormatter = new TextFormatter();
    const textOutput = textFormatter.formatTranscript(transcript);
    fs.writeFileSync(
      path.join(outputDir, `${videoId}.txt`),
      textOutput
    );

    // Save as SRT
    const srtFormatter = new SRTFormatter();
    const srtOutput = srtFormatter.formatTranscript(transcript);
    fs.writeFileSync(
      path.join(outputDir, `${videoId}.srt`),
      srtOutput
    );

    console.log(`Transcript for ${videoId} saved in multiple formats to ${outputDir}`);
    return true;
  } catch (error) {
    console.error(`Error saving transcript for ${videoId}: ${error.message}`);
    return false;
  }
}

// Example usage
saveTranscriptInMultipleFormats('dQw4w9WgXcQ', './transcripts');
The package includes a command-line interface for easy transcript retrieval:
npx ai-youtube-transcript <videoId> [options]
Options:
  --languages, -l <langs>        Comma-separated list of language codes in order of preference (default: en)
  --format, -f <format>          Output format: text, json, srt (default: text)
  --output, -o <file>            Write output to a file instead of stdout
  --translate, -t <lang>         Translate transcript to the specified language (can be combined with --languages)
  --list-transcripts             List all available transcripts for the video
  --exclude-generated            Only use manually created transcripts
  --exclude-manually-created     Only use automatically generated transcripts
  --preserve-formatting          Preserve HTML formatting in the transcript
  --cookies <path>               Path to cookies.txt file for authentication
  --http-proxy <url>             HTTP proxy URL
  --https-proxy <url>            HTTPS proxy URL
  --webshare-proxy-username <u>  Webshare proxy username
  --webshare-proxy-password <p>  Webshare proxy password
  --help, -h                     Show this help message
# Basic usage
npx ai-youtube-transcript dQw4w9WgXcQ
# Specify languages
npx ai-youtube-transcript dQw4w9WgXcQ --languages fr,en,es
# Output as JSON to a file
npx ai-youtube-transcript dQw4w9WgXcQ --format json --output transcript.json
# Translate to German
npx ai-youtube-transcript dQw4w9WgXcQ --translate de
# Find a French transcript and translate it to German
npx ai-youtube-transcript dQw4w9WgXcQ --languages fr --translate de
# List available transcripts
npx ai-youtube-transcript --list-transcripts dQw4w9WgXcQ
# Use with proxy
npx ai-youtube-transcript dQw4w9WgXcQ --webshare-proxy-username "user" --webshare-proxy-password "pass"
The main class for retrieving transcripts from YouTube videos.
new YoutubeTranscript(cookiePath?: string, proxyConfig?: ProxyConfig)
  cookiePath (optional): Path to a cookies.txt file for authentication
  proxyConfig (optional): Proxy configuration for handling IP bans
fetch(videoId: string, config?: TranscriptConfig): Promise<FetchedTranscript>
  videoId: YouTube video ID or URL
  config: Configuration options (languages, formatting)
list(videoId: string): Promise<TranscriptList>
  videoId: YouTube video ID or URL
static fetchTranscript(videoId: string, config?: TranscriptConfig): Promise<TranscriptResponse[]>
Represents a transcript with metadata.
videoId: YouTube video ID
language: Language name
languageCode: Language code
isGenerated: Whether the transcript is auto-generated
isTranslatable: Whether the transcript can be translated
translationLanguages: Available translation languages
fetch(preserveFormatting?: boolean): Promise<FetchedTranscript>
  preserveFormatting: Whether to preserve HTML formatting
translate(languageCode: string): Transcript
  languageCode: Target language code
Represents a list of available transcripts for a video.
findTranscript(languageCodes: string[]): Transcript
  languageCodes: List of language codes in order of preference
findManuallyCreatedTranscript(languageCodes: string[]): Transcript
findGeneratedTranscript(languageCodes: string[]): Transcript
getTranscripts(): Transcript[]
Represents the actual transcript data with snippets.
snippets: Array of transcript snippets
videoId: YouTube video ID
language: Language name
languageCode: Language code
isGenerated: Whether the transcript is auto-generated
length: Number of snippets
toRawData(): TranscriptResponse[]
getText(): string
const formatter = new JSONFormatter();
const output = formatter.formatTranscript(transcript, { indent: 2 });

const formatter = new TextFormatter();
const output = formatter.formatTranscript(transcript);

const formatter = new SRTFormatter();
const output = formatter.formatTranscript(transcript);

const proxyConfig = new GenericProxyConfig(
  'http://username:password@proxy-host:port', // HTTP proxy URL
  'https://username:password@proxy-host:port' // HTTPS proxy URL
);

const proxyConfig = new WebshareProxyConfig(
  'username', // Webshare username
  'password' // Webshare password
);
If you get a YoutubeTranscriptNotAvailableError, it means the video doesn't have any transcripts available. This can happen if:
If you get a YoutubeTranscriptNotAvailableLanguageError, it means the requested language is not available for this video. Use the list method to see available languages:
ytTranscript.list('VIDEO_ID')
  .then(transcriptList => {
    console.log('Available languages:');
    for (const transcript of transcriptList) {
      console.log(`- ${transcript.languageCode} (${transcript.language})`);
    }
  });
If you get a YoutubeTranscriptTooManyRequestError, it means YouTube is blocking your requests due to rate limiting. Solutions:
If you get an error about an invalid video ID, make sure you're using a correct YouTube video ID or URL. The library supports various URL formats:
// All of these are valid
ytTranscript.fetch('dQw4w9WgXcQ');
ytTranscript.fetch('https://www.youtube.com/watch?v=dQw4w9WgXcQ');
ytTranscript.fetch('https://youtu.be/dQw4w9WgXcQ');
ytTranscript.fetch('https://www.youtube.com/embed/dQw4w9WgXcQ');
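For checking input before calling the library, the URL formats above can be normalized to a bare video ID. This is a minimal sketch, not the library's own parser: `extractVideoId` is a hypothetical helper, and it assumes video IDs are 11 characters drawn from letters, digits, `_`, and `-`.

```javascript
// Hypothetical helper (an assumption for illustration).
// Returns the video ID for the formats listed above, or null otherwise.
const VIDEO_ID_PATTERN = /^[\w-]{11}$/; // assumed ID shape: 11 URL-safe characters

function extractVideoId(input) {
  // Already a bare ID
  if (VIDEO_ID_PATTERN.test(input)) return input;

  let url;
  try {
    url = new URL(input);
  } catch {
    return null; // neither a bare ID nor a parseable URL
  }

  const host = url.hostname.replace(/^www\./, '');
  let candidate = null;
  if (host === 'youtu.be') {
    candidate = url.pathname.slice(1); // https://youtu.be/<id>
  } else if (host === 'youtube.com') {
    if (url.pathname === '/watch') {
      candidate = url.searchParams.get('v'); // /watch?v=<id>
    } else if (url.pathname.startsWith('/embed/')) {
      candidate = url.pathname.split('/')[2]; // /embed/<id>
    }
  }
  return candidate && VIDEO_ID_PATTERN.test(candidate) ? candidate : null;
}

console.log(extractVideoId('https://youtu.be/dQw4w9WgXcQ')); // dQw4w9WgXcQ
```

Returning `null` instead of throwing lets a caller report an invalid ID before any network request is made.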
If you're having trouble with translation, keep in mind how the translation process works:
A source transcript is selected using --languages or -l (default: en)
If --translate or -t is specified, that transcript is translated to the target language
For example:
--languages en finds an English transcript
--translate fr translates the found transcript to French
--languages en --translate fr finds an English transcript and translates it to French
If translation fails, it could be because:
Use --list-transcripts to see which transcripts are available and which ones are translatable.
It's recommended to implement proper error handling in your application:
ytTranscript.fetch('VIDEO_ID')
  .then(transcript => {
    // Success
    console.log(transcript.getText());
  })
  .catch(error => {
    if (error.name === 'YoutubeTranscriptNotAvailableError') {
      console.error('No transcripts available for this video');
    } else if (error.name === 'YoutubeTranscriptNotAvailableLanguageError') {
      console.error('Requested language not available');
    } else if (error.name === 'YoutubeTranscriptTooManyRequestError') {
      console.error('Rate limited by YouTube, try again later or use a proxy');
    } else {
      console.error('Unexpected error:', error.message);
    }
  });
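For the rate-limit case, "try again later" can be automated with exponential backoff. The sketch below is one possible approach, not a feature of the library: `fetchWithBackoff` and its `retries` and `baseDelayMs` options are hypothetical names, and it relies on the error's `name` property being set as in the handler above.

```javascript
// Hypothetical helper (an assumption for illustration).
// Retries `fetchFn` only when the error looks like a rate limit,
// doubling the wait after each failed attempt.
async function fetchWithBackoff(fetchFn, { retries = 3, baseDelayMs = 1000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fetchFn();
    } catch (error) {
      const rateLimited = error.name === 'YoutubeTranscriptTooManyRequestError';
      // Other errors, or running out of retries, are passed to the caller
      if (!rateLimited || attempt >= retries) throw error;
      const delay = baseDelayMs * 2 ** attempt; // 1s, 2s, 4s with the defaults
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
}

// Usage with the real client (requires network access):
// const transcript = await fetchWithBackoff(() => ytTranscript.fetch('VIDEO_ID'));
```

Errors other than the rate-limit error are rethrown immediately, so a missing transcript or an unavailable language still fails fast.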
git clone https://github.com/yourusername/ai-youtube-transcript.git
cd ai-youtube-transcript
npm install
npm run build
The project includes both unit and integration tests:
# Run all tests
npm test
# Run tests with coverage
npm run test:coverage
Contributions are welcome! Here's how you can contribute:
git checkout -b feature/your-feature-name
npm test
git commit -m 'Add some feature'
git push origin feature/your-feature-name
Please make sure your code follows the existing style and includes appropriate tests.
This package uses an undocumented part of the YouTube API, which is called by the YouTube web client. There is no guarantee that it won't stop working if YouTube changes their API. We will do our best to keep it updated if that happens.
MIT Licensed
We found that ai-youtube-transcript demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.