
bun add yaytt
npm install yaytt
yarn add yaytt
pnpm add yaytt
import { extractCaptions } from "yaytt";
const captions = await extractCaptions("WcBA3QEXJ2o");
const englishCaptions = await extractCaptions("WcBA3QEXJ2o", { lang: "en" });
const captions = await extractCaptions(
  "https://www.youtube.com/watch?v=WcBA3QEXJ2o",
);
import { extractCaptions } from "yaytt";
const cleanCaptions = await extractCaptions("WcBA3QEXJ2o", {
  deduplicationOptions: {
    aggressiveMode: true, // Maximum deduplication
  },
});
import { getAvailableLanguages } from "yaytt";
const languages = await getAvailableLanguages("WcBA3QEXJ2o");
console.log(languages);
// [{ code: 'pt', name: 'Portuguese (auto-generated)', isAutomatic: true }]
import { YouTubeCaptionExtractor } from "yaytt";
const extractor = new YouTubeCaptionExtractor({
  userAgent: "MyApp/1.0",
  timeout: 15000,
  rateLimitDelay: 3000,
});
const captions = await extractor.extractCaptions("WcBA3QEXJ2o", {
  lang: "pt",
  retries: 3,
  deduplicate: true,
  deduplicationOptions: {
    timeThreshold: 3, // Seconds
    similarityThreshold: 0.8, // 80% similarity
    mergePartialMatches: true,
    aggressiveMode: false, // Set to true for maximum deduplication
  },
});
npx yaytt WcBA3QEXJ2o
npx yaytt WcBA3QEXJ2o --aggressive
npx yaytt "https://www.youtube.com/watch?v=WcBA3QEXJ2o"
extractCaptions(videoIdOrUrl, options?)
Extract captions from a YouTube video.
Parameters:
videoIdOrUrl (string): YouTube video ID or full URL
options (object, optional):
  lang (string): Language code (default: 'pt' for Portuguese)
  deduplicate (boolean): Enable deduplication (default: true)
  deduplicationOptions (object): Deduplication settings
Returns: Promise<Caption[]>
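Because `videoIdOrUrl` accepts either a bare ID or a full URL, callers that cache or log results may want to normalize their input first. The sketch below shows one way to do that. `toVideoId` is a hypothetical helper and not part of yaytt; it assumes 11-character IDs and only the `watch?v=` and `youtu.be/` URL shapes, so the library's own parsing may accept more or fewer forms.

```typescript
// Hypothetical helper (not part of yaytt): reduce a video ID or URL to a bare ID.
// Assumption: IDs are 11 characters drawn from [A-Za-z0-9_-].
const VIDEO_ID_PATTERN = /^[A-Za-z0-9_-]{11}$/;

function toVideoId(videoIdOrUrl: string): string | null {
  const input = videoIdOrUrl.trim();
  if (VIDEO_ID_PATTERN.test(input)) return input;

  let url: URL;
  try {
    url = new URL(input);
  } catch {
    return null; // Neither a bare ID nor a parseable URL
  }

  const host = url.hostname.replace(/^www\./, "");
  let candidate: string | null = null;
  if (host === "youtu.be") {
    // Short links carry the ID as the first path segment
    candidate = url.pathname.slice(1).split("/")[0] ?? null;
  } else if (host === "youtube.com" || host === "m.youtube.com") {
    // Watch links carry the ID in the "v" query parameter
    candidate = url.searchParams.get("v");
  }
  return candidate !== null && VIDEO_ID_PATTERN.test(candidate) ? candidate : null;
}
```

A `null` result can be rejected before any network request is made, which keeps obviously malformed input out of the extraction path.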
getAvailableLanguages(videoIdOrUrl)
Get all available caption languages for a video.
Parameters:
videoIdOrUrl (string): YouTube video ID or full URL
Returns: Promise<{ code: string, name: string, isAutomatic: boolean }[]>
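The returned list can be used to choose a track before extracting. The following is a minimal sketch of one selection policy: take the first preferred language that exists, and favor manually authored tracks over auto-generated ones. `CaptionLanguage` and `pickLanguage` are illustrative names introduced here, not yaytt exports; only the object shape comes from the return type above.

```typescript
// Shape of one entry returned by getAvailableLanguages (per the signature above).
interface CaptionLanguage {
  code: string;
  name: string;
  isAutomatic: boolean;
}

// Hypothetical helper: pick the best track for an ordered list of preferred codes.
function pickLanguage(
  languages: CaptionLanguage[],
  preferred: string[],
): CaptionLanguage | undefined {
  for (const code of preferred) {
    const matches = languages.filter((l) => l.code === code);
    // Prefer a manually authored track when one exists for this code
    const manual = matches.find((l) => !l.isAutomatic);
    if (manual) return manual;
    if (matches.length > 0) return matches[0];
  }
  // No preferred code available: fall back to the first listed track (if any)
  return languages[0];
}
```

The chosen entry's `code` can then be passed as the `lang` option to `extractCaptions`.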
interface Caption {
  start: number; // Start time in seconds
  dur: number; // Duration in seconds
  text: string; // Caption text
}
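As an illustration of consuming `Caption` objects, the sketch below renders a caption list as a plain-text transcript with `[m:ss]` timestamps. `formatTimestamp` and `toTranscript` are hypothetical helpers written for this example and are not provided by yaytt; the `Caption` interface is repeated so the snippet stands alone.

```typescript
interface Caption {
  start: number; // Start time in seconds
  dur: number; // Duration in seconds
  text: string; // Caption text
}

// Hypothetical helper: format a time in seconds as m:ss (fractions are truncated).
function formatTimestamp(seconds: number): string {
  const total = Math.floor(seconds);
  const minutes = Math.floor(total / 60);
  const secs = total % 60;
  return `${minutes}:${secs < 10 ? "0" : ""}${secs}`;
}

// Hypothetical helper: one "[m:ss] text" line per caption.
function toTranscript(captions: Caption[]): string {
  return captions
    .map((c) => `[${formatTimestamp(c.start)}] ${c.text}`)
    .join("\n");
}
```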
interface CaptionOptions {
  lang?: string;
  retries?: number;
  fallback?: boolean;
  deduplicate?: boolean;
  deduplicationOptions?: {
    timeThreshold?: number; // Default: 3 seconds
    similarityThreshold?: number; // Default: 0.8 (80% similarity)
    mergePartialMatches?: boolean; // Default: true
    aggressiveMode?: boolean; // Default: false
  };
}
YouTube's auto-generated captions often contain overlapping segments:
Before:
[0:00] [Música]
[0:00] [Música] O podcast que você ouve agora é uma
[0:02] O podcast que você ouve agora é uma
[0:02] O podcast que você ouve agora é uma produção da Central 3.
After:
[0:02] O podcast que você ouve agora é uma produção da Central 3.
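To make the idea concrete, here is a minimal sketch of prefix-based deduplication that collapses the example above into a single caption. It is not yaytt's actual algorithm: `dedupeCaptions` is a hypothetical function, it strips bracketed annotations such as `[Música]` as an assumption, and it ignores `similarityThreshold`, `mergePartialMatches`, and `aggressiveMode` entirely. Only the `timeThreshold` idea (3 seconds by default) is borrowed from the options documented earlier.

```typescript
interface Caption {
  start: number; // Start time in seconds
  dur: number; // Duration in seconds
  text: string; // Caption text
}

// Hypothetical sketch: drop segments whose text is repeated or extended by a
// neighbor that starts within `timeThreshold` seconds.
function dedupeCaptions(captions: Caption[], timeThreshold = 3): Caption[] {
  // Assumption: bracketed annotations like "[Música]" are noise and can be removed.
  const cleaned = captions
    .map((c) => ({
      ...c,
      text: c.text.replace(/\[[^\]]*\]/g, " ").replace(/\s+/g, " ").trim(),
    }))
    .filter((c) => c.text.length > 0);

  const result: Caption[] = [];
  for (const current of cleaned) {
    const previous = result[result.length - 1];
    const closeInTime =
      previous !== undefined && current.start - previous.start <= timeThreshold;

    if (previous !== undefined && closeInTime && current.text.startsWith(previous.text)) {
      // The newer segment repeats or extends the previous one:
      // keep the newer segment, including its timing.
      result[result.length - 1] = current;
    } else if (previous !== undefined && closeInTime && previous.text.startsWith(current.text)) {
      // The newer segment is a shorter repeat of what is already kept: skip it.
      continue;
    } else {
      result.push(current);
    }
  }
  return result;
}
```

Exact prefix matching is deliberately strict; a similarity score would tolerate small wording differences between overlapping segments, at the cost of occasionally merging lines that are genuinely distinct.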
import { extractCaptions, CaptionExtractionError } from "yaytt";
try {
  const captions = await extractCaptions("invalid-video-id");
} catch (error) {
  if (error instanceof CaptionExtractionError) {
    console.error(`Caption extraction failed: ${error.message}`);
    console.error(`Video ID: ${error.videoId}`);
  }
}
MIT