vidpipe

AI-powered pipeline that watches for video recordings and generates transcripts, summaries, short clips, and social media posts

Version 1.3.33 (latest) · Source: npm · Maintainers: 1
 ██╗   ██╗██╗██████╗ ██████╗ ██╗██████╗ ███████╗
 ██║   ██║██║██╔══██╗██╔══██╗██║██╔══██╗██╔════╝
 ██║   ██║██║██║  ██║██████╔╝██║██████╔╝█████╗  
 ╚██╗ ██╔╝██║██║  ██║██╔═══╝ ██║██╔═══╝ ██╔══╝  
  ╚████╔╝ ██║██████╔╝██║     ██║██║     ███████╗
   ╚═══╝  ╚═╝╚═════╝ ╚═╝     ╚═╝╚═╝     ╚══════╝

Your AI video editor and content ideation engine — turn raw recordings into shorts, reels, captions, social posts, and blog posts. Ideate, record, edit, publish.

An agentic video editor and content ideation platform that watches for new recordings and edits them into social-media-ready content — shorts, reels, captions, blog posts, and platform-tailored social posts — using GitHub Copilot SDK AI agents, OpenAI Whisper, and Google Gemini.


npm install -g vidpipe

✨ Features

VidPipe Features — Input → AI Processing → Outputs


  • 💡 Content Ideation (ID8) — AI-generated, trend-backed video ideas
  • 🎙️ Whisper Transcription — Word-level timestamps
  • 📐 Split-Screen Layouts — Portrait, square, and feed
  • 🔇 AI Silence Removal — Context-aware, capped at 20%
  • 💬 Karaoke Captions — Word-by-word highlighting
  • ✂️ Short Clips — Best 15–60s moments, hook-first ordering
  • 🎞️ Medium Clips — 1–3 min with crossfade transitions
  • 📑 Chapter Detection — JSON, Markdown, YouTube, FFmeta
  • 📱 Social Posts — TikTok, YouTube, Instagram, LinkedIn, X
  • 📰 Blog Post — Dev.to style with web-sourced links
  • 🎨 Brand Voice — Custom tone, hashtags via brand.json
  • 🔍 Face Detection — ONNX-based webcam cropping
  • 🚀 Auto-Publish — Scheduled posting via Late API
  • 👁️ Gemini Vision — AI video analysis and scene detection

🚀 Quick Start

# Install globally
npm install -g vidpipe

# Set up your environment
# Unix/Mac
cp .env.example .env
# Windows (PowerShell)
Copy-Item .env.example .env

# Then edit .env and add your OpenAI API key (REQUIRED):
#   OPENAI_API_KEY=sk-your-key-here

# Verify all prerequisites are met
vidpipe --doctor

# Process a single video
vidpipe /path/to/video.mp4

# Watch a folder for new recordings
vidpipe --watch-dir ~/Videos/Recordings

# Generate a saved idea bank for future recordings
vidpipe ideate --topics "GitHub Copilot, Azure, TypeScript" --count 4

# Add a single idea with AI enrichment
vidpipe ideate --add --topic "Building CI/CD with GitHub Actions"

# Full example with options
vidpipe \
  --watch-dir ~/Videos/Recordings \
  --output-dir ~/Content/processed \
  --openai-key sk-... \
  --brand ./brand.json \
  --verbose

Prerequisites:

  • Node.js 20+
  • FFmpeg 6.0+ — Auto-bundled on common platforms (Windows x64, macOS, Linux x64) via ffmpeg-static. On other architectures, install system FFmpeg (see Troubleshooting). Override with FFMPEG_PATH env var if you need a specific build.
  • OpenAI API key (required) — Get one at platform.openai.com/api-keys. Needed for Whisper transcription and all AI features.
  • GitHub Copilot subscription — Required for AI agent features (shorts generation, social media posts, summaries, blog posts). See GitHub Copilot.

See Getting Started for full setup instructions.

🎮 CLI Usage

vidpipe [options] [video-path]
vidpipe init              # Interactive setup wizard
vidpipe review            # Open post review web app
vidpipe schedule          # View posting schedule
vidpipe realign           # Realign scheduled posts to match schedule.json
vidpipe realign --queue   # Queue-based realignment (reshuffleExisting)
vidpipe sync-queues       # Sync schedule.json queue definitions to Late API
vidpipe reschedule        # Reschedule idea-linked posts for optimal placement
vidpipe ideate            # Generate or list saved content ideas
vidpipe chat              # Interactive schedule management agent
vidpipe doctor            # Check all prerequisites

Process Options

Option                  Description
[video-path]            Process a specific video file (implies --once)
--watch-dir <path>      Folder to watch for new recordings
--output-dir <path>     Output directory (default: ./recordings)
--openai-key <key>      OpenAI API key
--exa-key <key>         Exa AI key for web search in social posts
--brand <path>          Path to brand.json (default: ./brand.json)
--ideas <ids>           Comma-separated idea IDs to link to this video
--once                  Process next video and exit
--no-silence-removal    Skip silence removal
--no-shorts             Skip short clip extraction
--no-medium-clips       Skip medium clip generation
--no-social             Skip social media posts
--no-social-publish     Skip social media queue-build stage
--no-captions           Skip caption generation/burning
--late-api-key <key>    Override Late API key
-v, --verbose           Debug-level logging
--progress              Emit structured JSON progress events to stderr
--doctor                Check that all prerequisites are installed

Ideate Options

Option                    Description
--topics <topics>         Comma-separated seed topics for trend research
--count <n>               Number of ideas to generate (default: 5)
--list                    List existing ideas instead of generating
--status <status>         Filter by status: draft, ready, recorded, published
--format <format>         Output format: table (default) or json
--output <dir>            Ideas directory (default: ./ideas)
--brand <path>            Brand config path (default: ./brand.json)
--add                     Create a single idea (AI-enriched by default)
--topic <topic>           Topic for the idea (required with --add)
--hook <hook>             Opening hook (AI-generated if omitted)
--audience <audience>     Target audience (default: "developers")
--platforms <list>        Comma-separated platforms: youtube,tiktok,instagram,linkedin,x
--key-takeaway <msg>      Core message (AI-generated if omitted)
--talking-points <list>   Comma-separated talking points
--tags <list>             Comma-separated categorization tags
--publish-by <date>       Publish-by date (default: 14 days from now)
--trend-context <text>    Trend research context
--no-ai                   Skip AI research agent, use CLI values + defaults

📁 Output Structure

recordings/
└── my-awesome-demo/
    ├── my-awesome-demo.mp4                  # Original video
    ├── my-awesome-demo-edited.mp4           # Silence-removed
    ├── my-awesome-demo-captioned.mp4        # With burned-in captions
    ├── transcript.json                      # Word-level transcript
    ├── transcript-edited.json               # Timestamps adjusted for silence removal
    ├── README.md                            # AI-generated summary with screenshots
    ├── captions/
    │   ├── captions.srt                     # SubRip subtitles
    │   ├── captions.vtt                     # WebVTT subtitles
    │   └── captions.ass                     # Advanced SSA (karaoke-style)
    ├── shorts/
    │   ├── catchy-title.mp4                 # Landscape base clip
    │   ├── catchy-title-captioned.mp4       # Landscape + burned captions
    │   ├── catchy-title-portrait.mp4        # 9:16 split-screen
    │   ├── catchy-title-portrait-captioned.mp4  # Portrait + captions + hook overlay
    │   ├── catchy-title-feed.mp4            # 4:5 split-screen
    │   ├── catchy-title-square.mp4          # 1:1 split-screen
    │   ├── catchy-title.md                  # Clip metadata
    │   └── catchy-title/
    │       └── posts/                       # Per-short social posts (5 platforms)
    ├── medium-clips/
    │   ├── deep-dive-topic.mp4              # Landscape base clip
    │   ├── deep-dive-topic-captioned.mp4    # With burned captions
    │   ├── deep-dive-topic.md               # Clip metadata
    │   └── deep-dive-topic/
    │       └── posts/                       # Per-clip social posts (5 platforms)
    ├── chapters/
    │   ├── chapters.json                    # Structured chapter data
    │   ├── chapters.md                      # Markdown table
    │   ├── chapters.ffmetadata              # FFmpeg metadata format
    │   └── chapters-youtube.txt             # YouTube description timestamps
    └── social-posts/
        ├── tiktok.md                        # Full-video social posts
        ├── youtube.md
        ├── instagram.md
        ├── linkedin.md
        ├── x.md
        └── devto.md                         # Dev.to blog post
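
The naming pattern in the tree above can be captured in a small helper, which may be useful when a script needs to locate the six variants of a short. This is a minimal sketch: `shortVariantPaths` and the `ShortVariants` type are hypothetical names for illustration and are not part of the vidpipe SDK; only the file suffixes come from the tree.

```typescript
// Hypothetical helper (not a vidpipe API): file names mirror the Output Structure tree.
type ShortVariants = {
  landscape: string
  landscapeCaptioned: string
  portrait: string
  portraitCaptioned: string
  feed: string
  square: string
}

function shortVariantPaths(outputDir: string, videoSlug: string, shortSlug: string): ShortVariants {
  // Paths are joined by hand with "/" so the sketch needs no imports.
  const base = `${outputDir}/${videoSlug}/shorts/${shortSlug}`
  return {
    landscape: `${base}.mp4`,                            // landscape base clip
    landscapeCaptioned: `${base}-captioned.mp4`,         // landscape + burned captions
    portrait: `${base}-portrait.mp4`,                    // 9:16 split-screen
    portraitCaptioned: `${base}-portrait-captioned.mp4`, // portrait + captions + hook overlay
    feed: `${base}-feed.mp4`,                            // 4:5 split-screen
    square: `${base}-square.mp4`,                        // 1:1 split-screen
  }
}

const paths = shortVariantPaths('recordings', 'my-awesome-demo', 'catchy-title')
```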

💡 Content Ideation (ID8)

VidPipe includes a research-backed content ideation engine that generates video ideas before you record. Ideas are stored as GitHub Issues for full lifecycle tracking.

# Generate ideas backed by trend research
vidpipe ideate --topics "GitHub Copilot, TypeScript" --count 4

# List all saved ideas
vidpipe ideate --list

# Filter by status
vidpipe ideate --list --status ready

# JSON output for programmatic access (e.g., VidRecord integration)
vidpipe ideate --list --format json

# Link ideas to a recording
vidpipe process video.mp4 --ideas 12,15

Manual Idea Creation

Add a single idea with AI enrichment or direct CLI values:

# AI-researched — full IdeationAgent with MCP research tools
vidpipe ideate --add --topic "Building CI/CD with GitHub Actions"

# Direct — skip AI, use CLI flags + defaults
vidpipe ideate --add --topic "Quick Demo" --no-ai --hook "Ship it live" --audience "developers"

# JSON output for programmatic consumers (e.g., VidRecord Electron app)
vidpipe ideate --add --topic "My Topic" --format json

How It Works

The IdeationAgent uses MCP tools (Exa web search, YouTube, Perplexity) to research trending topics in your niche before generating ideas. Each idea includes:

  • Topic & hook — The angle that makes it compelling
  • Audience & key takeaway — Who it's for and what they'll learn
  • Talking points — Structured bullet points to guide your recording
  • Publish-by date — Based on timeliness (3–5 days for hot trends, months for evergreen)
  • Trend context — The research findings that back the idea

Idea Lifecycle

draft → ready → recorded → published

Status      Meaning
draft       Generated by AI, awaiting your review
ready       Approved — ready to record
recorded    Linked to a video via --ideas flag
published   Content from this idea has been published

Ideas automatically influence downstream content — when you link ideas to a recording with --ideas, the pipeline's agents (shorts, social posts, summaries, blog) reference your intended topic and hook for more focused output.
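
For tools that consume `vidpipe ideate --list --format json`, the lifecycle can be modeled as a simple ordered state list. The sketch below is illustrative only: the four status names come from the lifecycle above, while `nextStatus`, `canTransition`, and the forward-only rule are assumptions rather than documented vidpipe behavior.

```typescript
type IdeaStatus = 'draft' | 'ready' | 'recorded' | 'published'

// Order taken from the lifecycle above; forward-only movement is an assumption.
const LIFECYCLE: IdeaStatus[] = ['draft', 'ready', 'recorded', 'published']

// Returns the status that follows `current`, or null when the idea is already published.
function nextStatus(current: IdeaStatus): IdeaStatus | null {
  const i = LIFECYCLE.indexOf(current)
  return LIFECYCLE[i + 1] ?? null
}

// True only for a single step forward, e.g. draft -> ready.
function canTransition(from: IdeaStatus, to: IdeaStatus): boolean {
  return nextStatus(from) === to
}
```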

📺 Review App

VidPipe includes a built-in web app for reviewing, editing, and scheduling social media posts before publishing.

VidPipe Review UI
Review and approve posts across YouTube, TikTok, Instagram, LinkedIn, and X/Twitter

# Launch the review app
vidpipe review

  • Platform tabs — Filter posts by platform (YouTube, TikTok, Instagram, LinkedIn, X)
  • Video preview — See the video thumbnail and content before approving
  • Keyboard shortcuts — Arrow keys to navigate, Enter to approve, Backspace to reject
  • Smart scheduling — Posts are queued with optimal timing per platform

🔄 Pipeline

graph LR
    A[📥 Ingest] --> B[🎙️ Transcribe]
    B --> C[🔇 Silence Removal]
    C --> D[💬 Captions]
    D --> E[🔥 Caption Burn]
    E --> F[✂️ Shorts]
    F --> G[🎞️ Medium Clips]
    G --> H[📑 Chapters]
    H --> I[📝 Summary]
    I --> J[📱 Social Media]
    J --> K[📱 Short Posts]
    K --> L[📱 Medium Posts]
    L --> M[📰 Blog]
    M --> N[📦 Queue Build]

    style A fill:#2d5a27,stroke:#4ade80
    style B fill:#1e3a5f,stroke:#60a5fa
    style E fill:#5a2d27,stroke:#f87171
    style F fill:#5a4d27,stroke:#fbbf24
    style N fill:#2d5a27,stroke:#4ade80

#   Stage              Description
1   Ingestion          Copies video, extracts metadata with FFprobe
2   Transcription      Extracts audio → OpenAI Whisper for word-level transcription
3   Silence Removal    AI detects dead-air segments; context-aware removals capped at 20%
4   Captions           Generates .srt, .vtt, and .ass subtitle files with karaoke word highlighting
5   Caption Burn       Burns ASS captions into video (single-pass encode when silence was also removed)
6   Shorts             AI identifies best 15–60s moments; extracts single and composite clips with 6 variants per short
7   Medium Clips       AI identifies 1–3 min standalone segments with crossfade transitions
8   Chapters           AI detects topic boundaries; outputs JSON, Markdown, FFmetadata, and YouTube timestamps
9   Summary            AI writes a Markdown README with captured screenshots
10  Social Media       Platform-tailored posts for TikTok, YouTube, Instagram, LinkedIn, and X
11  Short Posts        Per-short social media posts for all 5 platforms
12  Medium Clip Posts  Per-medium-clip social media posts for all 5 platforms
13  Blog               Dev.to blog post with frontmatter, web-sourced links via Exa
14  Queue Build        Builds publish queue from social posts with scheduled slots

Each stage can be independently skipped with --no-* flags. A stage failure does not abort the pipeline — subsequent stages proceed with whatever data is available.
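
The skip-and-continue behavior described above can be pictured as a loop that records each stage's outcome instead of rethrowing. This is a minimal sketch, not vidpipe's actual orchestrator: the `Stage` and `StageResult` types and `runStages` are hypothetical, and stages are synchronous here for brevity where real stages would be asynchronous.

```typescript
type StageResult = { name: string; status: 'completed' | 'failed' | 'skipped' }

// Hypothetical stage shape; `skip` stands in for a --no-* flag having been passed.
type Stage = { name: string; skip?: boolean; run: () => void }

function runStages(stages: Stage[]): StageResult[] {
  const results: StageResult[] = []
  for (const stage of stages) {
    if (stage.skip) {
      results.push({ name: stage.name, status: 'skipped' })
      continue
    }
    try {
      stage.run()
      results.push({ name: stage.name, status: 'completed' })
    } catch {
      // Record the failure and carry on with the remaining stages.
      results.push({ name: stage.name, status: 'failed' })
    }
  }
  return results
}
```

The returned list has one entry per stage, so a caller can report completed, failed, and skipped counts at the end of a run.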

Progress Events

Pass --progress to emit structured JSONL progress events to stderr while normal logs continue on stdout:

vidpipe process video.mp4 --progress 2>progress.jsonl

Each line is a self-contained JSON object:

{"event":"pipeline:start","videoPath":"video.mp4","totalStages":16,"timestamp":"..."}
{"event":"stage:start","stage":"ingestion","stageNumber":1,"totalStages":16,"name":"Ingestion","timestamp":"..."}
{"event":"stage:complete","stage":"ingestion","stageNumber":1,"totalStages":16,"name":"Ingestion","duration":423,"success":true,"timestamp":"..."}
{"event":"stage:skip","stage":"shorts","stageNumber":7,"totalStages":16,"name":"Shorts","reason":"SKIP_SHORTS","timestamp":"..."}
{"event":"pipeline:complete","totalDuration":45000,"stagesCompleted":14,"stagesFailed":0,"stagesSkipped":2,"timestamp":"..."}

Event types: pipeline:start, stage:start, stage:complete, stage:error, stage:skip, pipeline:complete.

Integrating tools can read stderr line-by-line to display a live progress UI (e.g., "Stage 3/16: Silence Removal").
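
A consumer might parse each stderr line and turn it into a status string along these lines. This is a sketch under stated assumptions: the field names are taken from the sample events above, but the `ProgressEvent` type, `parseProgressLine`, and `describeProgress` are illustrative names and not part of the package's published typings.

```typescript
// Fields mirror the sample events; this type is a sketch, not vidpipe's own typing.
type ProgressEvent = {
  event: string
  stage?: string
  stageNumber?: number
  totalStages?: number
  name?: string
  reason?: string
}

// Parses one JSONL line; returns null for blank lines or anything that is not a progress event.
function parseProgressLine(line: string): ProgressEvent | null {
  const trimmed = line.trim()
  if (trimmed === '') return null
  try {
    const parsed: unknown = JSON.parse(trimmed)
    if (typeof parsed === 'object' && parsed !== null &&
        typeof (parsed as { event?: unknown }).event === 'string') {
      return parsed as ProgressEvent
    }
    return null
  } catch {
    return null // not JSON, so ignore it
  }
}

// Builds a short human-readable label for the events a progress UI usually shows.
function describeProgress(e: ProgressEvent): string | null {
  const label = e.name ?? e.stage ?? 'unknown'
  if (e.event === 'stage:start' && e.stageNumber !== undefined && e.totalStages !== undefined) {
    return `Stage ${e.stageNumber}/${e.totalStages}: ${label}`
  }
  if (e.event === 'stage:skip') {
    return `Skipped ${label} (${e.reason ?? 'no reason given'})`
  }
  return null
}
```

Returning null for unparseable lines keeps the reader tolerant of any non-JSON output that might end up on the same stream.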

🤖 LLM Providers

VidPipe supports multiple LLM providers:

Provider            Env Var             Default Model     Notes
copilot (default)   -                   Claude Opus 4.6   Uses GitHub Copilot auth
openai              OPENAI_API_KEY      gpt-4o            Direct OpenAI API
claude              ANTHROPIC_API_KEY   claude-opus-4.6   Direct Anthropic API
Set LLM_PROVIDER in your .env or pass via CLI. Override model with LLM_MODEL.

The pipeline tracks token usage and estimated cost across all providers, displaying a summary at the end of each run.
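
The interaction between LLM_PROVIDER and LLM_MODEL can be sketched as a lookup with an override. The default model strings below are copied from the provider table; `resolveLlm`, `isProvider`, and the choice to throw on an unknown provider are assumptions for illustration, since the README does not say how vidpipe handles that case.

```typescript
type Provider = 'copilot' | 'openai' | 'claude'

// Default models copied from the provider table.
const DEFAULT_MODELS: Record<Provider, string> = {
  copilot: 'Claude Opus 4.6',
  openai: 'gpt-4o',
  claude: 'claude-opus-4.6',
}

function isProvider(value: string): value is Provider {
  return value === 'copilot' || value === 'openai' || value === 'claude'
}

// `env` is passed in explicitly so the sketch does not depend on process.env.
function resolveLlm(env: Record<string, string | undefined>): { provider: Provider; model: string } {
  const requested = env['LLM_PROVIDER'] ?? 'copilot' // copilot is the documented default
  if (!isProvider(requested)) {
    // Assumption: rejecting unknown providers; actual vidpipe behavior is not documented here.
    throw new Error(`Unknown LLM_PROVIDER: ${requested}`)
  }
  // LLM_MODEL, when set, overrides the provider's default model.
  return { provider: requested, model: env['LLM_MODEL'] ?? DEFAULT_MODELS[requested] }
}
```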

⚙️ Configuration

Configuration is resolved in order of precedence: CLI flags → environment variables → .env file → defaults. A value set in an earlier source overrides the same setting in later ones.

# .env
OPENAI_API_KEY=sk-your-key-here
WATCH_FOLDER=/path/to/recordings
OUTPUT_DIR=/path/to/output
# EXA_API_KEY=your-exa-key       # Optional: enables web search in social/blog posts
# BRAND_PATH=./brand.json         # Optional: path to brand voice config
# FFMPEG_PATH=/usr/local/bin/ffmpeg
# FFPROBE_PATH=/usr/local/bin/ffprobe
# LATE_API_KEY=sk_your_key_here   # Optional: Late API for social publishing
# GITHUB_TOKEN=ghp_...            # Optional: GitHub token for ID8 idea storage
# IDEAS_REPO=owner/repo           # Optional: GitHub repo for storing ideas as Issues
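
Reading the arrows in the loading order as highest to lowest precedence (an assumption about how the list is meant), the merge can be sketched as follows. `resolveConfig` and the `Settings` type are hypothetical names for illustration, not vidpipe internals.

```typescript
type Settings = Record<string, string | undefined>

// Sources are visited from highest to lowest precedence; a later source only
// fills keys that every earlier source left undefined.
function resolveConfig(cli: Settings, env: Settings, dotenv: Settings, defaults: Settings): Settings {
  const merged: Settings = {}
  for (const source of [cli, env, dotenv, defaults]) {
    for (const [key, value] of Object.entries(source)) {
      if (merged[key] === undefined && value !== undefined) merged[key] = value
    }
  }
  return merged
}

// Example: the CLI value wins over the environment, which wins over the default.
const resolved = resolveConfig(
  { OUTPUT_DIR: '~/Content/processed' },
  { OUTPUT_DIR: '/path/to/output', WATCH_FOLDER: '/path/to/recordings' },
  { OPENAI_API_KEY: 'sk-your-key-here' },
  { OUTPUT_DIR: './recordings', BRAND_PATH: './brand.json' },
)
```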

Social media publishing is configured via schedule.json and the Late API. See Social Publishing Guide for details.

📚 Documentation

Guide                       Description
Getting Started             Prerequisites, installation, and first run
Configuration               All CLI flags, env vars, skip options, and examples
FFmpeg Setup                Platform-specific install (Windows, macOS, Linux, ARM64)
Brand Customization         Customize AI voice, vocabulary, hashtags, and content style
Social Publishing           Review, schedule, and publish social posts via Late API
Architecture (L0–L7)        Layer hierarchy, import rules, and testing strategy
Platform Content Strategy   Research-backed recommendations per social platform

Full reference docs are available at htekdev.github.io/vidpipe.

🏗️ Architecture

VidPipe uses a strict L0–L7 layered architecture where each layer can only import from specific lower layers. This enforces clean separation of concerns and makes every layer independently testable.

L7-app         CLI, servers, watchers          → L0, L1, L3, L6
L6-pipeline    Stage orchestration             → L0, L1, L5
L5-assets      Lazy-loaded asset + bridges     → L0, L1, L4
L4-agents      LLM agents (BaseAgent)          → L0, L1, L3
L3-services    Business logic + cost tracking  → L0, L1, L2
L2-clients     External API/process wrappers   → L0, L1
L1-infra       Infrastructure (config, logger) → L0
L0-pure        Pure functions, zero I/O        → (nothing)
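
The import rules in the table above can be expressed as a lookup that a lint step or test might consult. This is only a sketch: the allowed-import lists are transcribed from the table, while `canImport` and the decision to allow same-layer imports are assumptions and do not describe vidpipe's actual enforcement tooling.

```typescript
// Allowed imports per layer, transcribed from the layer table.
const ALLOWED_IMPORTS: Record<string, string[]> = {
  L7: ['L0', 'L1', 'L3', 'L6'],
  L6: ['L0', 'L1', 'L5'],
  L5: ['L0', 'L1', 'L4'],
  L4: ['L0', 'L1', 'L3'],
  L3: ['L0', 'L1', 'L2'],
  L2: ['L0', 'L1'],
  L1: ['L0'],
  L0: [],
}

// Returns true when a module in layer `from` may import a module in layer `to`.
function canImport(from: string, to: string): boolean {
  if (from === to) return true // assumption: imports within one layer are allowed
  return (ALLOWED_IMPORTS[from] ?? []).includes(to)
}
```

Note that the rule is a whitelist rather than "any lower layer": under the table, L7 may import L6 but not L5.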

Each editing task is handled by a specialized AI agent built on the GitHub Copilot SDK:

graph TD
    BP[🧠 BaseAgent] --> SRA[SilenceRemovalAgent]
    BP --> SA[SummaryAgent]
    BP --> SHA[ShortsAgent]
    BP --> MVA[MediumVideoAgent]
    BP --> CA[ChapterAgent]
    BP --> SMA[SocialMediaAgent]
    BP --> BA[BlogAgent]
    BP --> IA[IdeationAgent]

    SRA -->|tools| T1[detect_silence, decide_removals]
    SHA -->|tools| T2[plan_shorts]
    MVA -->|tools| T3[plan_medium_clips]
    CA -->|tools| T4[generate_chapters]
    SA -->|tools| T5[capture_frame, write_summary]
    SMA -->|tools| T6[search_links, create_posts]
    BA -->|tools| T7[search_web, write_blog]
    IA -->|tools| T8[web_search, youtube_search, generate_ideas]

    style BP fill:#1e3a5f,stroke:#60a5fa,color:#fff
    style IA fill:#5a4d27,stroke:#fbbf24,color:#fff

Each agent communicates with the LLM through structured tool calls, ensuring reliable, parseable outputs. See the Architecture Guide for full details on layer rules and import enforcement.
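
To illustrate why structured tool calls give parseable output, the sketch below validates the arguments of a `plan_shorts` call before anything acts on them. Only the tool name comes from the diagram above; the `ToolCall` and `PlannedShort` shapes, the `shorts` argument, and `handlePlanShorts` are hypothetical and do not reflect the real agent schema. The 15–60 second bound is taken from the Shorts stage description.

```typescript
// Hypothetical shapes: only the tool name plan_shorts appears in the README.
type ToolCall = { name: string; arguments: unknown }
type PlannedShort = { title: string; start: number; end: number } // times in seconds

function handlePlanShorts(call: ToolCall): PlannedShort[] {
  if (call.name !== 'plan_shorts') throw new Error(`Unexpected tool: ${call.name}`)
  const raw = call.arguments
  const shorts = typeof raw === 'object' && raw !== null
    ? (raw as { shorts?: unknown }).shorts
    : undefined
  if (!Array.isArray(shorts)) throw new Error('plan_shorts: "shorts" must be an array')

  return shorts.map((item: unknown, i: number) => {
    const c = (item ?? {}) as Partial<PlannedShort>
    if (typeof c.title !== 'string' || typeof c.start !== 'number' || typeof c.end !== 'number') {
      throw new Error(`plan_shorts: malformed short at index ${i}`)
    }
    const duration = c.end - c.start
    // Shorts are described as 15-60s moments, so reject anything outside that window.
    if (duration < 15 || duration > 60) {
      throw new Error(`plan_shorts: short at index ${i} is ${duration}s, expected 15-60s`)
    }
    return { title: c.title, start: c.start, end: c.end }
  })
}
```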

🛠️ Tech Stack

Technology           Purpose
TypeScript           Language (ES2022, ESM)
GitHub Copilot SDK   AI agent framework
OpenAI Whisper       Speech-to-text
Google Gemini        Vision-based video analysis
FFmpeg               Video/audio processing
Sharp                Image analysis (webcam detection)
Octokit              GitHub API (idea storage as Issues)
Commander.js         CLI framework
Chokidar             File system watching
Winston              Logging
Exa AI               Web search for social posts, blog, and ideation

🗺️ Roadmap

  • Automated social posting — Publish directly to platforms via Late API
  • Content ideation (ID8) — AI-generated, trend-backed video ideas with lifecycle tracking
  • Gemini Vision integration — AI-powered video analysis and scene detection
  • L0–L7 layered architecture — Strict separation of concerns with import enforcement
  • GitHub agentic workflows — Automated issue and PR triage via GitHub Actions
  • Hook-first clip ordering — Most engaging moment plays first in shorts
  • Multi-language support — Transcription and summaries in multiple languages
  • Custom templates — User-defined Markdown & social post templates
  • Batch processing — Process an entire folder of existing videos
  • Thumbnail generation — Auto-generate branded thumbnails for shorts

🔧 Troubleshooting

No binary found for architecture during install

ffmpeg-static (an optional dependency) bundles FFmpeg for common platforms. On unsupported architectures, it skips gracefully and vidpipe falls back to your system FFmpeg.

Fix: Install FFmpeg on your system:

  • Windows: winget install Gyan.FFmpeg
  • macOS: brew install ffmpeg
  • Linux: sudo apt install ffmpeg (Debian/Ubuntu) or sudo dnf install ffmpeg (Fedora)

You can also point to a custom binary: export FFMPEG_PATH=/path/to/ffmpeg

Run vidpipe doctor to verify your setup.

📄 License

ISC © htekdev

🧩 SDK Usage

VidPipe also ships as a Node.js ESM SDK for programmatic use:

import { createVidPipe } from 'vidpipe'

const vidpipe = createVidPipe({
  openaiApiKey: process.env.OPENAI_API_KEY,
  outputDir: './recordings',
})

const result = await vidpipe.processVideo('./videos/demo.mp4', {
  skipGit: true,
})

console.log(result.video.videoDir)
console.log(result.shorts.length)

SDK features include:

  • processVideo() for the full pipeline
  • ideate() plus ideas.* CRUD helpers
  • schedule.* helpers for slots, calendar, and realignment
  • video.* helpers for clips, captions, silence detection, variants, and frames
  • social.generatePosts() for quick platform-specific drafts
  • doctor() and config.* for diagnostics and configuration access

See docs/sdk.md for the full SDK guide.

Package last updated on 03 Jun 2026
