
Audio-driven talking head animator with Max Headroom glitchy aesthetic
Based on the npm package sardonic, but evolved with audio-synchronized mouth animation
"M-M-M-caricature reporting for d-d-duty!"
Max Headroom was groundbreaking. In 1985, that stuttering, glitching, rotating talking head was like nothing else on television. The genius wasn't just the character—it was the chaos:
Nobody has ever beaten Max Headroom. All we can do is pay homage.
caricature brings that aesthetic to your audio content with real synchronization. A talking head that opens its mouth to your audio, with random variation in the open frames—combining the precision of lip-sync with Max Headroom's beautiful chaos.
Takes character frames (closed and open mouth expressions) and creates an audio-synchronized talking head animation by:
The result: A character that actually talks to your audio, with natural variation from random frame selection.
# Create audio-synchronized talking head
caricature --audio narration.mp3
# Use different character with custom threshold
caricature --audio speech.mp4 --character character2 --threshold -40
# Full glitch chaos
caricature --audio podcast.wav --glitch 3 --rotation 30 --threshold -45
# Overlay on existing video
caricature -a audio.mp3 -o talking.mp4
caricature -O background.mp4 -o final.mp4
Requires:
npm install -g caricature
# or
npx caricature --audio your-audio.mp3
caricature \
--audio narration.mp3 \
--character character1 \
--size 320 \
--glitch 2
# More sensitive (mouth opens more often)
caricature --audio quiet-speech.mp3 --threshold -40
# Less sensitive (only loud sounds trigger)
caricature --audio loud-music.mp3 --threshold -30
# Two-step process
caricature --audio speech.mp3 -o talking.mp4
caricature --overlay background.mp4 -o final.mp4
-a, --audio <file> - Audio or video file to synchronize with
-c, --character <name> - Character name (default: character1)
-D, --dir <path> - Directory with frames (default: samples/)
-t, --threshold <dB> - Loudness threshold for mouth open (default: -35)
-s, --size <pixels> - Output size, square (default: 320)
-r, --rotation <deg> - Max rotation angle (default: 15)
-g, --glitch <0-3> - Glitch level (default: 1)
-o, --output <file> - Output filename (default: caricature.mp4)
-O, --overlay <video> - Input video to overlay on
-p, --position <pos> - Position: bottom-right, bottom-left, top-right, top-left
-m, --margin <pixels> - Margin from edges (default: 20)
Level 0: Clean
Level 1: Classic (Default)
Level 2: Medium Chaos
Level 3: MAXIMUM CHAOS
caricature requires characters with closed and open mouth frames. This is the naming convention:
character1-closed1.jpg # Closed mouth (required)
character1-closed2.jpg # Additional closed (optional)
character1-open1.jpg # Open mouth (required)
character1-open2.jpg # More open variations (optional)
character1-open3.jpg # Even more! (optional)
The Magic: When audio is loud, caricature randomly picks from your open frames. This creates natural variation - the same speaking pattern never looks identical twice!
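To make the naming convention concrete, here is a minimal sketch of how frame files could be grouped by mouth state and how a random open frame could be picked. The helper names groupFrames and pickOpenFrame are hypothetical and are not taken from the caricature source; the .jpg extension and the closed/open suffixes follow the convention listed above.

```javascript
// Hypothetical helpers, not part of caricature's actual code.
// Group frame filenames for one character by mouth state.
function groupFrames(filenames, character) {
  // Escape regex metacharacters in the character name.
  const safeName = character.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Matches e.g. "character1-open2.jpg" and captures the state "open".
  const pattern = new RegExp(`^${safeName}-(closed|open)\\d+\\.jpg$`);
  const frames = { closed: [], open: [] };
  for (const name of filenames) {
    const match = pattern.exec(name);
    if (match) frames[match[1]].push(name);
  }
  if (frames.closed.length === 0 || frames.open.length === 0) {
    throw new Error(`${character} needs at least one closed and one open frame`);
  }
  return frames;
}

// Pick a random open frame. `random` is injectable so the choice can be tested.
function pickOpenFrame(frames, random = Math.random) {
  return frames.open[Math.floor(random() * frames.open.length)];
}
```

Passing the random source as a parameter keeps the default behavior chaotic while letting a test pin the result.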
This is where it gets powerful. Here's the strategy:
sunglasses cat portrait, 80s aesthetic, neon colors,
pixelated background, VHS quality, Max Headroom style,
mouth closed, front facing
--ar 1:1 --stylize 750
[paste image URL] --cref [character URL] --cw 100
Required:
- mouth closed (neutral expression)
- mouth open (speaking)
Optional variations for randomness:
- mouth wide open
- mouth slightly open
- mouth open with teeth
- mouth open at angle
[paste image URL] VHS distortion, signal interference,
scan lines, color bleeding, tracking errors --cref [character URL]
# Critical: Follow the naming convention!
character1-closed1.jpg # Main closed mouth
character1-open1.jpg # Main open mouth
character1-open2.jpg # Variation 1
character1-open3.jpg # Variation 2
caricature --audio narration.mp3 \
--character character1 \
--threshold -35 \
--glitch 2 \
--rotation 20
The magic: Audio analysis + random open frames + random rotation = natural talking
FFmpeg extracts loudness using the astats filter:
ffprobe -f lavfi -i "amovie=file.mp3,astats=metadata=1:reset=1"
Each frame gets a loudness value in dB (typically -60 dB to 0 dB)
Threshold determines mouth state:
Random selection preserves chaos: Even at same loudness level, different open frames are chosen
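The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the package's real generateAudioSequence(): the sample shape ({ time, loudness }) and the function name buildSequence are hypothetical, and picking randomly among closed frames as well as open ones is a guess. The comparison (louder than the threshold means mouth open) matches the threshold behavior described in this README.

```javascript
// Sketch: turn per-frame loudness samples into a mouth-frame sequence.
// `samples` is assumed to look like [{ time: seconds, loudness: dB }, ...].
// `frames` is assumed to look like { closed: [...], open: [...] }.
function buildSequence(samples, frames, thresholdDb = -35, random = Math.random) {
  return samples.map(({ time, loudness }) => {
    // Louder than the threshold opens the mouth; silence (-Infinity) stays closed.
    const isOpen = loudness > thresholdDb;
    const pool = isOpen ? frames.open : frames.closed;
    // Random choice within the pool is what keeps two passes from looking identical.
    const frame = pool[Math.floor(random() * pool.length)];
    return { time, isOpen, frame };
  });
}
```

With the default threshold of -35, a sample at -50 dB selects a closed frame and a sample at -20 dB selects one of the open frames.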
-45 dB # Very sensitive - mouth opens for whispers
-40 dB # Sensitive - good for quiet speech
-35 dB # Default - balanced for normal speech
-30 dB # Less sensitive - only moderate sounds trigger
-25 dB # Very insensitive - only loud sounds trigger
Pro tip: Analyze your audio first:
ffprobe -f lavfi -i "amovie=your-file.mp3,astats=metadata=1:reset=1" \
-show_entries frame_tags=lavfi.astats.Overall.RMS_level
Look at the dB values and set threshold slightly below average speech level.
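If you would rather not eyeball the numbers, the tip can be approximated in code. The sketch below assumes ffprobe prints each value on a line containing lavfi.astats.Overall.RMS_level=<number> (its default text output) and that silent frames appear as -inf. The function names, the -50 dB silence floor, and the 3 dB margin are all illustrative choices, not caricature settings. Averaging dB values directly is a rough heuristic, not a proper loudness measurement.

```javascript
// Hypothetical helpers for choosing a --threshold value from ffprobe output.
function parseRmsLevels(ffprobeOutput) {
  const levels = [];
  const pattern = /lavfi\.astats\.Overall\.RMS_level=(-inf|-?\d+(?:\.\d+)?)/g;
  for (const match of ffprobeOutput.matchAll(pattern)) {
    levels.push(match[1] === '-inf' ? -Infinity : Number(match[1]));
  }
  return levels;
}

// Average the frames that look like speech, then go slightly below that level.
function suggestThreshold(rmsLevelsDb, silenceFloorDb = -50, marginDb = 3) {
  const speech = rmsLevelsDb.filter((db) => Number.isFinite(db) && db > silenceFloorDb);
  if (speech.length === 0) return null; // nothing above the silence floor
  const mean = speech.reduce((sum, db) => sum + db, 0) / speech.length;
  return Math.round(mean - marginDb);
}
```

The returned number is only a starting point; adjust it by ear and eye after a test render.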
The human brain is incredible at pattern recognition. When we see:
We perceive: "This character is actually talking!"
The key insight: Perfect sync would look robotic. By randomly selecting from multiple open mouth frames, we get:
Add scanlines and glitch? "This character is from 1985!"
It's the same principle that made Max Headroom work. The chaos approximates life, but now with actual audio synchronization.
Uses ImageMagick to rotate each frame around its center:
convert input.jpg \
-resize 320x320^ \
-gravity center \
-extent 320x320 \
-background none \
-rotate 12.5 \
-extent 320x320 \
output.png
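The first extent crops the resized image to a square. Rotation enlarges the canvas, so the second extent trims it back to 320x320 around the center, keeping every frame centered and the same size.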
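As a sketch of how the per-frame rotation could be randomized, the function below builds the same argument list as the convert command above with a random angle. It assumes the angle is drawn uniformly between minus and plus the --rotation maximum; the README only says "max rotation angle", so the symmetric range and the name rotationArgs are assumptions.

```javascript
// Hypothetical helper: build ImageMagick `convert` arguments for one rotated frame.
function rotationArgs(input, output, size = 320, maxRotation = 15, random = Math.random) {
  // Uniform angle in [-maxRotation, +maxRotation); the symmetric range is an assumption.
  const angle = (random() * 2 - 1) * maxRotation;
  const square = `${size}x${size}`;
  return [
    input,
    '-resize', `${square}^`,   // fill the square, preserving aspect ratio
    '-gravity', 'center',
    '-extent', square,         // crop to a square before rotating
    '-background', 'none',     // transparent corners after rotation
    '-rotate', angle.toFixed(1),
    '-extent', square,         // trim the enlarged canvas back to size
    output,
  ];
}
```

The resulting array can be passed to child_process.execFile('convert', args) or an equivalent runner.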
Scanlines (Level 1+):
geq='r=r(X,Y):g=g(X,Y):b=b(X,Y):a=if(not(mod(Y\,3)),255,a(X,Y))'
Makes every 3rd line more opaque.
Noise (Level 2+):
noise=alls=10:allf=t+u
Temporal noise that varies per frame.
Chromatic Aberration (Level 3):
split, offset red/green channels, overlay
Simulates lens distortion.
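Since each level adds to the one below it, the effect stack can be pictured as a filter chain that grows with the glitch level. The sketch below is an assumption about structure, not the package's code: glitchFilters is a hypothetical name, the scanline and noise strings are copied from above, and ffmpeg's rgbashift filter is used as a simple stand-in for the split/offset/overlay chromatic aberration, which this README does not spell out.

```javascript
// Hypothetical helper: build an ffmpeg filter chain for a glitch level (0-3).
const SCANLINES = "geq='r=r(X,Y):g=g(X,Y):b=b(X,Y):a=if(not(mod(Y\\,3)),255,a(X,Y))'";
const NOISE = 'noise=alls=10:allf=t+u';
// Stand-in only: shifts red and blue horizontally in opposite directions.
const CHROMATIC_STAND_IN = 'rgbashift=rh=-3:bh=3';

function glitchFilters(level) {
  const filters = [];
  if (level >= 1) filters.push(SCANLINES);          // Level 1+: scanlines
  if (level >= 2) filters.push(NOISE);              // Level 2+: temporal noise
  if (level >= 3) filters.push(CHROMATIC_STAND_IN); // Level 3: color fringing
  return filters.join(',');                         // empty string at level 0
}
```

Level 0 yields an empty chain, which matches the "Clean" setting.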
ffmpeg's overlay filter with alpha channel:
[1:v]format=yuva420p[overlay];
[0:v][overlay]overlay=x:y:shortest=1
Positions calculated dynamically based on video size.
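One way to compute those positions is to let ffmpeg do the arithmetic: the overlay filter exposes W and H (main video size) and w and h (overlay size) as expression variables. The sketch below maps the four --position values and a --margin to an overlay expression. The function name and the bottom-right default are assumptions; the resulting strings match the manual ffmpeg examples elsewhere in this README.

```javascript
// Hypothetical helper: map a corner name and margin to an ffmpeg overlay filter.
const POSITIONS = ['bottom-right', 'bottom-left', 'top-right', 'top-left'];

function overlayFilter(position = 'bottom-right', margin = 20) {
  if (!POSITIONS.includes(position)) {
    throw new Error(`Unknown position: ${position}`);
  }
  const [vertical, horizontal] = position.split('-');
  // W/H are the main video's dimensions, w/h the overlay's (ffmpeg overlay variables).
  const x = horizontal === 'left' ? `${margin}` : `W-w-${margin}`;
  const y = vertical === 'top' ? `${margin}` : `H-h-${margin}`;
  return `overlay=${x}:${y}:shortest=1`;
}
```

Because the expressions are evaluated by ffmpeg, the same string works for any background resolution.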
For best results:
# Generate a separate talking head for each speaker
caricature --audio host-audio.mp3 --character host -o host.mp4
caricature --audio guest-audio.mp3 --character guest -o guest.mp4
# Overlay both (requires manual ffmpeg)
ffmpeg -i video.mp4 -i host.mp4 -i guest.mp4 \
-filter_complex "[0:v][1:v]overlay=W-w-20:H-h-20[tmp];[tmp][2:v]overlay=20:H-h-20" \
output.mp4
Corporate/Professional:
--glitch 0 - No effects
--rotation 5 - Subtle movement
Retro/Fun:
--glitch 2 - Medium chaos
--rotation 15 - Default energy
Experimental/Art:
--glitch 3 - Maximum chaos
--rotation 45 - Wild rotation
More frames = more unique moments = more apparent "talking"
# Create frames:
# professor-closed1.jpg (mouth closed, explaining pose)
# professor-open1.jpg (mouth open, animated)
# professor-open2.jpg (mouth wider, emphasis)
# professor-open3.jpg (mouth open, different angle)
caricature \
--audio lecture.mp3 \
--character professor \
--threshold -35 \
--glitch 1 \
--rotation 10 \
--size 320 \
-o talking-prof.mp4
# Overlay on slides
caricature -O lecture-slides.mp4 -p top-right -o final-lecture.mp4
# Create frames with expressive mouth positions
# host-closed1.jpg, host-closed2.jpg
# host-open1.jpg, host-open2.jpg, host-open3.jpg
caricature \
--audio podcast-episode.mp3 \
--character host \
--threshold -38 \
--glitch 2 \
--rotation 15 \
--size 400 \
-o podcast-visual.mp4
# Sensitive threshold for dynamic narration
caricature \
--audio narration.mp3 \
--character narrator \
--threshold -40 \
--glitch 1 \
--size 256 \
-o narrator.mp4
# Overlay on main video
caricature -O main-video.mp4 -p bottom-right -m 30 -o final-video.mp4
# Create character1 (host)
caricature -a host-audio.mp3 -c character1 -t -35 -o host.mp4
# Create character2 (guest)
caricature -a guest-audio.mp3 -c character2 -t -37 -o guest.mp4
# Combine with ffmpeg (both corners)
ffmpeg -i video.mp4 -i host.mp4 -i guest.mp4 \
-filter_complex "[0:v][1:v]overlay=W-w-20:H-h-20[tmp];[tmp][2:v]overlay=20:H-h-20" \
-c:a copy final.mp4
caricature evolved from sardonic, which used pure random frame selection. The key improvements:
The original sardonic was beautiful chaos—random frames that your brain interpreted as talking. But with actual audio synchronization:
It's the best of both worlds: precision meets chaos.
The code is designed to be hackable. Want more effects?
// In generateAudioSequence(), vary threshold over time
const dynamicThreshold = this.loudnessThreshold + Math.sin(time) * 5;
const isMouthOpen = closestSample.loudness > dynamicThreshold;
// In generateAudioSequence(), occasionally repeat frames
if (Math.random() > 0.9) {
sequence.push({
...sequence[sequence.length - 1],
duration: 0.05
});
}
// In createRotatedFrame(), add a random zoom
const zoom = 100 + (Math.random() * 20 - 10); // 90-110%
args.push('-resize', `${zoom}%`);
Max Headroom (1985-1987) wasn't just a character. It was a statement about media, reality, and the future.
What made Max special:
caricature captures that spirit in miniature. A small chaos agent in your video corner, reminding viewers that media is constructed, reality is malleable, and cats in sunglasses are timeless.
"No frames found"
--frames pattern
--dir /full/path/to/frames
"ImageMagick failed"
apt install imagemagick or brew install imagemagick
"Overlay looks wrong"
--position bottom-left
--margin 50
"Not glitchy enough"
--glitch 3
--rotation 30
"Too glitchy"
--glitch 0 or --glitch 1
--rotation 5
FAQs
Caricature is a corner-dwelling reporter synchronized to audio volume
The npm package caricature receives a total of 0 weekly downloads. As such, caricature popularity was classified as not popular.
We found that caricature demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.