ccprune

Next time your Claude Code context is running low, quit Claude Code and run npx ccprune: it auto-resumes your last thread, now compacted with an intelligent rolling summary. Run it again the next time context runs low; the summaries stack, so context keeps rolling forward.

Fork of claude-prune with enhanced features: percentage-based pruning, AI summarization enabled by default, and improved UX.

Features

  • Zero-Config Default: Just run ccprune - auto-detects latest session, keeps 55K tokens (~70K after Claude Code adds system context)
  • Token-Based Pruning: Prunes based on actual token count, not message count
  • Smart Threshold: Automatically skips pruning if session is under 55K tokens
  • AI Summarization: Automatically generates a summary of pruned content (enabled by default)
  • Summary Synthesis: Re-pruning synthesizes old summary + new pruned content into one cohesive summary
  • Small Session Warning: Prompts for confirmation when auto-selecting sessions with < 5 messages
  • Safe by Default: Always preserves session summaries and metadata
  • Auto Backup: Creates timestamped backups before modifying files
  • Restore Support: Easily restore from backups with the restore command
  • Dry-Run Preview: Preview changes and summary before committing

Installation

# Using npx (Node.js)
npx ccprune

# Using bunx (Bun)
bunx ccprune

Install globally

# Using npm
npm install -g ccprune

# Using bun
bun install -g ccprune

Quick Start

  • Quit Claude Code - Press Ctrl+C or type /quit

  • Run prune from the same project directory:

    npx ccprune
    

That's it! ccprune auto-detects your latest session, prunes old messages (keeping a summary), and resumes automatically.

For fast, high-quality summarization, set up a Gemini API key:

  • Get a free key from Google AI Studio
  • Add to your shell profile (~/.zshrc or ~/.bashrc):
    export GEMINI_API_KEY=your_key
    
  • Restart your terminal or run source ~/.zshrc

With GEMINI_API_KEY set, ccprune automatically uses Gemini 2.5 Flash for fast summarization without chunking.

Note: If GEMINI_API_KEY is not set, ccprune automatically falls back to Claude Code CLI for summarization (no additional setup required).

Usage

# Zero-config: auto-detects latest session, keeps 55K tokens
ccprune

# Pick from available sessions interactively
ccprune --pick

# Explicit session ID (if you need a specific session)
ccprune <sessionId>

# Explicit token limit
ccprune --keep 55000
ccprune --keep-tokens 80000

# Subcommands
ccprune restore <sessionId> [--dry-run]

Arguments

  • sessionId: (Optional) UUID of the Claude Code session. Auto-detects latest if omitted.

Subcommands

| Subcommand | Description |
| --- | --- |
| restore <sessionId> | Restore a session from the latest backup |
| restore <sessionId> --dry-run | Preview restore without making changes |

Options

| Option | Description |
| --- | --- |
| --pick | Interactively select from available sessions |
| -n, --no-resume | Skip automatic session resume |
| --yolo | Resume with --dangerously-skip-permissions |
| --resume-model <model> | Model for resumed session (opus, sonnet, haiku, opusplan) |
| -k, --keep <number> | Number of tokens to retain (default: 55000) |
| --keep-tokens <number> | Number of tokens to retain (alias for -k) |
| --dry-run | Preview changes and summary without modifying files |
| --no-summary | Skip AI summarization of pruned messages |
| --summary-model <model> | Model for summarization (haiku, sonnet, or full name) |
| --summary-timeout <ms> | Timeout for summarization in milliseconds (default: 360000) |
| --gemini | Use Gemini 3 Pro for summarization |
| --gemini-flash | Use Gemini 2.5 Flash for summarization |
| --claude-code | Use Claude Code CLI for summarization (chunks large transcripts) |
| --prune-tools | Replace all non-protected tool outputs with placeholders |
| --prune-tools-ai | Use AI to identify which tool outputs to prune |
| --prune-tools-dedup | Deduplicate identical tool calls, keeping only the most recent |
| --prune-tools-max | Maximum savings: dedup + AI analysis combined |
| --prune-tools-keep <tools> | Comma-separated tools to never prune (default: Edit,Write,TodoWrite,TodoRead,AskUserQuestion) |
| -h, --help | Show help information |
| -V, --version | Show version number |

If no session ID is provided, auto-detects the most recently modified session. If no keep option is specified, defaults to 55,000 tokens (~70K actual context after Claude Code adds system prompt and CLAUDE.md).
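The auto-detect step can be sketched as follows. This is an illustrative helper (the name `latestSession` and the exact selection logic are assumptions, not ccprune's actual code): pick the most recently modified .jsonl file in the project's session directory.

```typescript
import { readdirSync, statSync } from "node:fs";
import { join } from "node:path";

// Hypothetical sketch: return the most recently modified *.jsonl session
// file in a project directory, or null if none exist.
function latestSession(projectDir: string): string | null {
  const files = readdirSync(projectDir)
    .filter((f) => f.endsWith(".jsonl"))
    .map((f) => ({ f, mtime: statSync(join(projectDir, f)).mtimeMs }))
    .sort((a, b) => b.mtime - a.mtime);
  return files.length > 0 ? files[0].f : null;
}
```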

Summarization priority:

  • --claude-code flag: Force Claude Code CLI (chunks transcripts >30K chars)
  • --gemini or --gemini-flash flags: Use Gemini API
  • Auto-detect: If GEMINI_API_KEY is set, uses Gemini 2.5 Flash
  • Fallback: Claude Code CLI (no API key needed)
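The priority order above can be sketched as a simple selection function. Names and types here are illustrative, not ccprune's actual API:

```typescript
// Sketch of the summarization-backend priority: explicit flags win,
// then GEMINI_API_KEY auto-detection, then the Claude Code CLI fallback.
type Backend = "claude-code" | "gemini-pro" | "gemini-flash";

function pickBackend(opts: {
  claudeCode?: boolean;
  gemini?: boolean;
  geminiFlash?: boolean;
  hasGeminiKey: boolean;
}): Backend {
  if (opts.claudeCode) return "claude-code"; // --claude-code forces the CLI
  if (opts.gemini) return "gemini-pro"; // --gemini
  if (opts.geminiFlash) return "gemini-flash"; // --gemini-flash
  if (opts.hasGeminiKey) return "gemini-flash"; // auto-detect via GEMINI_API_KEY
  return "claude-code"; // fallback, no API key needed
}
```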

Examples

# Simplest: auto-detect, prune, and resume automatically
npx ccprune

# Prune only (don't resume)
npx ccprune -n

# Resume in yolo mode (--dangerously-skip-permissions)
npx ccprune --yolo

# Resume with a specific model (e.g., Opus 4.5)
npx ccprune --resume-model opus

# Combine yolo mode with Opus
npx ccprune --yolo --resume-model opus

# Pick from available sessions interactively
npx ccprune --pick

# Keep 55K tokens (default)
npx ccprune --keep 55000

# Keep 80K tokens (less aggressive pruning)
npx ccprune --keep-tokens 80000

# Preview what would be pruned (shows summary preview too)
npx ccprune --dry-run

# Skip summarization for faster pruning
npx ccprune --no-summary

# Use Claude Code CLI with haiku model (faster/cheaper)
npx ccprune --claude-code --summary-model haiku

# Use Gemini 3 Pro for summarization
npx ccprune --gemini

# Use Gemini 2.5 Flash (default when GEMINI_API_KEY is set)
npx ccprune --gemini-flash

# Force Claude Code CLI for summarization
npx ccprune --claude-code

# Target a specific session by ID
npx ccprune 03953bb8-6855-4e53-a987-e11422a03fc6 --keep 55000

# Restore from the latest backup
npx ccprune restore 03953bb8-6855-4e53-a987-e11422a03fc6

Tool Output Pruning (Default)

Tool pruning runs automatically to reduce tokens before summarization:

  • Dedup: Identical tool calls are deduplicated (keeps only most recent)
  • AI analysis: Intelligently prunes irrelevant outputs using your summarization backend

# Default behavior (dedup + AI) - runs automatically
ccprune

# Disable automatic tool pruning
ccprune --skip-tool-pruning

# Explicit modes for specific behavior:
ccprune --prune-tools          # Simple: replace ALL outputs (no AI)
ccprune --prune-tools-dedup    # Dedup only (no AI)
ccprune --prune-tools-ai       # AI only (no dedup)
ccprune --prune-tools-max      # Explicit dedup + AI (same as default)

# Custom protected tools
ccprune --prune-tools-keep "Edit,Write,Bash"

Protected tools (never pruned by default):

  • Edit, Write - file modification context
  • TodoWrite, TodoRead - task tracking
  • AskUserQuestion - user interaction

Modes explained:

  • Default (no flags): Runs dedup first (free), then AI analysis - maximum savings
  • Simple (--prune-tools): Replaces all non-protected tool outputs with [Pruned: {tool} output - {bytes} bytes]
  • AI (--prune-tools-ai): Uses your summarization backend (Gemini or Claude Code CLI) to intelligently identify which outputs are no longer relevant
  • Dedup (--prune-tools-dedup): Keeps only the most recent output when the same tool is called with identical input. Annotates with [{total} total calls]
  • Skip (--skip-tool-pruning): Disable automatic tool pruning entirely
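The dedup mode can be sketched as below. The types, key scheme, and placeholder strings are illustrative assumptions, not ccprune's internals: when the same tool is called with identical input, only the most recent output survives (annotated with the total call count), and earlier duplicates become placeholders. Protected tools are left untouched.

```typescript
// Illustrative sketch of dedup-mode tool-output pruning.
interface ToolCall {
  tool: string;
  input: string;
  output: string;
}

function dedupToolCalls(calls: ToolCall[], protectedTools: Set<string>): ToolCall[] {
  const counts = new Map<string, number>();
  const lastIndex = new Map<string, number>();
  calls.forEach((c, i) => {
    const key = `${c.tool}\u0000${c.input}`; // tool + input identifies a duplicate
    counts.set(key, (counts.get(key) ?? 0) + 1);
    lastIndex.set(key, i);
  });
  return calls.map((c, i) => {
    if (protectedTools.has(c.tool)) return c; // never prune protected tools
    const key = `${c.tool}\u0000${c.input}`;
    if (lastIndex.get(key) !== i) {
      return { ...c, output: `[Pruned: duplicate ${c.tool} call]` };
    }
    const total = counts.get(key)!;
    return total > 1 ? { ...c, output: `${c.output} [${total} total calls]` } : c;
  });
}
```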

How It Works

BEFORE                 AFTER FIRST PRUNE           AFTER RE-PRUNE
──────                 ────────────────            ──────────────
┌───────────────┐      ┌───────────────┐           ┌───────────────┐
│ msg 1 (old)   │─┐    │ [SUMMARY]     │─┐         │ [NEW SUMMARY] │ ◄─ synthesized
│ msg 2 (old)   │ │    │ "Previously.."│ │         │ (old+middle)  │
│ ...           │ ├──► ├───────────────┤ │         ├───────────────┤
│ msg N (old)   │─┘    │ msg N+1 (kept)│ ├───────► │ msg X (kept)  │
├───────────────┤      │ msg N+2 (kept)│ │         │ msg Y (kept)  │
│ msg N+1 (new) │─────►│ msg N+3 (kept)│─┘         │ msg Z (kept)  │
│ msg N+2 (new) │      └───────────────┘           └───────────────┘
│ msg N+3 (new) │
└───────────────┘       ▲                           ▲
                        │                           │
                   old msgs become             old summary + middle
                   summary, recent kept        synthesized, recent kept
  • Locates Session File: Finds $CLAUDE_CONFIG_DIR/projects/{project-path}/{sessionId}.jsonl
  • Counts Tokens: Uses Claude's cumulative usage data from the last message: input_tokens + cache_read_input_tokens + cache_creation_input_tokens. This matches Claude Code's UI display exactly
  • Early Exit: If total tokens ≤ threshold (55K default), skips pruning and auto-resumes
  • Preserves Critical Data: Always keeps the first line (file-history-snapshot or session metadata)
  • Token-Based Cutoff: Scans right-to-left, accumulating tokens until adding the next message would exceed the threshold
  • Content Extraction: Extracts text from messages, including tool_result outputs and thinking blocks. Tool calls become [Used tool: ToolName] placeholders to provide context without verbose tool I/O
  • Orphan Cleanup: Removes tool_result blocks in kept messages that reference tool_use blocks from pruned messages
  • AI Summarization: Generates a structured summary with sections: Overview, What Was Accomplished, Files Modified, Key Technical Details, Current State & Pending Work
  • Summary Synthesis: Re-pruning synthesizes old summary + new pruned content into one cohesive summary
    • Gemini (default with API key): Handles large transcripts natively without chunking
    • Claude Code CLI (fallback): May chunk transcripts >30K characters (see Claude Code CLI Summarization below)
  • Safe Backup: Creates timestamped backup in prune-backup/ before modifying
  • Auto-Resume: Optionally resumes Claude Code session after pruning
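The token-based cutoff step can be sketched as below. This is a simplified illustration (the function name and return convention are assumptions; the real implementation also applies the lenient-boundary rule and proportional scaling described elsewhere in this README):

```typescript
// Sketch: scan from the newest message backwards, accumulating per-message
// tokens until adding the next (older) message would exceed the threshold.
// Returns the index of the first message to keep.
function cutoffIndex(messageTokens: number[], keepTokens: number): number {
  let total = 0;
  for (let i = messageTokens.length - 1; i >= 0; i--) {
    if (total + messageTokens[i] > keepTokens) {
      return i + 1; // messages before i+1 get pruned and summarized
    }
    total += messageTokens[i];
  }
  return 0; // everything fits under the threshold: nothing to prune
}
```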

Claude Code CLI Summarization

When using the --claude-code flag (or when GEMINI_API_KEY is not set), ccprune uses the Claude Code CLI for summarization with these specific behaviors:

Chunking for Large Transcripts:

  • Transcripts >30,000 characters are automatically split into chunks
  • Each chunk is summarized independently
  • Chunk summaries are then combined into a final unified summary
  • Why: Ensures reliable summarization even for very long sessions
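The chunking step can be sketched as a simple fixed-size split. This is illustrative only; ccprune's actual chunker may split on message boundaries rather than raw character offsets:

```typescript
// Sketch: split a transcript longer than 30,000 characters into pieces
// that can each be summarized independently.
const CHUNK_LIMIT = 30_000;

function chunkTranscript(text: string, limit: number = CHUNK_LIMIT): string[] {
  if (text.length <= limit) return [text];
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += limit) {
    chunks.push(text.slice(i, i + limit));
  }
  return chunks;
}
```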

Model Selection:

  • Default: Uses your Claude Code CLI default model
  • Override with --summary-model haiku or --summary-model sonnet
  • Supports full model names (e.g., claude-3-5-sonnet-20241022)

Timeout & Retries:

  • Default timeout: 360 seconds (6 minutes)
  • Override with --summary-timeout <ms>
  • Automatic retries: Up to 2 attempts on failure
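The timeout-and-retry policy can be sketched as a wrapper that races the summarization call against a deadline and retries on failure. The wrapper name and exact retry accounting are assumptions, not ccprune's actual code:

```typescript
// Sketch: run an async call with a deadline, retrying up to `retries`
// additional times if it fails or times out.
async function withRetries<T>(
  fn: () => Promise<T>,
  timeoutMs: number,
  retries: number = 2,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await Promise.race([
        fn(),
        new Promise<never>((_, reject) =>
          setTimeout(
            () => reject(new Error(`summarization timed out after ${timeoutMs}ms`)),
            timeoutMs,
          ),
        ),
      ]);
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError;
}
```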

When to Use:

  • No API key required (uses existing Claude Code subscription)
  • Handles extremely large transcripts via chunking
  • Works offline (if Claude Code CLI works offline)

Trade-offs:

  • Slower than Gemini API (spawns subprocess)
  • Chunking may lose some context coherence for very large sessions
  • Requires Claude Code CLI to be installed and authenticated

File Structure

Claude Code stores sessions in:

~/.claude/projects/{project-path-with-hyphens}/{sessionId}.jsonl

For example, a project at /Users/alice/my-app becomes:

~/.claude/projects/-Users-alice-my-app/{sessionId}.jsonl

Environment Variables

CLAUDE_CONFIG_DIR

By default, ccprune looks for session files in ~/.claude. If Claude Code is configured to use a different directory, you can specify it with the CLAUDE_CONFIG_DIR environment variable:

CLAUDE_CONFIG_DIR=/custom/path/to/claude ccprune

GEMINI_API_KEY

When set, ccprune automatically uses Gemini 2.5 Flash for summarization (recommended). Get your free API key from Google AI Studio.

export GEMINI_API_KEY=your_api_key_here
ccprune  # automatically uses Gemini 2.5 Flash

Use --gemini for Gemini 3 Pro, or --claude-code to force Claude Code CLI.

Migrating from claude-prune

If you are coming from the original claude-prune package, the CLI has changed across major versions:

# claude-prune v1.x (message-count based, summary was opt-in)
claude-prune <id> -k 10 --summarize-pruned

# ccprune v2.x (percentage-based, summary enabled by default)
ccprune <id>                    # defaults to 20% of messages
ccprune <id> --keep-percent 25  # keep latest 25% of messages

# ccprune v3.x (token-based, summary enabled by default)
ccprune <id>                    # defaults to 55K tokens
ccprune <id> -k 55000           # keep 55K tokens
ccprune <id> --keep-tokens 80000 # keep 80K tokens

Key changes in v3.x:

  • Token-based pruning: -k now means tokens, not message count
  • Removed: -p, --keep-percent flag (replaced by token-based approach)
  • Auto-skip: Sessions under 55K tokens are not pruned
  • Lenient boundary: Includes one extra message at the boundary to preserve context
  • Summary is enabled by default (use --no-summary to disable)
  • Re-pruning synthesizes old summary + new pruned content into one summary

Key changes in v4.x:

  • Accurate token counting: Uses Claude's cumulative usage data (input_tokens + cache_read + cache_creation) to match Claude Code UI
  • Proportional scaling: Per-message tokens are scaled to match total context for accurate pruning
  • --resume-model: Specify which model to use when auto-resuming (opus, sonnet, haiku, opusplan)
  • 55K default: Results in ~70K total context after Claude Code adds system prompt, CLAUDE.md, and other overhead
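The proportional-scaling change can be sketched as below. This is an illustrative helper (name and rounding behavior are assumptions): per-message token estimates are scaled so their sum matches the cumulative total reported in Claude's usage data.

```typescript
// Sketch: scale per-message token estimates so they sum to the actual
// context total, keeping each message's relative share.
function scaleTokens(perMessage: number[], actualTotal: number): number[] {
  const estimated = perMessage.reduce((a, b) => a + b, 0);
  if (estimated === 0) return perMessage; // avoid division by zero
  const factor = actualTotal / estimated;
  return perMessage.map((t) => Math.round(t * factor));
}
```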

Development

# Clone and install
git clone https://github.com/nicobailon/claude-prune.git
cd claude-prune
bun install

# Run tests
bun run test

# Build
bun run build

# Test locally
./dist/index.js --help

Credits

This project is a fork of claude-prune by Danny Aziz. Thanks for the original implementation!

License

MIT


Package last updated on 12 Dec 2025

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts