Git Commit Analyzer
A TypeScript/Node.js program that analyzes git commits and generates categorized
summaries using an LLM CLI such as Claude CLI.
Features
- Extract commit details (message, date, diff) from git repositories
- Categorize commits using LLM analysis into tweak, feature, or process
- Generate CSV reports with timestamp, category, summary, and description
- Generate condensed markdown reports from CSV data for stakeholder
communication
- Support for multiple LLM models (Claude, Gemini, OpenAI) with automatic
detection
- Support for batch processing multiple commits
- Automatically filters out merge commits for cleaner analysis
- Robust error handling and validation
Usage
Default Behavior
When run without arguments, the program analyzes all commits authored by the
current user:
npx commit-analyzer
npx commit-analyzer --limit 10
npx commit-analyzer --author user@example.com
Command Line Arguments
npx commit-analyzer abc123 def456 ghi789
npx commit-analyzer --output analysis.csv --limit 20
npx commit-analyzer --report --input-csv analysis.csv
npx commit-analyzer --report --limit 50
npx commit-analyzer --llm claude --limit 10
Options
-o, --output <file>: Output file (default: results/commits.csv for analysis, results/report.md for reports)
--output-dir <dir>: Output directory for CSV and report files (default: current directory)
-a, --author <email>: Filter commits by author email (defaults to current user)
-l, --limit <number>: Limit number of commits to analyze
--llm <model>: LLM model to use (claude, gemini, openai)
-r, --resume: Resume from last checkpoint if available
-c, --clear: Clear any existing progress checkpoint
--report: Generate condensed markdown report from existing CSV
--input-csv <file>: Input CSV file to read for report generation
-v, --verbose: Enable verbose logging (shows detailed error information)
--since <date>: Only analyze commits since this date (YYYY-MM-DD, '1 week ago', '2024-01-01')
--until <date>: Only analyze commits until this date (YYYY-MM-DD, '1 day ago', '2024-12-31')
--no-cache: Disable caching of analysis results
--batch-size <number>: Number of commits to process per batch (default: 1 for sequential processing)
-h, --help: Display help
-V, --version: Display version
Output Formats
CSV Output
The program generates a CSV file with the following columns:
timestamp: ISO 8601 timestamp of the commit (e.g., 2025-08-28T11:14:40.000Z)
category: One of tweak, feature, or process
summary: One-line description (max 80 characters)
description: Detailed explanation (2-3 sentences)
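As a rough illustration of the column layout above, the following sketch serializes rows into CSV text. The CommitRow type and the escapeCsvField and toCsv helpers are hypothetical names chosen for this example, and the quoting rule (RFC 4180 style) is an assumption; the tool's own writer may differ.

```typescript
// Hypothetical row shape mirroring the four documented columns.
type Category = "tweak" | "feature" | "process";

interface CommitRow {
  timestamp: string; // ISO 8601, e.g. 2025-08-28T11:14:40.000Z
  category: Category;
  summary: string; // one line, at most 80 characters
  description: string; // 2-3 sentences
}

// Quote a field when it contains a comma, a double quote, or a line break,
// doubling any embedded double quotes (assumed RFC 4180 style quoting).
function escapeCsvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Build the CSV text: one header line followed by one line per row.
function toCsv(rows: CommitRow[]): string {
  const header = "timestamp,category,summary,description";
  const lines = rows.map((row) =>
    [row.timestamp, row.category, row.summary, row.description]
      .map(escapeCsvField)
      .join(","),
  );
  return [header, ...lines].join("\n");
}
```

Quoting matters here because summaries and descriptions are free text produced by an LLM and may contain commas or quotes.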
Markdown Report Output
When using the --report option, the program generates a condensed markdown
report that:
- Groups commits by year (most recent first)
- Organizes by categories: Features, Processes, Tweaks & Bug Fixes
- Consolidates similar items for stakeholder readability
- Includes commit count statistics
- Uses professional language suitable for both technical and non-technical
audiences
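The grouping described above could be sketched as follows. This is only the grouping and counting part: the consolidation of similar items is not shown, and the heading layout, the renderReport name, and the use of UTC years are assumptions made for illustration.

```typescript
type Category = "tweak" | "feature" | "process";

// Hypothetical subset of a CSV row needed for grouping.
interface ReportRow {
  timestamp: string; // ISO 8601
  category: Category;
  summary: string;
}

// Section titles and order taken from the report description.
const SECTION_TITLES: Record<Category, string> = {
  feature: "Features",
  process: "Processes",
  tweak: "Tweaks & Bug Fixes",
};
const SECTION_ORDER: Category[] = ["feature", "process", "tweak"];

function renderReport(rows: ReportRow[]): string {
  // Bucket rows by the UTC calendar year of the commit timestamp.
  const byYear = new Map<number, ReportRow[]>();
  for (const row of rows) {
    const year = new Date(row.timestamp).getUTCFullYear();
    const bucket = byYear.get(year) ?? [];
    bucket.push(row);
    byYear.set(year, bucket);
  }

  const out: string[] = [];
  // Most recent year first.
  const years = Array.from(byYear.keys()).sort((a, b) => b - a);
  for (const year of years) {
    const yearRows = byYear.get(year) ?? [];
    out.push(`## ${year} (commits: ${yearRows.length})`);
    for (const category of SECTION_ORDER) {
      const items = yearRows.filter((r) => r.category === category);
      if (items.length === 0) continue; // skip empty sections
      out.push(`### ${SECTION_TITLES[category]}`);
      for (const item of items) out.push(`- ${item.summary}`);
    }
  }
  return out.join("\n");
}
```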
Requirements
- Node.js 18+ with TypeScript support (Bun runtime recommended)
- Git repository (must be run within a git repository)
- At least one supported LLM CLI tool:
  - Claude CLI (claude) - recommended, defaults to Sonnet model
  - Gemini CLI (gemini)
  - OpenAI CLI (codex)
- Valid git commit hashes (when specifying commits manually)
Categories
- tweak: Minor adjustments, bug fixes, small improvements
- feature: New functionality, major additions
- process: Build system, CI/CD, tooling, configuration changes
Error Handling
The program includes comprehensive error handling for:
- Invalid commit hashes
- Git repository validation
- LLM analysis failures with automatic retry
- File I/O errors
- Network connectivity issues
Resume Capability
The tool automatically:
- Saves progress checkpoints every 10 commits
- Saves immediately when a failure occurs
- Stops processing after a commit fails all retry attempts
- Exports partial results to the CSV file before exiting
If the process stops (e.g., after 139 commits due to API failure), you can
resume from where it left off:
npx commit-analyzer --resume
npx commit-analyzer --clear
npx commit-analyzer --resume
The checkpoint file (.commit-analyzer/progress.json) contains:
- List of all commits to process
- Processed commits (including failed ones, so they are skipped on resume)
- Analyzed commit data (only successful ones)
- Output file location
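To make the checkpoint contents more concrete, here is one possible shape for the file and a helper that works out what is left to do on resume. All field names (allCommits, processedCommits, analyzedCommits, outputFile) are hypothetical; the actual layout of progress.json is not documented here.

```typescript
// Hypothetical result record for one successfully analyzed commit.
interface AnalyzedCommit {
  hash: string;
  category: string;
  summary: string;
}

// Hypothetical shape of .commit-analyzer/progress.json.
interface ProgressCheckpoint {
  allCommits: string[]; // every commit hash scheduled for this run
  processedCommits: string[]; // hashes already attempted, including failures
  analyzedCommits: AnalyzedCommit[]; // successful results only
  outputFile: string; // where the CSV is written
}

// Commits still to process on resume: everything not yet attempted,
// in the original order.
function remainingCommits(checkpoint: ProgressCheckpoint): string[] {
  const attempted = new Set(checkpoint.processedCommits);
  return checkpoint.allCommits.filter((hash) => !attempted.has(hash));
}
```

Keeping failed hashes in the processed list is what lets a resumed run skip them instead of retrying them indefinitely.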
Application Data Directory
The tool creates a .commit-analyzer/ directory to store internal files:
.commit-analyzer/
├── progress.json # Progress checkpoint data
└── cache/ # Cached analysis results
├── commit-abc123.json
├── commit-def456.json
└── ...
- Progress checkpoint: Enables resuming interrupted analysis sessions
- Analysis cache: Stores LLM analysis results to avoid re-processing the same commits (TTL: 30 days)
Use --no-cache to disable caching if needed.
Use --clear to clear the cache and progress checkpoint.
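The 30-day TTL can be pictured with a small freshness check. The file naming follows the directory listing above; the CacheEntry shape, its cachedAt field, and the helper names are assumptions for illustration only.

```typescript
// 30 days expressed in milliseconds.
const CACHE_TTL_MS = 30 * 24 * 60 * 60 * 1000;

// Hypothetical cache record: the stored analysis plus when it was written.
interface CacheEntry<T> {
  cachedAt: number; // epoch milliseconds
  data: T;
}

// Path of the cache file for a commit, following the layout shown above.
function cachePathFor(hash: string): string {
  return `.commit-analyzer/cache/commit-${hash}.json`;
}

// An entry is usable while it is younger than the TTL.
function isFresh(entry: CacheEntry<unknown>, now: number = Date.now()): boolean {
  return now - entry.cachedAt < CACHE_TTL_MS;
}
```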
Date Filtering
The tool supports flexible date filtering using natural language or specific
dates:
npx commit-analyzer --since "1 week ago"
npx commit-analyzer --since "2024-01-01" --until "2024-12-31"
npx commit-analyzer --since "2024-01-01"
npx commit-analyzer --until "2024-06-30"
Date formats supported:
- Relative dates: "1 week ago", "2 months ago", "3 days ago"
- ISO dates: "2024-01-01", "2024-12-31"
- Git-style dates: Any format accepted by git log --since and git log --until
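Because the date values are handed to git, the filtering can be thought of as building a git log argument list. The sketch below is an assumption about how that might look, not the tool's actual code; LogFilter and buildGitLogArgs are hypothetical names, while the git flags themselves (--no-merges, --author, --since, --until, --max-count, --format) are standard git log options.

```typescript
// Hypothetical filter object mirroring the CLI options.
interface LogFilter {
  author?: string;
  since?: string;
  until?: string;
  limit?: number;
}

// Build the argument list for `git log`, printing one full hash per line
// and leaving out merge commits.
function buildGitLogArgs(filter: LogFilter): string[] {
  const args = ["log", "--no-merges", "--format=%H"];
  if (filter.author) args.push(`--author=${filter.author}`);
  if (filter.since) args.push(`--since=${filter.since}`);
  if (filter.until) args.push(`--until=${filter.until}`);
  if (filter.limit !== undefined) args.push(`--max-count=${filter.limit}`);
  return args;
}
```

Passing the array to execFile("git", args) from node:child_process, rather than building a shell string, avoids quoting problems with values such as "1 week ago".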
Batch Processing
Control processing speed and resource usage with batch size options:
npx commit-analyzer --batch-size 1
npx commit-analyzer --batch-size 5 --limit 100
npx commit-analyzer --batch-size 1 --limit 500
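One way to picture the batch size is as the number of commits handled together before moving on. The sketch below assumes that commits inside a batch run concurrently and that batches run one after another; this is an illustrative assumption, and chunk and processInBatches are hypothetical helpers.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("batch size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Run `worker` over all items, one batch at a time.
// Items within a batch run concurrently; batches run sequentially,
// so a batch size of 1 gives fully sequential processing.
async function processInBatches<T, R>(
  items: T[],
  size: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunk(items, size)) {
    const batchResults = await Promise.all(batch.map(worker));
    results.push(...batchResults);
  }
  return results;
}
```

Under this model, a larger batch size finishes sooner but sends more simultaneous requests, which makes rate limits more likely.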
Retry Logic
The tool includes automatic retry logic with exponential backoff for handling
API failures when processing many commits.
This is especially useful when analyzing large numbers of commits that might
trigger rate limits.
Configuration
You can configure the retry behavior using environment variables:
LLM_MAX_RETRIES: Maximum number of retry attempts (default: 3)
LLM_INITIAL_RETRY_DELAY: Initial delay between retries in milliseconds (default: 5000)
LLM_MAX_RETRY_DELAY: Maximum delay between retries in milliseconds (default: 30000)
LLM_RETRY_MULTIPLIER: Multiplier for exponential backoff (default: 2)
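To show how these four variables might interact, the sketch below reads them with the documented defaults and computes the wait before each retry. The formula (initial delay times multiplier to the power of attempt minus one, capped at the maximum) is the usual form of exponential backoff and is assumed here; RetryConfig, loadRetryConfig, and retryDelayMs are hypothetical names.

```typescript
interface RetryConfig {
  maxRetries: number;
  initialDelayMs: number;
  maxDelayMs: number;
  multiplier: number;
}

// Parse a numeric setting, falling back when it is unset, blank, or not a number.
function readNumber(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isFinite(parsed) ? parsed : fallback;
}

// Read the documented variables from an environment-like object
// (in Node.js, pass process.env).
function loadRetryConfig(env: Record<string, string | undefined>): RetryConfig {
  return {
    maxRetries: readNumber(env["LLM_MAX_RETRIES"], 3),
    initialDelayMs: readNumber(env["LLM_INITIAL_RETRY_DELAY"], 5000),
    maxDelayMs: readNumber(env["LLM_MAX_RETRY_DELAY"], 30000),
    multiplier: readNumber(env["LLM_RETRY_MULTIPLIER"], 2),
  };
}

// Delay before retry number `attempt` (1-based), capped at maxDelayMs.
// Assumed formula: initialDelayMs * multiplier^(attempt - 1).
function retryDelayMs(config: RetryConfig, attempt: number): number {
  const uncapped = config.initialDelayMs * Math.pow(config.multiplier, attempt - 1);
  return Math.min(uncapped, config.maxDelayMs);
}
```

Under this assumed formula, the defaults give waits of 5000 ms, 10000 ms, and 20000 ms for the three retries, and any further retry would be capped at 30000 ms.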
Examples
LLM_MAX_RETRIES=5 LLM_INITIAL_RETRY_DELAY=10000 npx commit-analyzer --limit 200
LLM_MAX_RETRIES=2 LLM_INITIAL_RETRY_DELAY=2000 npx commit-analyzer
LLM_MAX_RETRIES=4 LLM_INITIAL_RETRY_DELAY=15000 LLM_MAX_RETRY_DELAY=60000 npx commit-analyzer
The retry mechanism automatically:
- Retries failed API calls with increasing delays
- Shows progress and retry attempts in the console
- Continues processing remaining commits even if some fail
- Reports the total number of successful and failed commits at the end
Development
bun install
bun run dev
bun run build
bun run lint
bun run typecheck
Examples
npx commit-analyzer
npx commit-analyzer --limit 20 --output my_analysis.csv
npx commit-analyzer --author teammate@company.com --limit 50
npx commit-analyzer --limit 10
npx commit-analyzer --report --limit 100 --output yearly_analysis.csv
npx commit-analyzer --report --input-csv existing_analysis.csv --output team_report.md
npx commit-analyzer --llm gemini --limit 25
npx commit-analyzer --resume
Development
This tool requires the Bun runtime.
Install it globally:
curl -fsSL https://bun.sh/install | bash
npm install -g bun
Installation
bun install
bun run build
bun link
After linking, you can use the commit-analyzer command globally.