CLI Whisperer

A voice-to-text terminal user interface (TUI) application that uses OpenAI's Whisper for speech recognition and GPT for text formatting. It provides a responsive interface with multi-format export, Spotify integration, and configurable recording controls.
Features
Audio & Recording
- High-quality audio recording with configurable duration (15s - 5min+)
- Real-time audio level meter with waveform visualization
- Adjustable recording controls with preset duration buttons
- Graceful recording management with manual stop capability
- Minimum recording length validation for quality assurance
AI-Powered Transcription
- OpenAI Whisper integration for accurate speech-to-text
- Multiple Whisper model support (tiny, base, small, medium, large)
- Intelligent text formatting with OpenAI GPT models
- Dual transcription modes - raw and AI-enhanced text
- Comprehensive error handling with fallback mechanisms
Modern TUI Interface
- 8 professional themes (EDM Synthwave, Cyberpunk, Marc Anthony, Professional, etc.)
- Responsive design optimized for all terminal sizes
- Tabbed interface with smooth navigation
- Real-time status updates and progress indicators
- Pulse animations and visual feedback systems
Spotify Integration
- Playback control (play/pause, next/previous, shuffle, repeat)
- Real-time status display with track information
- Interactive controls directly in the TUI
- Smart auto-pause during recording sessions
Advanced Export System
- 6 export formats: TXT, Markdown, JSON, CSV, DOCX, PDF
- Batch export capabilities for all transcriptions
- Filtering options by date, directory, and text content
- Metadata inclusion with timestamps and file paths
- Custom output locations and file naming
Comprehensive Keyboard Shortcuts
- 38 keyboard shortcuts for all major functions
- Power-user optimized workflow
- Intuitive key bindings following standard conventions
- Context-sensitive help system
File Management
- Intelligent file organization with automatic rotation
- History tracking with searchable database
- Directory-aware storage with working directory tracking
- Automatic cleanup of old files
- Backup and recovery systems
Installation
Prerequisites
- Python 3.10+ (required for OpenAI Whisper compatibility)
- pip or uv package manager
- OpenAI API key (optional, for text formatting)
- Microphone access for recording
- Spotify CLI (optional, for music integration)
Quick Install with UV (Recommended)
git clone https://github.com/VinnyVanGogh/cli-whisperer.git
cd cli-whisperer
uv pip install -e .
Install with Pip
git clone https://github.com/VinnyVanGogh/cli-whisperer.git
cd cli-whisperer
python -m venv venv
source venv/bin/activate
pip install -e .
System Dependencies
# macOS (Homebrew)
brew install portaudio
# Debian/Ubuntu
sudo apt-get install portaudio19-dev python3-pyaudio
Quick Start
1. Basic Recording
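# Start with default settings
cli-whisperer
# Record for 120 seconds with OpenAI text formatting enabled
cli-whisperer --duration 120 --format
# Record once and exit
cli-whisperer --once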
2. TUI Mode
# Launch the interactive TUI
cli-whisperer --tui
# Launch the TUI with the Professional theme
cli-whisperer --tui --theme professional
3. Configuration
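# Set the OpenAI API key (only needed for text formatting)
export OPENAI_API_KEY="your-api-key-here"
# Save transcripts to a custom directory
cli-whisperer --output-dir ~/Documents/transcripts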
Usage
Command Line Interface
cli-whisperer [OPTIONS]
Options:
  --tui                   Launch interactive TUI mode
  --once                  Record once and exit
  -d, --duration SECONDS  Recording duration (default: 120)
  -min, --minutes MIN     Recording duration in minutes
  --format                Enable OpenAI text formatting
  --no-format             Disable OpenAI text formatting
  --model MODEL           Whisper model (tiny/base/small/medium/large)
  --openai-model MODEL    OpenAI model for formatting
  --theme THEME           TUI theme selection
  --output-dir PATH       Custom output directory
  --cleanup-days DAYS     Days to keep old files (default: 7)
  --debug                 Enable debug logging
  --help                  Show help message
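For readers curious how options like these are typically wired up, the following is a minimal argparse sketch covering a subset of them. It is not the project's actual parser: `build_parser` and `resolve_duration` are hypothetical names, the `base` model default is an assumption, and the rule that `--minutes` overrides `--duration` is also an assumption.

```python
import argparse

WHISPER_MODELS = ("tiny", "base", "small", "medium", "large")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring a subset of the documented options."""
    parser = argparse.ArgumentParser(prog="cli-whisperer")
    parser.add_argument("--tui", action="store_true", help="launch interactive TUI mode")
    parser.add_argument("--once", action="store_true", help="record once and exit")
    parser.add_argument("-d", "--duration", type=int, default=120, metavar="SECONDS")
    parser.add_argument("-min", "--minutes", type=float, default=None, metavar="MIN")
    # --format and --no-format share one destination; None means "not specified".
    fmt = parser.add_mutually_exclusive_group()
    fmt.add_argument("--format", dest="format", action="store_true", default=None)
    fmt.add_argument("--no-format", dest="format", action="store_false", default=None)
    parser.add_argument("--model", choices=WHISPER_MODELS, default="base")  # assumed default
    parser.add_argument("--cleanup-days", type=int, default=7, metavar="DAYS")
    return parser


def resolve_duration(args: argparse.Namespace) -> int:
    """Return the recording length in seconds (assumes --minutes wins if given)."""
    if args.minutes is not None:
        return int(round(args.minutes * 60))
    return args.duration
```

Keeping `--format` and `--no-format` in a mutually exclusive group lets the parser reject contradictory invocations instead of silently picking one.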
TUI Mode Features
Recording Controls
- Record Button: Start recording session
- Stop Button: End recording early
- Duration Controls: Adjust recording time (±15s increments)
- Preset Buttons: Quick duration selection (30s, 1m, 2m, 5m)
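The duration controls above amount to stepping a value in 15-second increments and jumping to fixed presets. A minimal sketch follows; the function names are hypothetical, and the 15-second lower bound and 10-minute upper bound are assumptions, since the source only states a range of "15s - 5min+".

```python
STEP_SECONDS = 15
MIN_SECONDS = 15    # assumed lower bound
MAX_SECONDS = 600   # assumed upper bound; the real limit is not documented
PRESETS = {1: 30, 2: 60, 3: 120, 4: 300}  # preset keys 1-4: 30s, 1m, 2m, 5m


def adjust_duration(current: int, steps: int) -> int:
    """Move the duration by `steps` increments of 15s, clamped to the bounds."""
    return max(MIN_SECONDS, min(MAX_SECONDS, current + steps * STEP_SECONDS))


def preset_duration(key: int) -> int:
    """Return the preset duration in seconds for preset key 1-4."""
    return PRESETS[key]
```

For example, `adjust_duration(120, 1)` yields 135 seconds, and stepping down from 15 seconds stays at 15.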
Real-time Feedback
- Audio Level Meter: Visual waveform with color coding
- Progress Bar: Recording countdown with time remaining
- Status Panel: Current mode and session information
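A level meter of this kind is commonly driven by the RMS of the most recent audio block, converted to decibels relative to full scale. The sketch below shows one way to do that with the standard library only. It is an illustration rather than the application's code: all names are hypothetical, samples are assumed to be floats in [-1.0, 1.0], and the -60 dB floor and color thresholds are assumptions.

```python
import math
from typing import Sequence

FLOOR_DB = -60.0  # assumed silence floor for the meter


def rms_level(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of a block of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def to_dbfs(rms: float, floor_db: float = FLOOR_DB) -> float:
    """Convert an RMS amplitude to dBFS, never going below the floor."""
    if rms <= 0.0:
        return floor_db
    return max(floor_db, 20.0 * math.log10(rms))


def meter_bar(samples: Sequence[float], width: int = 20, floor_db: float = FLOOR_DB) -> str:
    """Render a fixed-width text bar for the block's level."""
    db = to_dbfs(rms_level(samples), floor_db)
    fraction = (db - floor_db) / (0.0 - floor_db)
    filled = max(0, min(width, round(fraction * width)))
    return "#" * filled + "-" * (width - filled)


def level_color(db: float) -> str:
    """Pick a color name for a level (assumed thresholds)."""
    if db >= -3.0:
        return "red"
    if db >= -12.0:
        return "yellow"
    return "green"
```

Using a logarithmic scale keeps quiet speech visible on the meter, where a linear amplitude bar would barely move.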
Text Management
- Tabbed Previews: Switch between raw and AI-formatted text
- Copy Functions: One-click copying to clipboard
- Edit Integration: Direct Neovim editing support
Keyboard Shortcuts
Core Actions
Key | Action | Description |
R | Record | Start recording |
S | Stop | Stop recording |
Space | Toggle Recording | Start/stop recording |
Q / Escape | Quit | Exit application |
Navigation
Key | Action | Description |
Tab / Shift+Tab | Navigate Tabs | Switch between tabs |
H | History | Show history tab |
T | Themes | Show themes tab |
F1 / ? | Help | Show help dialog |
Duration Controls
Key | Action | Description |
+ / - | Adjust Duration | Increase/decrease by 15s |
1 - 4 | Duration Presets | Set 30s, 1m, 2m, 5m |
Copy Operations
Key | Action | Description |
C | Copy AI Text | Copy formatted transcription |
Ctrl+C | Copy Raw Text | Copy original transcription |
Ctrl+A | Enhanced Copy | Copy with preview |
Ctrl+Shift+A | Copy All | Copy all transcriptions |
Spotify Controls
Key | Action | Description |
Ctrl+P | Play/Pause | Toggle playback |
Ctrl+N / Ctrl+B | Next/Previous | Track navigation |
Ctrl+S | Toggle Panel | Show/hide Spotify panel |
Ctrl+Shift+S | Shuffle | Toggle shuffle mode |
Ctrl+Shift+R | Repeat | Toggle repeat mode |
File Operations
Key | Action | Description |
Ctrl+E | Export | Export current transcription |
Ctrl+Shift+E | Export All | Export all transcriptions |
Ctrl+O | Open Directory | Open transcript folder |
Ctrl+D | Clean Files | Delete old files |
Advanced Features
Key | Action | Description |
F2 | Toggle Debug | Enable/disable debug mode |
F3 | Toggle Audio Meter | Show/hide audio meter |
F4 | Compact Mode | Toggle compact layout |
F5 | Refresh | Refresh interface |
Ctrl+R | Reload Config | Reload configuration |
Ctrl+Shift+T | Switch Theme | Cycle through themes |
Configuration
Environment Variables
export OPENAI_API_KEY="sk-your-api-key-here"
export OPENAI_MODEL="gpt-4"
export CLI_WHISPERER_OUTPUT_DIR="$HOME/Documents/transcripts"
export CLI_WHISPERER_THEME="professional"
export CLI_WHISPERER_DEBUG="false"
export CLI_WHISPERER_DURATION="120"
export CLI_WHISPERER_MODEL="base"
export CLI_WHISPERER_MIN_LENGTH="1.0"
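To show how variables like these can be consumed, here is a minimal sketch that reads them into a typed settings object. It is an assumption-laden illustration, not the project's `config.py`: `Settings` and `load_settings` are hypothetical names, and the fallback defaults are simply the example values shown above.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    output_dir: Path
    theme: str
    debug: bool
    duration: int
    model: str
    min_length: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables (defaults are assumptions)."""
    truthy = {"1", "true", "yes", "on"}
    return Settings(
        # expanduser() handles a literal "~" that the shell left unexpanded.
        output_dir=Path(env.get("CLI_WHISPERER_OUTPUT_DIR", "~/Documents/transcripts")).expanduser(),
        theme=env.get("CLI_WHISPERER_THEME", "professional"),
        debug=env.get("CLI_WHISPERER_DEBUG", "false").strip().lower() in truthy,
        duration=int(env.get("CLI_WHISPERER_DURATION", "120")),
        model=env.get("CLI_WHISPERER_MODEL", "base"),
        min_length=float(env.get("CLI_WHISPERER_MIN_LENGTH", "1.0")),
    )
```

Accepting the environment as a mapping argument makes the loader testable without mutating `os.environ`.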
Configuration Files
The application uses the following configuration structure:
~/.config/cli-whisperer/
├── config.yaml # Main configuration
├── themes/ # Custom themes
│ ├── custom.css
│ └── user-theme.css
└── history/ # History database
    ├── history.json
    └── backups/
Custom Themes
Create custom themes by extending the base theme system:
:root {
    --primary-color: #your-color;
    --secondary-color: #your-color;
    --accent-color: #your-color;
    --background-color: #your-color;
}
RecordingControls {
    background: var(--background-color);
    border: solid var(--primary-color);
}
Export Functionality
Supported Formats
Format | Extension | Description | Metadata |
Plain Text | .txt | Simple text format | Optional |
Markdown | .md | Formatted with headers | Full |
JSON | .json | Structured data | Complete |
CSV | .csv | Spreadsheet compatible | Basic |
Word Document | .docx | Microsoft Word | Full |
PDF | .pdf | Portable document | Complete |
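The text-based formats in the table can be produced with the standard library alone. The sketch below renders one transcription record as TXT, Markdown, JSON, or CSV. It does not reflect the actual `ExportManager`: `render` and the record keys (`timestamp`, `model`, `text`) are hypothetical, and DOCX and PDF are omitted because they would need third-party libraries.

```python
import csv
import io
import json

FIELDS = ["timestamp", "model", "text"]  # assumed record keys


def render(record: dict, fmt: str) -> str:
    """Render one transcription record in a text-based export format."""
    if fmt == "txt":
        return record["text"] + "\n"
    if fmt == "md":
        return (
            "# Transcription\n\n"
            f"- Timestamp: {record['timestamp']}\n"
            f"- Model: {record['model']}\n\n"
            f"{record['text']}\n"
        )
    if fmt == "json":
        return json.dumps(record, indent=2, ensure_ascii=False)
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
        writer.writeheader()
        writer.writerow({key: record[key] for key in FIELDS})
        return buffer.getvalue()
    raise ValueError(f"unsupported format: {fmt}")
```

Going through `csv.DictWriter` rather than joining strings by hand means commas and quotes inside the transcription are escaped correctly.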
Export Options
Content Selection
- Raw transcription text
- AI-formatted text
- Timestamps and metadata
- File paths and working directory
- Recording duration and model info
Filtering (History Export)
- Date Range: Export transcriptions from specific time periods
- Directory Filter: Export only from specific working directories
- Text Search: Export transcriptions containing specific keywords
- Model Filter: Export by Whisper model used
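The four filters above combine naturally as optional predicates over history entries. Here is a minimal sketch under stated assumptions: `Entry` and `filter_entries` are hypothetical names, the date range is treated as inclusive, directory and model are matched exactly, and the text search is a case-insensitive substring match.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass
class Entry:
    timestamp: datetime
    directory: str
    model: str
    text: str


def filter_entries(
    entries: Iterable[Entry],
    *,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
    directory: Optional[str] = None,
    keyword: Optional[str] = None,
    model: Optional[str] = None,
) -> list[Entry]:
    """Keep entries that satisfy every filter that was supplied."""
    result: list[Entry] = []
    for entry in entries:
        if start is not None and entry.timestamp < start:
            continue
        if end is not None and entry.timestamp > end:
            continue
        if directory is not None and entry.directory != directory:
            continue
        if keyword is not None and keyword.lower() not in entry.text.lower():
            continue
        if model is not None and entry.model != model:
            continue
        result.append(entry)
    return result
```

Filters left as `None` are ignored, so calling the function with no keyword arguments returns every entry.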
Export Types
- Ctrl+E: Export the current transcription
- Ctrl+Shift+E: Export all transcriptions
Themes
Built-in Themes
Theme | Description | Colors |
EDM Synthwave | Retro neon aesthetic | Hot pink, electric cyan, yellow |
EDM Cyberpunk | Futuristic dark theme | Cyan, green, deep pink |
EDM Trance | Clean electronic look | Blue, purple, white |
Marc Anthony | Elegant gold theme | Platinum, champagne, rose gold |
Professional | Business-friendly | Blue, gray, green |
Dark Minimal | Clean dark interface | White, gray, blue |
Neon Noir | High contrast neon | Pink, cyan, yellow |
Retro Wave | 80s inspired | Pink, purple, orange |
Theme Switching
- Command line: cli-whisperer --tui --theme professional
- T: Show the Themes tab in the TUI
- Ctrl+Shift+T: Cycle through themes
Development
Project Structure
cli-whisperer/
├── src/cli_whisperer/
│ ├── core/ # Core functionality
│ │ ├── audio_recorder.py # Audio recording and processing
│ │ ├── transcriber.py # Whisper integration
│ │ ├── formatter.py # OpenAI text formatting
│ │ └── file_manager.py # File operations
│ ├── integrations/ # External integrations
│ │ ├── spotify_control.py # Spotify API integration
│ │ └── clipboard.py # System clipboard
│ ├── ui/ # User interface
│ │ ├── textual_app.py # Main TUI application
│ │ ├── themes.py # Theme system
│ │ ├── export_dialog.py # Export dialogs
│ │ └── edit_manager.py # Neovim integration
│ ├── utils/ # Utilities
│ │ ├── config.py # Configuration management
│ │ ├── logger.py # Logging system
│ │ ├── history.py # History management
│ │ └── export_manager.py # Export functionality
│ ├── cli.py # CLI interface
│ └── main.py # Entry point
├── tests/ # Test suite
│ ├── test_export_manager.py
│ └── ...
├── pyproject.toml # Project configuration
└── README.md # This file
Development Setup
git clone https://github.com/VinnyVanGogh/cli-whisperer.git
cd cli-whisperer
python -m venv venv
source venv/bin/activate
pip install -e ".[dev]"
pre-commit install
Running Tests
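# Run the full test suite
pytest
# Run with a coverage report
pytest --cov=src/cli_whisperer
# Run a single test file
pytest tests/test_export_manager.py
# Run with verbose output
pytest -v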
Code Quality
# Format code
black src/ tests/
# Type-check
mypy src/cli_whisperer
# Lint
flake8 src/ tests/
# Run all pre-commit hooks
pre-commit run --all-files
API Reference
Core Classes
CLIApplication
Main application orchestrator that coordinates all components.
from cli_whisperer.cli import CLIApplication
app = CLIApplication(
    duration=120,
    format_enabled=True,
    model="base",
    output_dir="./transcripts"
)
app.run()
AudioRecorder
Handles audio recording with real-time level monitoring.
from cli_whisperer.core.audio_recorder import AudioRecorder
recorder = AudioRecorder(
    duration=60,
    sample_rate=16000,
    channels=1
)
audio_data = recorder.record()
WhisperTranscriber
Manages Whisper model loading and transcription.
from cli_whisperer.core.transcriber import WhisperTranscriber
transcriber = WhisperTranscriber(model="base")
text = transcriber.transcribe(audio_data)
ExportManager
Handles multi-format export functionality.
from cli_whisperer.utils.export_manager import ExportManager, ExportFormat
manager = ExportManager()
manager.export_transcription(
    text="Hello world",
    format=ExportFormat.MARKDOWN,
    output_path="output.md"
)
Integration Points
Spotify Integration
from cli_whisperer.integrations.spotify_control import SpotifyController
spotify = SpotifyController()
if spotify.is_available():
    spotify.play()
    status = spotify.get_status()
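The auto-pause behavior during recording can be expressed as a context manager that pauses playback on entry and resumes it on exit. This is a sketch only: the `Player` protocol and its `is_playing`, `pause`, and `play` methods are assumptions made for illustration, and apart from `play()` they are not confirmed to exist on `SpotifyController`.

```python
from contextlib import contextmanager
from typing import Iterator, Protocol


class Player(Protocol):
    """Assumed minimal interface for a playback controller."""

    def is_playing(self) -> bool: ...
    def pause(self) -> None: ...
    def play(self) -> None: ...


@contextmanager
def paused_during_recording(player: Player) -> Iterator[None]:
    """Pause playback for the duration of the block, then restore it."""
    was_playing = player.is_playing()
    if was_playing:
        player.pause()
    try:
        yield
    finally:
        # Resume only if music was playing before the recording started.
        if was_playing:
            player.play()
```

The `finally` clause means playback resumes even if recording raises, and music that was already paused stays paused.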
Theme System
from cli_whisperer.ui.themes import ThemeManager
theme_manager = ThemeManager()
theme_manager.set_theme("professional")
css = theme_manager.get_current_theme().css
Troubleshooting
Common Issues
Audio Recording Problems
python -c "import sounddevice as sd; print(sd.query_devices())"
OpenAI API Issues
echo $OPENAI_API_KEY
python -c "import openai; print(openai.models.list())"
Whisper Model Loading
rm -rf ~/.cache/whisper
python -c "import whisper; whisper.load_model('base')"
Debug Mode
Enable debug logging for detailed troubleshooting:
- Command-line flag: cli-whisperer --debug
- Environment variable: export CLI_WHISPERER_DEBUG=true
- In the TUI: press F2 to toggle debug mode
Performance Optimization
For Low-End Systems
cli-whisperer --model tiny
cli-whisperer --duration 30
cli-whisperer --no-format
For High-End Systems
cli-whisperer --model large
cli-whisperer --format --tui --theme professional
Log Files
Check log files for detailed error information:
tail -f ~/.local/share/cli-whisperer/logs/cli-whisperer.log
tail -f ~/.local/share/cli-whisperer/logs/debug.log
Contributing
We welcome contributions! Please see our Contributing Guide for details.
Development Process
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes following the code style guidelines
- Add tests for your changes
- Ensure all tests pass (pytest)
- Update documentation if needed
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Code Style Guidelines
- Follow PEP 8 Python style guide
- Use type hints for all functions and methods
- Write docstrings in Google style
- Keep functions under 50 lines when possible
- Maintain test coverage above 90%
Issue Reports
When reporting issues, please include:
- Python version and operating system
- Complete error messages and stack traces
- Steps to reproduce the issue
- Expected vs actual behavior
- Log files if applicable
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- OpenAI for the Whisper and GPT models
- Textual for the excellent TUI framework
- Python Community for the amazing ecosystem
- All contributors who have helped improve this project
Support
Made with ❤️ by VinnyVanGogh
Transforming voice to text with style and intelligence