cli-whisperer

Voice to Text Tool with Smart File Management and OpenAI Formatting

Version: 1.0.0 (PyPI, installable with pip)
Maintainers: 1

CLI Whisperer


A voice-to-text terminal user interface (TUI) application that combines OpenAI's Whisper for speech recognition with GPT models for text formatting. It provides a responsive interface with multi-format export, Spotify integration, and recording controls.

Features

Audio & Recording

  • High-quality audio recording with configurable duration (15s - 5min+)
  • Real-time audio level meter with waveform visualization
  • Adjustable recording controls with preset duration buttons
  • Graceful recording management with manual stop capability
  • Minimum recording length validation for quality assurance
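
The minimum-length check can be sketched as follows. This is a minimal illustration, not the package's implementation: the function names are hypothetical, and the 16 kHz sample rate and 1.0 s threshold are assumptions that mirror the `sample_rate=16000` and `CLI_WHISPERER_MIN_LENGTH="1.0"` values shown elsewhere in this README.

```python
def recording_seconds(num_samples: int, sample_rate: int) -> float:
    """Duration of a mono recording in seconds."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


def is_long_enough(num_samples: int, sample_rate: int = 16000,
                   min_seconds: float = 1.0) -> bool:
    """Return True when the recording meets the minimum length.

    Hypothetical helper: the defaults are assumptions, not confirmed
    values from the cli-whisperer source.
    """
    return recording_seconds(num_samples, sample_rate) >= min_seconds
```

A recording rejected by this check would typically be discarded before it reaches the transcription step.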

AI-Powered Transcription

  • OpenAI Whisper integration for accurate speech-to-text
  • Multiple Whisper model support (tiny, base, small, medium, large)
  • Intelligent text formatting with OpenAI GPT models
  • Dual transcription modes - raw and AI-enhanced text
  • Comprehensive error handling with fallback mechanisms
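
One plausible shape for the fallback between the two transcription modes is sketched below: if formatting fails for any reason, the raw transcription is returned unchanged. The name `format_with_fallback` and the callable-based interface are assumptions made for illustration; the package's real formatter may behave differently.

```python
from typing import Callable


def format_with_fallback(raw_text: str,
                         formatter: Callable[[str], str]) -> tuple[str, bool]:
    """Return (text, formatted_ok), falling back to raw_text on failure.

    `formatter` stands in for any function that sends text to a
    formatting model; it is a hypothetical interface.
    """
    try:
        formatted = formatter(raw_text)
    except Exception:
        # Covers network errors, a missing API key, rate limits, and so on.
        return raw_text, False
    # Treat an empty or whitespace-only result as a failure as well.
    if not formatted or not formatted.strip():
        return raw_text, False
    return formatted, True
```

Returning the success flag alongside the text lets the caller show the raw tab only, rather than two identical previews.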

Modern TUI Interface

  • 8 professional themes (EDM Synthwave, Cyberpunk, Marc Anthony, Professional, etc.)
  • Responsive design optimized for all terminal sizes
  • Tabbed interface with smooth navigation
  • Real-time status updates and progress indicators
  • Pulse animations and visual feedback systems

Spotify Integration

  • Playback control (play/pause, next/previous, shuffle, repeat)
  • Real-time status display with track information
  • Interactive controls directly in the TUI
  • Smart auto-pause during recording sessions
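
Auto-pause during recording could be modelled as a context manager that pauses playback on entry and resumes it on exit only if music was playing beforehand. The `Player` protocol and its `is_playing` method are hypothetical; only `play()` appears in this README's API examples.

```python
from contextlib import contextmanager
from typing import Iterator, Protocol


class Player(Protocol):
    """Hypothetical minimal player interface used by this sketch."""

    def is_playing(self) -> bool: ...
    def pause(self) -> None: ...
    def play(self) -> None: ...


@contextmanager
def paused_during_recording(player: Player) -> Iterator[None]:
    """Pause playback for the duration of the block, then resume it."""
    was_playing = player.is_playing()
    if was_playing:
        player.pause()
    try:
        yield
    finally:
        # Resume only if this context manager was the one that paused.
        if was_playing:
            player.play()
```

The `finally` clause ensures playback resumes even when recording raises an exception.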

Advanced Export System

  • 6 export formats: TXT, Markdown, JSON, CSV, DOCX, PDF
  • Batch export capabilities for all transcriptions
  • Filtering options by date, directory, and text content
  • Metadata inclusion with timestamps and file paths
  • Custom output locations and file naming

Comprehensive Keyboard Shortcuts

  • 38 keyboard shortcuts for all major functions
  • Power-user optimized workflow
  • Intuitive key bindings following standard conventions
  • Context-sensitive help system

File Management

  • Intelligent file organization with automatic rotation
  • History tracking with searchable database
  • Directory-aware storage with working directory tracking
  • Automatic cleanup of old files
  • Backup and recovery systems

Table of Contents

  • Installation
  • Quick Start
  • Usage
  • Keyboard Shortcuts
  • Configuration
  • Export Functionality
  • Themes
  • Development
  • API Reference
  • Troubleshooting
  • Contributing
  • License

Installation

Prerequisites

  • Python 3.10+ (required for OpenAI Whisper compatibility)
  • pip or uv package manager
  • OpenAI API key (optional, for text formatting)
  • Microphone access for recording
  • Spotify CLI (optional, for music integration)

Install with uv

# Clone the repository
git clone https://github.com/VinnyVanGogh/cli-whisperer.git
cd cli-whisperer

# Create a virtual environment (uv pip needs one unless --system is passed)
uv venv

# Install in editable mode with uv (fastest method)
uv pip install -e .

Install with Pip

# Clone the repository
git clone https://github.com/VinnyVanGogh/cli-whisperer.git
cd cli-whisperer

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -e .

System Dependencies

# macOS
brew install portaudio

# Ubuntu/Debian
sudo apt-get install portaudio19-dev python3-pyaudio

# Windows
# Install Visual Studio Build Tools
# PortAudio will be installed automatically

Quick Start

1. Basic Recording

# Start CLI Whisperer
cli-whisperer

# Record for 2 minutes with OpenAI formatting
cli-whisperer --duration 120 --format

# Record once and exit
cli-whisperer --once

2. TUI Mode

# Launch the interactive TUI
cli-whisperer --tui

# TUI with specific theme
cli-whisperer --tui --theme professional

3. Configuration

# Set up OpenAI API key
export OPENAI_API_KEY="your-api-key-here"

# Configure output directory
cli-whisperer --output-dir ~/Documents/transcripts

Usage

Command Line Interface

cli-whisperer [OPTIONS]

Options:
  --tui                   Launch interactive TUI mode
  --once                  Record once and exit
  -d, --duration SECONDS  Recording duration (default: 120)
  -min, --minutes MIN     Recording duration in minutes
  --format                Enable OpenAI text formatting
  --no-format             Disable OpenAI text formatting
  --model MODEL           Whisper model (tiny/base/small/medium/large)
  --openai-model MODEL    OpenAI model for formatting
  --theme THEME           TUI theme selection
  --output-dir PATH       Custom output directory
  --cleanup-days DAYS     Days to keep old files (default: 7)
  --debug                 Enable debug logging
  --help                  Show help message
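
For readers who want to see how these options could map onto a parser, here is a minimal `argparse` sketch covering a subset of the flags above. It is not taken from the package source: the default for `--format`, the absence of a default for `--model`, and the rule that `--minutes` overrides `--duration` are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a subset of the documented options (sketch)."""
    parser = argparse.ArgumentParser(prog="cli-whisperer")
    parser.add_argument("--tui", action="store_true",
                        help="Launch interactive TUI mode")
    parser.add_argument("--once", action="store_true",
                        help="Record once and exit")
    parser.add_argument("-d", "--duration", type=int, default=120,
                        metavar="SECONDS", help="Recording duration")
    parser.add_argument("-min", "--minutes", type=float, metavar="MIN",
                        help="Recording duration in minutes")
    # --format and --no-format write to the same destination.
    parser.add_argument("--format", dest="format", action="store_true")
    parser.add_argument("--no-format", dest="format", action="store_false")
    parser.set_defaults(format=False)  # assumed default
    parser.add_argument("--model",
                        choices=["tiny", "base", "small", "medium", "large"])
    parser.add_argument("--cleanup-days", type=int, default=7, metavar="DAYS")
    return parser


def effective_duration(args: argparse.Namespace) -> int:
    """Seconds to record; --minutes wins when given (assumed precedence)."""
    if args.minutes is not None:
        return int(args.minutes * 60)
    return args.duration
```

With this sketch, `build_parser().parse_args(["-min", "1.5"])` yields an effective duration of 90 seconds.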

TUI Mode Features

Recording Controls

  • Record Button: Start recording session
  • Stop Button: End recording early
  • Duration Controls: Adjust recording time (±15s increments)
  • Preset Buttons: Quick duration selection (30s, 1m, 2m, 5m)
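
The duration controls amount to simple arithmetic on a number of seconds. The sketch below assumes a 15-second floor (taken from the "15s - 5min+" range in the feature list) and no upper limit; the names are hypothetical, and the preset keys follow the 1 - 4 shortcuts documented under Keyboard Shortcuts.

```python
STEP_SECONDS = 15   # size of one +/- adjustment
MIN_SECONDS = 15    # assumed lower bound
PRESETS = {"1": 30, "2": 60, "3": 120, "4": 300}  # 30s, 1m, 2m, 5m


def adjust_duration(current: int, steps: int) -> int:
    """Move the duration by `steps` increments, never below the minimum."""
    return max(MIN_SECONDS, current + steps * STEP_SECONDS)


def format_duration(seconds: int) -> str:
    """Render seconds as M:SS for display."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"
```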

Real-time Feedback

  • Audio Level Meter: Visual waveform with color coding
  • Progress Bar: Recording countdown with time remaining
  • Status Panel: Current mode and session information
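
An audio level meter is commonly driven by the root-mean-square (RMS) amplitude of the most recent block of samples. The following sketch shows that calculation and a plain-text bar. It assumes samples normalized to the range [-1.0, 1.0], and it is not the widget the TUI actually uses.

```python
import math
from typing import Sequence


def rms_level(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of samples in the range [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_bar(samples: Sequence[float], width: int = 20) -> str:
    """Render the RMS level as a fixed-width text bar (hypothetical)."""
    level = min(1.0, rms_level(samples))
    filled = round(level * width)
    return "#" * filled + "-" * (width - filled)
```

Color coding could then be layered on top by mapping ranges of `rms_level` to colors, with thresholds chosen by the application.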

Text Management

  • Tabbed Previews: Switch between raw and AI-formatted text
  • Copy Functions: One-click copying to clipboard
  • Edit Integration: Direct Neovim editing support

Keyboard Shortcuts

Core Actions

| Key | Action | Description |
| --- | --- | --- |
| R | Record | Start recording |
| S | Stop | Stop recording |
| Space | Toggle Recording | Start/stop recording |
| Q / Escape | Quit | Exit application |

Navigation

| Key | Action | Description |
| --- | --- | --- |
| Tab / Shift+Tab | Navigate Tabs | Switch between tabs |
| H | History | Show history tab |
| T | Themes | Show themes tab |
| F1 / ? | Help | Show help dialog |

Duration Controls

| Key | Action | Description |
| --- | --- | --- |
| + / - | Adjust Duration | Increase/decrease by 15s |
| 1 - 4 | Duration Presets | Set 30s, 1m, 2m, 5m |

Copy Operations

| Key | Action | Description |
| --- | --- | --- |
| C | Copy AI Text | Copy formatted transcription |
| Ctrl+C | Copy Raw Text | Copy original transcription |
| Ctrl+A | Enhanced Copy | Copy with preview |
| Ctrl+Shift+A | Copy All | Copy all transcriptions |

Spotify Controls

| Key | Action | Description |
| --- | --- | --- |
| Ctrl+P | Play/Pause | Toggle playback |
| Ctrl+N / Ctrl+B | Next/Previous | Track navigation |
| Ctrl+S | Toggle Panel | Show/hide Spotify panel |
| Ctrl+Shift+S | Shuffle | Toggle shuffle mode |
| Ctrl+Shift+R | Repeat | Toggle repeat mode |

File Operations

| Key | Action | Description |
| --- | --- | --- |
| Ctrl+E | Export | Export current transcription |
| Ctrl+Shift+E | Export All | Export all transcriptions |
| Ctrl+O | Open Directory | Open transcript folder |
| Ctrl+D | Clean Files | Delete old files |

Advanced Features

| Key | Action | Description |
| --- | --- | --- |
| F2 | Toggle Debug | Enable/disable debug mode |
| F3 | Toggle Audio Meter | Show/hide audio meter |
| F4 | Compact Mode | Toggle compact layout |
| F5 | Refresh | Refresh interface |
| Ctrl+R | Reload Config | Reload configuration |
| Ctrl+Shift+T | Switch Theme | Cycle through themes |

Configuration

Environment Variables

# OpenAI Configuration
export OPENAI_API_KEY="sk-your-api-key-here"
export OPENAI_MODEL="gpt-4"

# Application Settings
export CLI_WHISPERER_OUTPUT_DIR="$HOME/Documents/transcripts"  # ~ is not expanded inside double quotes
export CLI_WHISPERER_THEME="professional"
export CLI_WHISPERER_DEBUG="false"

# Recording Settings
export CLI_WHISPERER_DURATION="120"
export CLI_WHISPERER_MODEL="base"
export CLI_WHISPERER_MIN_LENGTH="1.0"
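
The environment variables above could be read into a typed settings object along these lines. This is a sketch under stated assumptions: the `Settings` class and `load_settings` function are hypothetical, the fallback defaults simply reuse the example values shown above, and the set of strings treated as "true" is a guess.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment settings."""
    output_dir: Path
    theme: str
    debug: bool
    duration: int
    model: str
    min_length: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, with assumed defaults."""
    output_dir = env.get("CLI_WHISPERER_OUTPUT_DIR", "~/Documents/transcripts")
    debug = env.get("CLI_WHISPERER_DEBUG", "false").strip().lower()
    return Settings(
        # expanduser() resolves a literal leading "~" left by the shell.
        output_dir=Path(output_dir).expanduser(),
        theme=env.get("CLI_WHISPERER_THEME", "professional"),
        debug=debug in {"1", "true", "yes"},
        duration=int(env.get("CLI_WHISPERER_DURATION", "120")),
        model=env.get("CLI_WHISPERER_MODEL", "base"),
        min_length=float(env.get("CLI_WHISPERER_MIN_LENGTH", "1.0")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests.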

Configuration Files

The application uses the following configuration structure:

~/.config/cli-whisperer/
├── config.yaml          # Main configuration
├── themes/              # Custom themes
│   ├── custom.css
│   └── user-theme.css
└── history/             # History database
    ├── history.json
    └── backups/

Custom Themes

Create custom themes by extending the base theme system:

/* ~/.config/cli-whisperer/themes/custom.css */
:root {
    --primary-color: #your-color;
    --secondary-color: #your-color;
    --accent-color: #your-color;
    --background-color: #your-color;
}

RecordingControls {
    background: var(--background-color);
    border: solid var(--primary-color);
}

Export Functionality

Supported Formats

| Format | Extension | Description | Metadata |
| --- | --- | --- | --- |
| Plain Text | .txt | Simple text format | Optional |
| Markdown | .md | Formatted with headers | Full |
| JSON | .json | Structured data | Complete |
| CSV | .csv | Spreadsheet compatible | Basic |
| Word Document | .docx | Microsoft Word | Full |
| PDF | .pdf | Portable document | Complete |

Export Options

Content Selection

  • Raw transcription text
  • AI-formatted text
  • Timestamps and metadata
  • File paths and working directory
  • Recording duration and model info
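
To make the content selection concrete, the sketch below serializes one transcription record to JSON, Markdown, and CSV using only the standard library. The `Transcription` dataclass, its field names, and the output layouts are assumptions for illustration and are not the package's `ExportManager`. DOCX and PDF are omitted because they need third-party libraries.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, fields


@dataclass
class Transcription:
    """Hypothetical record holding the exportable fields listed above."""
    raw_text: str
    formatted_text: str
    timestamp: str          # ISO 8601 string
    working_dir: str
    duration_seconds: float
    model: str


def to_json(item: Transcription) -> str:
    """Serialize every field as structured data."""
    return json.dumps(asdict(item), indent=2)


def to_markdown(item: Transcription) -> str:
    """Render a header, a metadata list, and the preferred text."""
    lines = [
        f"# Transcription {item.timestamp}",
        "",
        f"- Model: {item.model}",
        f"- Duration: {item.duration_seconds:.1f} s",
        f"- Directory: {item.working_dir}",
        "",
        item.formatted_text or item.raw_text,  # fall back to raw text
        "",
    ]
    return "\n".join(lines)


def to_csv(items: list[Transcription]) -> str:
    """Write one row per transcription with a header row."""
    buffer = io.StringIO(newline="")
    names = [f.name for f in fields(Transcription)]
    writer = csv.DictWriter(buffer, fieldnames=names)
    writer.writeheader()
    for item in items:
        writer.writerow(asdict(item))
    return buffer.getvalue()
```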

Filtering (History Export)

  • Date Range: Export transcriptions from specific time periods
  • Directory Filter: Export only from specific working directories
  • Text Search: Export transcriptions containing specific keywords
  • Model Filter: Export by Whisper model used
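
The four filters combine naturally as a conjunction: an entry is exported only if it passes every filter that was supplied. The sketch below illustrates this with a hypothetical `HistoryEntry` record. The field names, the case-insensitive keyword match, and the exact-match directory comparison are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class HistoryEntry:
    """Hypothetical history record; field names are assumptions."""
    text: str
    created: datetime
    directory: str
    model: str


def filter_history(
    entries: Iterable[HistoryEntry],
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
    directory: Optional[str] = None,
    keyword: Optional[str] = None,
    model: Optional[str] = None,
) -> list[HistoryEntry]:
    """Keep entries that satisfy every filter that was supplied."""
    selected = []
    for entry in entries:
        if start is not None and entry.created < start:
            continue
        if end is not None and entry.created > end:
            continue
        if directory is not None and entry.directory != directory:
            continue
        if keyword is not None and keyword.lower() not in entry.text.lower():
            continue
        if model is not None and entry.model != model:
            continue
        selected.append(entry)
    return selected
```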

Export Types

# Export latest transcription
Ctrl+E  # Interactive format selection

# Export current session
# Use Export Session button in Actions Panel

# Export filtered history
Ctrl+Shift+E  # Full export dialog with filtering

Themes

Built-in Themes

| Theme | Description | Colors |
| --- | --- | --- |
| EDM Synthwave | Retro neon aesthetic | Hot pink, electric cyan, yellow |
| EDM Cyberpunk | Futuristic dark theme | Cyan, green, deep pink |
| EDM Trance | Clean electronic look | Blue, purple, white |
| Marc Anthony | Elegant gold theme | Platinum, champagne, rose gold |
| Professional | Business-friendly | Blue, gray, green |
| Dark Minimal | Clean dark interface | White, gray, blue |
| Neon Noir | High contrast neon | Pink, cyan, yellow |
| Retro Wave | 80s inspired | Pink, purple, orange |

Theme Switching

# Command line
cli-whisperer --tui --theme professional

# In TUI
T                    # Open themes tab
Ctrl+Shift+T        # Quick theme cycle
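
Quick theme cycling can be pictured as stepping through an ordered list and wrapping around at the end. The sketch below uses the display names from the built-in themes table. The ordering, the function name, and the fallback for unknown names are assumptions, and the identifiers accepted by `--theme` may differ from these display names.

```python
THEMES = [
    "EDM Synthwave", "EDM Cyberpunk", "EDM Trance", "Marc Anthony",
    "Professional", "Dark Minimal", "Neon Noir", "Retro Wave",
]


def next_theme(current: str, themes: list[str] = THEMES) -> str:
    """Return the theme after `current`, wrapping around at the end.

    An unknown name falls back to the first theme (assumed behavior).
    """
    index = themes.index(current) if current in themes else -1
    return themes[(index + 1) % len(themes)]
```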

Development

Project Structure

cli-whisperer/
├── src/cli_whisperer/
│   ├── core/                 # Core functionality
│   │   ├── audio_recorder.py # Audio recording and processing
│   │   ├── transcriber.py    # Whisper integration
│   │   ├── formatter.py      # OpenAI text formatting
│   │   └── file_manager.py   # File operations
│   ├── integrations/         # External integrations
│   │   ├── spotify_control.py # Spotify API integration
│   │   └── clipboard.py      # System clipboard
│   ├── ui/                   # User interface
│   │   ├── textual_app.py    # Main TUI application
│   │   ├── themes.py         # Theme system
│   │   ├── export_dialog.py  # Export dialogs
│   │   └── edit_manager.py   # Neovim integration
│   ├── utils/                # Utilities
│   │   ├── config.py         # Configuration management
│   │   ├── logger.py         # Logging system
│   │   ├── history.py        # History management
│   │   └── export_manager.py # Export functionality
│   ├── cli.py                # CLI interface
│   └── main.py               # Entry point
├── tests/                    # Test suite
│   ├── test_export_manager.py
│   └── ...
├── pyproject.toml           # Project configuration
└── README.md               # This file

Development Setup

# Clone the repository
git clone https://github.com/VinnyVanGogh/cli-whisperer.git
cd cli-whisperer

# Create development environment
python -m venv venv
source venv/bin/activate

# Install in development mode
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

Running Tests

# Run all tests
pytest

# Run tests with coverage
pytest --cov=src/cli_whisperer

# Run specific test file
pytest tests/test_export_manager.py

# Run tests with verbose output
pytest -v

Code Quality

# Format code
black src/ tests/

# Type checking
mypy src/cli_whisperer

# Linting
flake8 src/ tests/

# Run all quality checks
pre-commit run --all-files

API Reference

Core Classes

CLIApplication

Main application orchestrator that coordinates all components.

from cli_whisperer.cli import CLIApplication

app = CLIApplication(
    duration=120,
    format_enabled=True,
    model="base",
    output_dir="./transcripts"
)
app.run()

AudioRecorder

Handles audio recording with real-time level monitoring.

from cli_whisperer.core.audio_recorder import AudioRecorder

recorder = AudioRecorder(
    duration=60,
    sample_rate=16000,
    channels=1
)
audio_data = recorder.record()

WhisperTranscriber

Manages Whisper model loading and transcription.

from cli_whisperer.core.transcriber import WhisperTranscriber

transcriber = WhisperTranscriber(model="base")
text = transcriber.transcribe(audio_data)

ExportManager

Handles multi-format export functionality.

from cli_whisperer.utils.export_manager import ExportManager, ExportFormat

manager = ExportManager()
manager.export_transcription(
    text="Hello world",
    format=ExportFormat.MARKDOWN,
    output_path="output.md"
)

Integration Points

Spotify Integration

from cli_whisperer.integrations.spotify_control import SpotifyController

spotify = SpotifyController()
if spotify.is_available():
    spotify.play()
    status = spotify.get_status()

Theme System

from cli_whisperer.ui.themes import ThemeManager

theme_manager = ThemeManager()
theme_manager.set_theme("professional")
css = theme_manager.get_current_theme().css

Troubleshooting

Common Issues

Audio Recording Problems

# Check microphone permissions
# macOS: System Settings > Privacy & Security > Microphone
#        (System Preferences > Security & Privacy > Microphone on macOS 12 and earlier)
# Linux: Check PulseAudio/ALSA configuration

# Test audio recording
python -c "import sounddevice as sd; print(sd.query_devices())"

OpenAI API Issues

# Verify API key
echo $OPENAI_API_KEY

# Test API connection
python -c "import openai; print(openai.models.list())"

Whisper Model Loading

# Clear model cache
rm -rf ~/.cache/whisper

# Download specific model
python -c "import whisper; whisper.load_model('base')"

Debug Mode

Enable debug logging for detailed troubleshooting:

# Command line
cli-whisperer --debug

# Environment variable
export CLI_WHISPERER_DEBUG=true

# In TUI
F2  # Toggle debug mode

Performance Optimization

For Low-End Systems

# Use smaller Whisper model
cli-whisperer --model tiny

# Reduce recording duration
cli-whisperer --duration 30

# Disable OpenAI formatting
cli-whisperer --no-format

For High-End Systems

# Use larger Whisper model
cli-whisperer --model large

# Enable all features
cli-whisperer --format --tui --theme professional

Log Files

Check log files for detailed error information:

# Application logs
tail -f ~/.local/share/cli-whisperer/logs/cli-whisperer.log

# Debug logs (when debug mode enabled)
tail -f ~/.local/share/cli-whisperer/logs/debug.log

Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Process

  • Fork the repository
  • Create a feature branch (git checkout -b feature/amazing-feature)
  • Make your changes following the code style guidelines
  • Add tests for your changes
  • Ensure all tests pass (pytest)
  • Update documentation if needed
  • Commit your changes (git commit -m 'Add amazing feature')
  • Push to the branch (git push origin feature/amazing-feature)
  • Open a Pull Request

Code Style Guidelines

  • Follow PEP 8 Python style guide
  • Use type hints for all functions and methods
  • Write docstrings in Google style
  • Keep functions under 50 lines when possible
  • Maintain test coverage above 90%

Issue Reports

When reporting issues, please include:

  • Python version and operating system
  • Complete error messages and stack traces
  • Steps to reproduce the issue
  • Expected vs actual behavior
  • Log files if applicable

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • OpenAI for the Whisper and GPT models
  • Textual for the excellent TUI framework
  • Python Community for the amazing ecosystem
  • All contributors who have helped improve this project

Support

Made with ❤️ by VinnyVanGogh
Transforming voice to text with style and intelligence
