karhu

An AI assistant with PDF processing, web browsing, and speech capabilities

Version: 2.0.0
Maintainers: 1

Karhu AI Assistant CLI

Karhu is a command-line AI assistant for productivity, research, and creative tasks. It supports file and document processing, web browsing, contextual conversations, speech synthesis and recognition, and profile and model management, all from your terminal.

Features

  • File & Document Processing: Read and process PDF, text, and other files.
  • Web Browsing & Search: Browse web pages and perform web searches directly from the CLI.
  • Contextual Conversations: Maintain, save, and manage conversation context for seamless multi-turn interactions.
  • Profile & Model Management: Switch between AI models and conversational profiles (e.g., coding, creative, academic, therapist).
  • Speech Synthesis & Recognition: Text-to-speech (TTS) and speech-to-text (STT) support, including multiple voice engines.
  • Interactive Mode: Chat with Karhu in a conversational loop with command autocompletion and history.
  • Robust Error Handling: Graceful error messages and recovery for all operations.
  • Extensible: Modular design for easy addition of new features and integrations.

Installation

  • Clone the repository:

    git clone https://github.com/yourusername/karhu-cli.git
    cd karhu-cli
    
  • Install dependencies:

    pip install -r requirements.txt
    

    (If using a virtual environment, activate it first.)

  • (Optional) Install extra system dependencies for TTS/STT features (see Speech and Voice Features).

Configuration

Karhu uses JSON configuration files in src/karhu/config/:

  • models.json: Define available AI models and their parameters.
  • profiles.json: Define conversational profiles (e.g., coding, creative, therapist).
  • system_prompt.json: Set the default system prompt for the assistant.

You can customize or add new profiles and models by editing these files.
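
Before editing these files, it can help to check that each one still parses as JSON. The sketch below is not part of Karhu. It assumes only the directory and file names listed above and makes no assumption about the schema. The helper name summarize_config is hypothetical.

```python
import json
from pathlib import Path

# Directory and file names are taken from the Configuration section;
# adjust CONFIG_DIR if your checkout differs.
CONFIG_DIR = Path("src/karhu/config")
CONFIG_FILES = ("models.json", "profiles.json", "system_prompt.json")


def summarize_config(config_dir=CONFIG_DIR):
    """Return {filename: description} for each expected config file.

    Hypothetical helper: it only reports whether the file exists, whether
    it parses, and the top-level keys. It does not know Karhu's schema.
    """
    summary = {}
    for name in CONFIG_FILES:
        path = Path(config_dir) / name
        if not path.is_file():
            summary[name] = "missing"
            continue
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            summary[name] = f"invalid JSON (line {exc.lineno}): {exc.msg}"
            continue
        if isinstance(data, dict):
            summary[name] = "object with keys: " + ", ".join(sorted(data))
        else:
            summary[name] = type(data).__name__
    return summary


if __name__ == "__main__":
    for filename, description in summarize_config().items():
        print(f"{filename}: {description}")
```

Run it from the project root after editing a file. A line reporting "invalid JSON" points to the line number of the syntax error.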

Usage

CLI Options

Run Karhu from the project root:

python src/karhu/cli.py [OPTIONS]

Main options:

  • --query, -q <question>: Ask a direct question.
  • --interactive, -i: Start interactive chat mode.
  • --file, -f <path>: Process a specific file.
  • --files, -ff <directory>: Process all files in a directory.
  • --web, -w <url>: Browse a web page.
  • --search, -s <query>: Perform a web search.
  • --model, -m <name>: Select AI model.
  • --profile, -P <name>: Select conversational profile.
  • --setsprompt <prompt>: Set a custom system prompt.
  • --save: Save conversation context.
  • --clear, -c: Clear current context.
  • --list-models: List available models.
  • --list-profiles: List available profiles.
  • --voices: List TTS voices.
  • --kokoro-voices: List Kokoro TTS voices.
  • --kokoro-blend <indices>: Blend Kokoro voices.
  • --help-commands: Show all available commands.
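
To illustrate how options like these map onto parsed values, here is a minimal argparse sketch covering a subset of the flags above. It is an illustration only and may not match how cli.py is actually written. The function name build_parser and the metavar names are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for a subset of the documented options.

    Hypothetical sketch: flag names come from the list above, everything
    else (function name, metavars, defaults) is assumed for illustration.
    """
    parser = argparse.ArgumentParser(prog="karhu")
    parser.add_argument("--query", "-q", metavar="QUESTION",
                        help="Ask a direct question.")
    parser.add_argument("--interactive", "-i", action="store_true",
                        help="Start interactive chat mode.")
    parser.add_argument("--file", "-f", metavar="PATH",
                        help="Process a specific file.")
    parser.add_argument("--model", "-m", metavar="NAME",
                        help="Select AI model.")
    parser.add_argument("--profile", "-P", metavar="NAME",
                        help="Select conversational profile.")
    parser.add_argument("--save", action="store_true",
                        help="Save conversation context.")
    return parser


if __name__ == "__main__":
    # Equivalent to: python src/karhu/cli.py -i --profile therapist
    args = build_parser().parse_args(["-i", "--profile", "therapist"])
    print(args.interactive, args.profile)
```

Value-taking options default to None when omitted, and the store_true flags default to False, so a caller can tell which options were actually given.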

Interactive Mode

Start with:

python src/karhu/cli.py --interactive

Features:

  • Command autocompletion and history.
  • All CLI and special commands are available as !command (see Interactive Commands).

Interactive Commands

  • !model [name] — Switch AI model.
  • !list_models — List models.
  • !profile [name] — Switch profile.
  • !list_profiles — List profiles.
  • !create_profile [name:prompt] — Create a new profile.
  • !system_prompt — Show current system prompt.
  • !setsprompt [prompt] — Set system prompt.
  • !file [path] — Read a file.
  • !files [directory] — Read all files in a directory.
  • !browse [url] — Browse a web page.
  • !search [query] — Web search.
  • !context_size — Show context size.
  • !context_info — Show context details.
  • !optimize_context — Summarize/optimize context.
  • !search_context [query] — Search within context.
  • !chunk [id] — List/retrieve document chunks.
  • !save — Save conversation.
  • !clear — Clear context.
  • !clearall — Clear all context/history.
  • !lazy — Toggle speech-to-text mode.
  • !speak — Toggle text-to-speech mode.
  • !voices — List TTS voices.
  • !voice [index] — Change TTS voice.
  • !kokoro — Toggle Kokoro TTS.
  • !kokoro_voices — List Kokoro voices.
  • !kokoro_voice [index] — Change Kokoro voice.
  • !kokoro_blend [indices] — Blend Kokoro voices.
  • !help — Show help.
  • !quit — Exit.
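
The !command [argument] form used above can be separated from ordinary chat input with a few lines of Python. This is a sketch under assumptions, not Karhu's actual dispatcher. The function name parse_command is hypothetical, and lowercasing the command name is an assumption.

```python
def parse_command(line):
    """Split an input line into (command, argument).

    Returns (None, text) when the line is ordinary chat text rather
    than a !command. Hypothetical helper for illustration only.
    """
    stripped = line.strip()
    if not stripped.startswith("!"):
        return None, stripped
    # Everything up to the first space is the command name;
    # the remainder (possibly empty) is its argument.
    name, _, argument = stripped[1:].partition(" ")
    if not name:
        # A bare "!" or "! text" is treated as chat text.
        return None, stripped
    # Assumption: command names are matched case-insensitively.
    return name.lower(), argument.strip()


if __name__ == "__main__":
    for example in ("!search quantum computing", "!quit", "hello there"):
        print(parse_command(example))
```

A dispatcher could then look the command name up in a dictionary of handlers and pass the argument string through unchanged.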

Example Commands

  • Process a PDF:
    python src/karhu/cli.py --file path/to/file.pdf
    
  • Web search:
    python src/karhu/cli.py --search "What is quantum computing?"
    
  • Start interactive mode with the therapist profile:
    python src/karhu/cli.py --interactive --profile therapist
    

Profiles and Models

Karhu supports multiple AI models (e.g., GPT-4o, Claude, Gemma) and conversational profiles (e.g., coding, creative, academic, therapist, funny, sarcastic, chill). You can switch models and profiles, and create new profiles, at runtime.

  • List models: !list_models
  • Switch model: !model <name>
  • List profiles: !list_profiles
  • Switch profile: !profile <name>
  • Create profile: !create_profile name:prompt

Profiles are defined in src/karhu/config/profiles.json.
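
The name:prompt argument to !create_profile can contain colons inside the prompt, so splitting at the first colon only is the safe choice. The sketch below shows that idea. The function name parse_profile_spec and the validation rules are assumptions, and Karhu's own parsing may differ.

```python
def parse_profile_spec(spec):
    """Split 'name:prompt' at the first colon.

    Hypothetical helper: raises ValueError when the colon, the name,
    or the prompt is missing.
    """
    name, sep, prompt = spec.partition(":")
    name, prompt = name.strip(), prompt.strip()
    if not sep or not name or not prompt:
        raise ValueError("expected name:prompt")
    return name, prompt


if __name__ == "__main__":
    # The profile name and prompt here are made-up examples.
    print(parse_profile_spec("tutor:Explain step by step: be brief."))
```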

Speech and Voice Features

  • Text-to-Speech (TTS): Use !speak, !voices, !voice [index] to enable and select voices.
  • Kokoro TTS: Advanced TTS engine with voice blending (!kokoro, !kokoro_voices, !kokoro_voice, !kokoro_blend).
  • Speech-to-Text (STT): Use !lazy to toggle speech input mode.

Note: Some features may require additional system dependencies (e.g., espeak, ffmpeg, or platform-specific TTS engines).

Context Management

  • Save context: !save
  • Clear context: !clear
  • Clear all: !clearall
  • Show context size/info: !context_size, !context_info
  • Optimize context: !optimize_context
  • Search context: !search_context [query]
  • Chunking: !chunk [id] for large documents
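
As a rough illustration of chunking a large document and retrieving a chunk by id, consider the sketch below. Karhu's actual strategy (chunk size, overlap, token-based or character-based splitting) is not documented here, so the fixed character size, the zero-based ids, and the function names are all assumptions.

```python
def chunk_text(text, chunk_size=1000):
    """Split text into consecutive chunks of at most chunk_size characters.

    Assumption: fixed-size, non-overlapping, character-based chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def get_chunk(chunks, chunk_id):
    """Return the chunk with the given zero-based id, or None if out of range."""
    if 0 <= chunk_id < len(chunks):
        return chunks[chunk_id]
    return None


if __name__ == "__main__":
    chunks = chunk_text("abcdefghij", chunk_size=4)
    print(len(chunks), get_chunk(chunks, 2))
```

Joining the chunks in order reproduces the original text, which is a simple check when experimenting with different chunk sizes.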

Module Reference

  • ai_assistant.py: Core assistant logic and LLM interaction.
  • cli.py: Command-line interface and argument parsing.
  • interactive.py: Interactive chat mode.
  • model_manager.py: Model selection and management.
  • profile_manager.py: Profile selection and management.
  • context_manager.py: Context storage, retrieval, and optimization.
  • document_processor.py: File and document parsing.
  • web_browser.py: Web browsing and search.
  • TextToSpeech.py / SpeechToText.py / kokorotts.py: Speech synthesis and recognition.
  • Display_help.py: Command help and documentation.
  • Errors.py: Error handling and reporting.
  • config_parser.py: Configuration file parsing.
  • globals.py: Global state and settings.

Testing

Run all tests with:

pytest

Tests are located in the tests/ directory and cover core modules and features.

Contributing

  • Fork the repository and create a new branch.
  • Add your feature or fix.
  • Write or update tests as needed.
  • Submit a pull request with a clear description.

License

This project is licensed under the MIT License.

For questions or support, please open an issue on GitHub.
