
CLI agent that indexes local repos and answers questions with hosted or local LLMs.
A powerful CLI tool that ingests your codebase and allows you to ask questions about it using Retrieval-Augmented Generation (RAG).
Installation • Quick Start • Commands • Configuration • Examples
⚠️ Codebase Size Limitation: Codexa is optimized for small to medium-sized codebases. It currently supports projects with up to 200 files and 20,000 chunks. For larger codebases, consider using more restrictive includeGlobs patterns to focus on specific directories or file types.
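For example, a large repository could be kept under those limits by indexing only a core source directory (paths are illustrative):

```json
{
  "includeGlobs": [
    "src/**/*.ts",
    "src/**/*.tsx",
    "docs/**/*.md"
  ]
}
```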
Before installing Codexa, ensure you have the following:
Node.js: v20.0.0 or higher
node --version # Should be v20.0.0 or higher
For Cloud LLM (Groq): A Groq API key from console.groq.com
Choose the installation method that works best for your system:
Install Codexa globally using npm:
npm install -g codexa
Verify installation:
codexa
Install codexa using Homebrew on macOS:
First, add the tap:
brew tap sahitya-chandra/codexa
Then install:
brew install codexa
To update codexa to the latest version:
If installed via npm:
npm install -g codexa@latest
If installed via Homebrew:
brew upgrade codexa
Check your current version:
codexa --version
Check for updates (for npm installs):
npm outdated -g codexa
💡 Tip: It's recommended to keep Codexa updated to get the latest features, bug fixes, and security updates.
Codexa requires an LLM to generate answers. You can use Groq (cloud).
Groq provides fast cloud-based LLMs with a generous free tier.
Step 1: Get a Groq API Key
Create an API key at console.groq.com (keys start with gsk_).
Step 2: Set GROQ API Key
Run the following command to securely save your API key:
codexa config set GROQ_API_KEY "gsk_your_api_key_here"
This will save the key to your local configuration file (.codexarc.json).
Step 3: Verify API Key is Set
codexa config get GROQ_API_KEY
Step 4: Configure Codexa
Codexa defaults to using Groq when you run codexa init. If you need to manually configure, edit .codexarc.json:
{
"modelProvider": "groq",
"model": "openai/gpt-oss-120b",
"embeddingProvider": "local",
"embeddingModel": "Xenova/all-MiniLM-L6-v2"
}
Models you can use:
- openai/gpt-oss-120b (recommended, default)
- llama-3.1-70b-versatile

For Groq:
# 1. Get API key from console.groq.com
# 2. Run codexa init (defaults to Groq)
codexa init
# 3. Set GROQ API key
codexa config set GROQ_API_KEY "gsk_your_key"
# 4. Proceed to ingestion
Once Codexa is installed and your LLM is configured, you're ready to use it:
Navigate to your project directory:
cd /path/to/your/project
Initialize Codexa:
codexa init
This creates a .codexarc.json configuration file with sensible defaults.
Set GROQ API Key
codexa config set GROQ_API_KEY "gsk_your_key"
This will save the key to your local configuration file (.codexarc.json).
Ingest your codebase:
codexa ingest
This indexes your codebase and creates embeddings. First run may take a few minutes.
Ask questions:
codexa ask "How does the authentication flow work?"
codexa ask "What is the main entry point of this application?"
codexa ask "Show me how error handling is implemented"
init
Creates a .codexarc.json configuration file optimized for your codebase.
codexa init
What it does:
- Creates .codexarc.json in the project root with tailored settings
- Detects your languages and package managers to tailor the generated patterns
Example Output:
Analyzing codebase...
✓ Detected: typescript, javascript (npm, yarn)
✓ Created .codexarc.json with optimized settings for your codebase!
┌ 🚀 Setup Complete ──────────────────────────────────────────┐
│ │
│ Next Steps: │
│ │
│ 1. Review .codexarc.json - Update provider keys if needed │
│ 2. Set your GROQ API Key: codexa config set GROQ_API_KEY |
│ 3. Run: codexa ingest - Start indexing your codebase │
│ 4. Run: codexa ask "your question" - Ask questions │
│ │
└─────────────────────────────────────────────────────────────┘
ingest
Indexes the codebase and generates embeddings for semantic search.
codexa ingest [options]
Options:
- -f, --force - Clear existing index and rebuild from scratch

Examples:
# Standard ingestion
codexa ingest
# Force rebuild (useful if you've updated code significantly)
codexa ingest --force
What it does:
- Scans files matching the includeGlobs and excludeGlobs patterns
- Splits files into chunks and generates embeddings
- Stores the index in .codexa/index.db (SQLite database)
- Applies smart filtering to skip binary and oversized files
Note: First ingestion may take a few minutes depending on your codebase size. Subsequent ingestions are faster as they only process changed files.
config
Manage configuration values, including API keys.
codexa config <action> [key] [value]
Actions:
- set <key> <value> - Set a configuration value
- get <key> - Get a configuration value
- list - List all configuration values

Examples:
# Set Groq API Key
codexa config set GROQ_API_KEY "gsk_..."
# Check current key
codexa config get GROQ_API_KEY
ask
Ask natural language questions about your codebase.
codexa ask <question...> [options]
Arguments:
- <question...> - Your question (can be multiple words)

Options:
- --stream - Enable streaming output

Examples:
# Basic question
codexa ask "How does user authentication work?"
# Question with multiple words
codexa ask "What is the main entry point of this application?"
# Enable streaming
codexa ask "Summarize the codebase structure" --stream
How it works: your question is embedded, the most similar code chunks are retrieved from the index, and they are passed to the LLM as context for the answer.
Codexa uses a .codexarc.json file in your project root for configuration. This file is automatically created when you run codexa init.
Location: .codexarc.json (project root)
Format: JSON
When you run codexa init, Codexa automatically:
- Analyzes your codebase structure to detect languages and package managers
- Generates optimized include/exclude glob patterns
- Applies best practices: excludes build output (dist/, build/, target/, etc.) and dependency directories (node_modules/, vendor/, .venv/, etc.)

This means your config is tailored to your project from the start, ensuring optimal indexing performance!
Some settings can be configured via environment variables:
| Variable | Description | Required For |
|---|---|---|
| GROQ_API_KEY | Groq API key for cloud LLM | Groq provider |
Example:
# Using config command (Recommended)
codexa config set GROQ_API_KEY "gsk_your_key_here"
# Or using environment variables
export GROQ_API_KEY="gsk_your_key_here" # macOS/Linux
modelProvider
Type: "groq"
Default: "groq"
The LLM provider to use for generating answers.
"groq" - Uses Groq's cloud API (requires GROQ_API_KEY)modelType: string
Type: string
Default: "openai/gpt-oss-120b"
The model identifier to use.
embeddingProvider
Type: "local"
Default: "local"
The embedding provider for vector search.
"local" - Uses @xenova/transformers (runs entirely locally)embeddingModelType: string
Default: "Xenova/all-MiniLM-L6-v2"
The embedding model for generating vector representations. This model is downloaded automatically on first use.
maxChunkSize
Type: number
Default: 200
Maximum number of lines per code chunk. Larger values = more context per chunk but fewer chunks.
chunkOverlap
Type: number
Default: 20
Number of lines to overlap between consecutive chunks. Helps maintain context at chunk boundaries.
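To make the interplay of these two settings concrete, here is an illustrative sketch (not Codexa's actual chunker) of how a file could be split into overlapping line chunks:

```javascript
// Illustrative line-based chunker: each chunk holds up to maxChunkSize
// lines, and each new chunk starts chunkOverlap lines before the
// previous one ended, so context is shared across chunk boundaries.
function chunkLines(lines, maxChunkSize = 200, chunkOverlap = 20) {
  const chunks = [];
  const step = maxChunkSize - chunkOverlap;
  for (let start = 0; start < lines.length; start += step) {
    chunks.push(lines.slice(start, start + maxChunkSize));
    if (start + maxChunkSize >= lines.length) break; // last chunk reached EOF
  }
  return chunks;
}
```

With the defaults, a 500-line file yields three chunks, and each chunk after the first repeats the last 20 lines of its predecessor.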
includeGlobs
Type: string[]
Default: ["**/*.ts", "**/*.tsx", "**/*.js", "**/*.jsx", "**/*.py", "**/*.go", "**/*.rs", "**/*.java", "**/*.md", "**/*.json"]
File patterns to include in indexing. Supports glob patterns.
Examples:
{
"includeGlobs": [
"**/*.ts",
"**/*.tsx",
"src/**/*.js",
"lib/**/*.py"
]
}
excludeGlobs
Type: string[]
Default: ["node_modules/**", ".git/**", "dist/**", "build/**", ".codexa/**", "package-lock.json"]
File patterns to exclude from indexing.
Examples:
{
"excludeGlobs": [
"node_modules/**",
".git/**",
"dist/**",
"**/*.test.ts",
"coverage/**"
]
}
historyDir
Type: string
Default: ".codexa/sessions"
Directory to store conversation history for session management.
dbPath
Type: string
Default: ".codexa/index.db"
Path to the SQLite database storing code chunks and embeddings.
temperature
Type: number
Default: 0.2
Controls randomness in LLM responses (0.0 = deterministic, 1.0 = creative).
topK
Type: number
Default: 4
Number of code chunks to retrieve and use as context for each question. Higher values provide more context but may include less relevant information.
maxFileSize
Type: number
Default: 5242880 (5MB)
Maximum file size in bytes. Files larger than this will be excluded from indexing. Helps avoid processing large binary files or generated artifacts.
Example:
{
"maxFileSize": 10485760
}
skipBinaryFiles
Type: boolean
Default: true
Whether to automatically skip binary files during indexing. Binary detection uses both file extension and content analysis.
Example:
{
"skipBinaryFiles": true
}
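As an illustration of the content-analysis half, a common heuristic (not necessarily Codexa's exact algorithm) is to flag a file as binary when a sample of its bytes contains a NUL byte:

```javascript
// Hypothetical binary-file check combining an extension list with a
// NUL-byte scan over the first few kilobytes of content.
const BINARY_EXTENSIONS = new Set(['.png', '.jpg', '.zip', '.exe', '.pdf']);

function looksBinary(filename, buffer, sampleSize = 8192) {
  const ext = filename.slice(filename.lastIndexOf('.'));
  if (BINARY_EXTENSIONS.has(ext)) return true;
  // Text files essentially never contain NUL bytes; binaries usually do.
  return buffer.subarray(0, sampleSize).includes(0);
}
```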
skipLargeFiles
Type: boolean
Default: true
Whether to skip files exceeding maxFileSize during indexing. Set to false if you want to include all files regardless of size.
Example:
{
"skipLargeFiles": true,
"maxFileSize": 10485760
}
{
"modelProvider": "groq",
"model": "openai/gpt-oss-120b",
"embeddingProvider": "local",
"embeddingModel": "Xenova/all-MiniLM-L6-v2",
"maxChunkSize": 300,
"chunkOverlap": 20,
"temperature": 0.2,
"topK": 4
}
Remember: Set GROQ_API_KEY:
codexa config set GROQ_API_KEY "your-api-key"
{
"modelProvider": "groq",
"model": "openai/gpt-oss-120b",
"maxChunkSize": 150,
"chunkOverlap": 15,
"topK": 6,
"temperature": 0.1,
"includeGlobs": [
"src/**/*.ts",
"src/**/*.tsx",
"lib/**/*.ts"
],
"excludeGlobs": [
"node_modules/**",
"dist/**",
"**/*.test.ts",
"**/*.spec.ts",
"coverage/**"
]
}
# 1. Initialize in your project
cd my-project
codexa init
# 2. Set Groq API key
codexa config set GROQ_API_KEY <your-groq-key>
# 3. Index your codebase
codexa ingest
# 4. Ask questions
codexa ask "What is the main purpose of this codebase?"
codexa ask "How does the user authentication work?"
codexa ask "Where is the API routing configured?"
# After significant code changes
codexa ingest --force
Update .codexarc.json to focus on specific languages:
{
"includeGlobs": [
"**/*.ts",
"**/*.tsx"
],
"excludeGlobs": [
"node_modules/**",
"**/*.test.ts",
"**/*.spec.ts"
]
}
Codexa uses Retrieval-Augmented Generation (RAG) to answer questions about your codebase:
When you run codexa ingest:
- Files matching includeGlobs/excludeGlobs are collected
- Code is split into overlapping chunks and embedded
- Chunks and embeddings are stored in .codexa/index.db

When you run codexa ask:
┌─────────────────┐
│ User Query │
└────────┬────────┘
│
▼
┌─────────────────┐ ┌──────────────┐
│ Embedding │────▶│ Vector │
│ Generation │ │ Search │
└─────────────────┘ └──────┬───────┘
│
▼
┌──────────────┐
│ Context │
│ Retrieval │
└──────┬───────┘
│
▼
┌─────────────────┐ ┌──────────────┐
│ SQLite DB │◀────│ LLM │
│ (Chunks + │ │ (Groq) │
│ Embeddings) │ │ │
└─────────────────┘ └──────┬───────┘
│
▼
┌──────────────┐
│ Answer │
└──────────────┘
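The retrieval stage of this pipeline can be sketched as cosine similarity over stored embeddings, keeping the topK best chunks. This is a simplified illustration under assumed data shapes (embeddings as plain number arrays); the real pipeline reads them from SQLite:

```javascript
// Cosine similarity between two equal-length embedding vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Score every chunk against the query embedding and keep the topK best.
function retrieve(queryEmbedding, chunks, topK = 4) {
  return chunks
    .map((chunk) => ({ ...chunk, score: cosine(queryEmbedding, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

The retrieved chunks are then concatenated into the prompt sent to the LLM, which is why a larger topK means more context per question.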
Key Components:
- Embedding generation (local, via @xenova/transformers)
- Vector search over the SQLite index (.codexa/index.db)
- Answer generation by the LLM (Groq)
Problem: Using Groq provider but API key is missing.
Solutions:
codexa config set GROQ_API_KEY "your-api-key"
export GROQ_API_KEY="your-api-key" # macOS/Linux
codexa config get GROQ_API_KEY
Problem: First ingestion takes too long.
Solutions:
- Check that .codexarc.json was generated correctly
- Lower maxFileSize to exclude more large files
- Reduce maxChunkSize to create more, smaller chunks
- Add excludeGlobs to skip unnecessary files
- Narrow includeGlobs to focus on important files
- Use --force only when necessary (incremental updates are faster)
- Ensure skipBinaryFiles and skipLargeFiles are enabled (default)

Problem: Answers are not relevant or accurate.
Solutions:
Increase topK to retrieve more context:
{
"topK": 6
}
Adjust temperature for more focused answers:
{
"temperature": 0.1
}
Re-index after significant code changes:
codexa ingest --force
Ask more specific questions
Problem: SQLite database is locked (multiple processes accessing it).
Solutions:
- Ensure only one codexa process runs at a time
- Check that no other tool is holding open the database at dbPath

Problem: Some files aren't being indexed.
Solutions:
- Check the includeGlobs patterns in .codexarc.json
- Make sure the files aren't excluded by excludeGlobs
- Re-run with --force to rebuild:
codexa ingest --force
Q: Can I use Codexa with private/confidential code?
A: Yes! Codexa processes everything locally by default. Your code never leaves your machine unless you explicitly use cloud providers like Groq.
Q: How much disk space does Codexa use?
A: Typically 10-50MB per 1000 files, depending on file sizes. The SQLite database stores chunks and embeddings.
Q: Can I use Codexa in CI/CD?
A: Yes, but you'll need to ensure your LLM provider is accessible. For CI/CD, consider using Groq (cloud).
Q: Does Codexa work with monorepos?
A: Yes! Adjust includeGlobs and excludeGlobs to target specific packages or workspaces.
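For instance, a monorepo configuration focused on two workspaces might look like this (package paths are illustrative):

```json
{
  "includeGlobs": [
    "packages/api/src/**/*.ts",
    "packages/web/src/**/*.tsx"
  ],
  "excludeGlobs": [
    "node_modules/**",
    "**/dist/**"
  ]
}
```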
Q: Can I use multiple LLM providers?
A: You can switch providers by updating modelProvider in .codexarc.json. Each repository can have its own configuration.
Q: How often should I re-index?
A: Codexa only processes changed files on subsequent runs, so you can run ingest frequently. Use --force only when you need a complete rebuild.
Q: Is there a way to query the database directly?
A: The SQLite database (.codexa/index.db) can be queried directly, but the schema is internal. Use Codexa's commands for all operations.
Q: Can I customize the prompt sent to the LLM?
A: Currently, the prompt is fixed, but this may be configurable in future versions.
Contributions are welcome! Please see our Contributing Guide for details.
Quick start:
1. Fork the repository
2. Create a feature branch (git checkout -b feature/amazing-feature)
3. Commit your changes (git commit -m 'Add some amazing feature')
4. Push to the branch (git push origin feature/amazing-feature)
5. Open a Pull Request

For major changes, please open an issue first to discuss what you would like to change.
See CONTRIBUTING.md for detailed guidelines.
This project is licensed under the MIT License - see the LICENSE file for details.
Made with ❤️ by the Codexa team