
@devpuccino/mcp-git-codebase
MCP server providing semantic code search and indexing for git repositories
An MCP (Model Context Protocol) server that provides semantic code search and intelligent indexing for git repositories. Enables AI-powered semantic search across codebases using vector embeddings to find relevant code snippets by intent, not just keywords.
✨ Semantic Search - Find code by meaning, not just keywords
🔍 Multi-Language Support - TypeScript, JavaScript, Python, Go, Java, Rust, and more
📊 Multiple Vector Databases - Qdrant, Pinecone, Chroma, Milvus, PostgreSQL with pgvector
🚀 Scalable Indexing - Handle repositories with 1M+ files and 100GB+ of code
⚙️ Background Processing - Queue indexing jobs via Redis/Bull
🌿 Branch-Aware - Search across specific branches or track changes over time
🎯 Precise Code Retrieval - Get exact code snippets with line-level precision
npm install @devpuccino/mcp-git-codebase
Set your preferred vector database and its connection details:
# Qdrant (recommended for local development)
export VECTOR_DB_PROVIDER=qdrant
export QDRANT_URL=http://localhost:6333
# Or Pinecone
export VECTOR_DB_PROVIDER=pinecone
export PINECONE_API_KEY=your-api-key
export PINECONE_ENVIRONMENT=your-environment
export PINECONE_INDEX=your-index
# Or PostgreSQL with pgvector
export VECTOR_DB_PROVIDER=postgres
export DATABASE_URL=postgresql://user:password@localhost:5432/codebase
# Or Chroma
export VECTOR_DB_PROVIDER=chroma
export CHROMA_URL=http://localhost
export CHROMA_PORT=8000
# Or Milvus
export VECTOR_DB_PROVIDER=milvus
export MILVUS_HOST=localhost
export MILVUS_PORT=19530
# Ollama (default, local)
export EMBEDDING_PROVIDER=ollama
export OLLAMA_BASE_URL=http://localhost:11434
export OLLAMA_EMBEDDING_MODEL=bge-base-en-v1.5
# Or OpenAI (cloud)
export EMBEDDING_PROVIDER=openai
export OPENAI_API_KEY=sk-...
export OPENAI_EMBEDDING_MODEL=text-embedding-3-small
Add to your Claude Code configuration (settings.json or settings.local.json):
Minimal Configuration (Qdrant + Ollama):
{
"mcpServers": {
"mcp-git-codebase": {
"command": "npx",
"args": ["@devpuccino/mcp-git-codebase"],
"env": {
"VECTOR_DB_PROVIDER": "qdrant",
"QDRANT_URL": "http://localhost:6333",
"EMBEDDING_PROVIDER": "ollama",
"OLLAMA_BASE_URL": "http://localhost:11434",
"OLLAMA_EMBEDDING_MODEL": "nomic-embed-text"
}
}
}
}
Full Configuration Example:
{
"mcpServers": {
"git-codebase": {
"command": "npx",
"args": ["--legacy-peer-deps", "@devpuccino/mcp-git-codebase"],
"env": {
"VECTOR_DB_PROVIDER": "qdrant",
"QDRANT_URL": "http://your-qdrant-host:6333",
"QDRANT_API_KEY": "your-api-key-if-needed",
"VECTOR_DB_COLLECTION_PREFIX": "codebase_",
"EMBEDDING_PROVIDER": "ollama",
"OLLAMA_BASE_URL": "http://your-ollama-host:11434",
"OLLAMA_EMBEDDING_MODEL": "bge-base-en-v1.5",
"EMBEDDING_TIMEOUT": "30000",
"LLM_PROVIDER": "ollama",
"OLLAMA_MODEL": "qwen2.5-coder:7b",
"OLLAMA_TIMEOUT": "30000",
"OLLAMA_MAX_RETRIES": "3",
"INDEXING_LLM_ENABLED": "true",
"REDIS_HOST": "your-redis-host",
"REDIS_PORT": "6379",
"REDIS_PASSWORD": "your-redis-password",
"REDIS_DB": "0",
"ENABLE_RERANKING": "true",
"RERANKER_TYPE": "bm25",
"CONSUMER_CONCURRENCY": "2",
"STARTUP_BATCH_ENABLED": "true",
"STARTUP_BATCH_LIMIT": "50",
"LOG_LEVEL": "info"
}
}
}
}
Production Configuration (Pinecone + OpenAI):
{
"mcpServers": {
"mcp-git-codebase": {
"command": "npx",
"args": ["@devpuccino/mcp-git-codebase"],
"env": {
"VECTOR_DB_PROVIDER": "pinecone",
"PINECONE_API_KEY": "your-pinecone-api-key",
"PINECONE_ENVIRONMENT": "us-east-1",
"PINECONE_INDEX": "your-index-name",
"EMBEDDING_PROVIDER": "openai",
"OPENAI_API_KEY": "your-openai-api-key",
"OPENAI_EMBEDDING_MODEL": "text-embedding-3-small",
"LLM_PROVIDER": "openai",
"OPENAI_LLM_MODEL": "gpt-4o-mini",
"LOG_LEVEL": "warn"
}
}
}
}
query_codebase

Perform semantic search across a git repository to find relevant code snippets by meaning.

Parameters:
- query_sentence (required): Natural language search query or code snippet
- project_path (required): Root directory of the git repository
- branch (optional): Specific branch to search (default: current branch)
- limit (optional): Max results to return, 1-20 (default: 5)
- similarity_threshold (optional): Minimum similarity score, 0-1 (default: 0.6)
- file_extensions (optional): Filter by file extensions (e.g., [".ts", ".tsx"])

Example:
{
"query_sentence": "function to authenticate users with JWT tokens",
"project_path": "/workspace/myapp",
"limit": 5,
"file_extensions": [".ts", ".tsx"]
}
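
The way limit and similarity_threshold shape a result set can be sketched as below. This is a minimal illustration, not the package's implementation: it assumes cosine similarity as the score (the README does not name the metric), and `IndexedSnippet`, `cosineSimilarity`, and `topMatches` are hypothetical names.

```typescript
// Hypothetical shape of an indexed snippet; not the package's actual type.
interface IndexedSnippet {
  filepath: string;
  embedding: number[];
}

// Cosine similarity between two vectors of the same dimension.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep candidates at or above the threshold, best first, capped at `limit`.
// Defaults mirror the documented ones (limit 5, threshold 0.6, limit range 1-20).
function topMatches(
  query: number[],
  candidates: IndexedSnippet[],
  limit = 5,
  similarityThreshold = 0.6,
): { filepath: string; score: number }[] {
  const boundedLimit = Math.min(20, Math.max(1, Math.floor(limit)));
  return candidates
    .map((c) => ({ filepath: c.filepath, score: cosineSimilarity(query, c.embedding) }))
    .filter((r) => r.score >= similarityThreshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, boundedLimit);
}
```

Raising similarity_threshold narrows the result set before limit is applied, so a high threshold can return fewer results than limit allows.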
get_code_snippet

Retrieve a specific code snippet from a file with line-level precision.

Parameters:
- project_path (required): Root directory of the git repository
- filepath (required): Relative path to the file
- start_line (optional): Starting line number (1-indexed)
- end_line (optional): Ending line number
- include_line_numbers (optional): Show line numbers (default: true)

Example:
{
"project_path": "/workspace/myapp",
"filepath": "src/auth/index.ts",
"start_line": 10,
"end_line": 45,
"include_line_numbers": true
}
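
The line-range semantics can be sketched as follows. This is an illustration under stated assumptions rather than the tool's actual code: it treats end_line as inclusive and prefixes lines as `N: text`, neither of which the README specifies, and `extractSnippet` is a hypothetical helper.

```typescript
// Returns lines startLine..endLine (1-indexed, end assumed inclusive) of `source`.
// When endLine is omitted, the snippet runs to the end of the file.
function extractSnippet(
  source: string,
  startLine = 1,
  endLine?: number,
  includeLineNumbers = true,
): string {
  const lines = source.split("\n");
  const first = Math.max(1, startLine);
  const last = Math.min(lines.length, endLine ?? lines.length);
  if (first > last) return "";
  return lines
    .slice(first - 1, last)
    // The "N: " prefix is an assumed display format.
    .map((text, offset) => (includeLineNumbers ? `${first + offset}: ${text}` : text))
    .join("\n");
}
```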
sync_codebase

Index or re-index a git repository into the vector database.

Parameters:
- project_path (required): Root directory of the git repository
- branch (optional): Branch to sync (default: current branch)
- file_extensions (optional): Only sync specific file types
- background (optional): Queue as background job (default: false)
- force (optional): Force full re-index from scratch (default: false)

Example:
{
"project_path": "/workspace/myapp",
"force": false,
"background": true
}
update_codebase

Trigger indexing after code changes. Optionally commits to git.

Parameters:
- project_path (required): Root directory of the git repository
- commit_message (required): Message summarizing changes
- changed_files (required): Array of changed files with change type
- trigger_type (required): One of manual, post_generation, post_merge
- skip_git_commit (optional): Skip git commit (default: false)
- background (optional): Queue as background job (default: false)

Example:
{
"project_path": "/workspace/myapp",
"commit_message": "Update authentication module",
"changed_files": [
{ "path": "src/auth/index.ts", "change_type": "modified" },
{ "path": "src/auth/jwt.ts", "change_type": "added" }
],
"trigger_type": "manual",
"background": false
}
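
One way to build the changed_files array is to derive it from the output of `git status --porcelain`. The sketch below is an assumption-laden illustration, not part of the package: `changedFilesFromPorcelain` is a hypothetical helper, and it maps only to the two change_type values shown in the example above ("modified" and "added"), skipping every other status because the README does not list the accepted values.

```typescript
interface ChangedFile {
  path: string;
  change_type: "modified" | "added";
}

// Maps `git status --porcelain` (v1) lines of the form "XY path" to entries.
function changedFilesFromPorcelain(output: string): ChangedFile[] {
  const files: ChangedFile[] = [];
  for (const line of output.split("\n")) {
    if (line.length < 4) continue; // need "XY " plus at least one path character
    const status = line.slice(0, 2);
    const path = line.slice(3);
    // Deleted, renamed, copied, and unmerged entries are skipped here.
    if (/[DRCU]/.test(status)) continue;
    if (status === "??" || status.indexOf("A") !== -1) {
      files.push({ path, change_type: "added" });
    } else if (status.indexOf("M") !== -1) {
      files.push({ path, change_type: "modified" });
    }
  }
  return files;
}

// Example: assemble update_codebase arguments from porcelain output.
const updateArgs = {
  project_path: "/workspace/myapp",
  commit_message: "Update authentication module",
  changed_files: changedFilesFromPorcelain(" M src/auth/index.ts\n?? src/auth/jwt.ts\n"),
  trigger_type: "manual",
};
```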
| Variable | Default | Description |
|---|---|---|
| VECTOR_DB_PROVIDER | qdrant | Vector database type: qdrant, pinecone, chroma, milvus, postgres |
| EMBEDDING_DIMENSION | 1536 | Dimension of embedding vectors (auto-detected from model, rarely needed) |
| VECTOR_DB_COLLECTION_PREFIX | - | Optional prefix for collection names (useful for multi-tenant setups) |

| Variable | Default | Description |
|---|---|---|
| QDRANT_URL | http://localhost:6333 | Qdrant server URL |
| QDRANT_API_KEY | - | Qdrant API key (for cloud/managed instances) |
| QDRANT_COLLECTION | code_snippets | Collection name for storing embeddings |

| Variable | Default | Description |
|---|---|---|
| PINECONE_API_KEY | - | Pinecone API key (required) |
| PINECONE_ENVIRONMENT | - | Pinecone environment/region (required) |
| PINECONE_INDEX | code-snippets | Pinecone index name |

| Variable | Default | Description |
|---|---|---|
| CHROMA_URL | http://localhost | Chroma server URL |
| CHROMA_PORT | 8000 | Chroma server port |
| CHROMA_COLLECTION | code_snippets | Collection name for storing embeddings |

| Variable | Default | Description |
|---|---|---|
| MILVUS_HOST | localhost | Milvus server host |
| MILVUS_PORT | 19530 | Milvus server port |
| MILVUS_COLLECTION | code_snippets | Collection name for storing embeddings |

| Variable | Default | Description |
|---|---|---|
| DATABASE_URL | - | PostgreSQL connection string (required) |
| POSTGRES_VECTOR_TABLE | code_snippets_vectors | Table name for storing vectors |
| POSTGRES_EMBEDDING_COLUMN | embedding | Column name for embedding vectors |
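
The variables marked "(required)" above can be checked before starting the server. The following is a minimal sketch of such a pre-flight check, not the server's own validation logic: `missingVectorDbVars` and `REQUIRED_BY_PROVIDER` are hypothetical names, and the required sets are taken only from the tables above.

```typescript
// Required variables per provider, as marked "(required)" in the tables above.
// Providers whose variables all have defaults map to an empty list.
const REQUIRED_BY_PROVIDER: Record<string, string[]> = {
  qdrant: [],
  pinecone: ["PINECONE_API_KEY", "PINECONE_ENVIRONMENT"],
  chroma: [],
  milvus: [],
  postgres: ["DATABASE_URL"],
};

// Returns the names of required variables that are unset or empty.
function missingVectorDbVars(env: Record<string, string | undefined>): string[] {
  const provider = env["VECTOR_DB_PROVIDER"] ?? "qdrant"; // documented default
  const required = REQUIRED_BY_PROVIDER[provider];
  if (required === undefined) {
    throw new Error(`Unknown VECTOR_DB_PROVIDER: ${provider}`);
  }
  return required.filter((name) => !env[name]);
}
```

In a Node.js script this could be called as `missingVectorDbVars(process.env)` and the result printed before launching `npx @devpuccino/mcp-git-codebase`.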

| Variable | Default | Description |
|---|---|---|
| EMBEDDING_PROVIDER | ollama | Embedding provider: openai, ollama |
| EMBEDDING_DIMENSION | 1536 | Dimension of embedding vectors (auto-detected from model if not set) |
| EMBEDDING_TIMEOUT | 30000 | Timeout for embedding API requests (milliseconds) |
| EMBEDDING_BATCH_SIZE | 10 | Number of items to embed per batch |
| EMBEDDING_MAX_RETRIES | 3 | Maximum retry attempts for failed embedding requests |
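
To illustrate what EMBEDDING_BATCH_SIZE controls, the sketch below splits a list of chunks into batches of that size. It is an assumed illustration of batching in general, not the package's indexing code, and `toBatches` is a hypothetical helper.

```typescript
// Splits `items` into consecutive batches of at most `batchSize` elements.
// The default of 10 mirrors the documented EMBEDDING_BATCH_SIZE default.
function toBatches<T>(items: T[], batchSize = 10): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new Error("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```

With 25 chunks and the default batch size, this yields batches of 10, 10, and 5, so one indexing pass would issue three embedding requests.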

| Variable | Default | Description |
|---|---|---|
| OPENAI_API_KEY | - | OpenAI API key (required for OpenAI provider) |
| OPENAI_EMBEDDING_MODEL | text-embedding-3-small | OpenAI embedding model to use |
| OPENAI_BASE_URL | https://api.openai.com | OpenAI API base URL (for custom endpoints) |

| Variable | Default | Description |
|---|---|---|
| OLLAMA_BASE_URL | http://localhost:11434 | Ollama server URL |
| OLLAMA_EMBEDDING_MODEL | bge-base-en-v1.5 | Ollama embedding model to use |

Common Ollama embedding models:
- bge-base-en-v1.5 (768 dimensions) - default, good balance
- bge-large-en-v1.5 (1024 dimensions) - higher quality
- nomic-embed-text (768 dimensions) - fast and efficient
- mxbai-embed-large (1024 dimensions) - high quality

| Variable | Default | Description |
|---|---|---|
| LLM_PROVIDER | ollama | LLM provider for code analysis: openai, ollama |
| LLM_TIMEOUT | 8000 | Timeout for LLM API requests (milliseconds) |
| LLM_MAX_RETRIES | 2 | Maximum retry attempts for failed LLM requests |
| INDEXING_LLM_ENABLED | true | Enable LLM-based metadata generation during indexing |

| Variable | Default | Description |
|---|---|---|
| OPENAI_LLM_MODEL | gpt-4o-mini | OpenAI model for code analysis and summaries |

| Variable | Default | Description |
|---|---|---|
| OLLAMA_MODEL | qwen2.5-coder:7b | Ollama model for code analysis and summaries |
| OLLAMA_TIMEOUT | 30000 | Timeout for Ollama API requests (milliseconds) |
| OLLAMA_MAX_RETRIES | 3 | Maximum retry attempts for failed Ollama requests |

Common Ollama LLM models:
- qwen2.5-coder:7b - default, excellent for code analysis
- mistral - fast and capable, good for quick tasks
- llama3 - Meta's Llama 3, general purpose
- codellama - Meta's Code Llama, specialized for code generation

| Variable | Default | Description |
|---|---|---|
| ENABLE_RERANKING | false | Enable composite reranking for improved search results |
| RERANKER_TYPE | bm25 | Reranker type: bm25 (keyword-based) or qwen3 (semantic) |
| RERANK_API_URL | - | Reranker API endpoint (required if RERANKER_TYPE=qwen3) |
| RERANK_TIMEOUT_MS | 5000 | Request timeout in milliseconds |
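
For readers unfamiliar with keyword-based reranking, the sketch below shows a textbook Okapi BM25 scorer applied to a handful of candidate snippets. It is a generic illustration under assumptions, not the package's reranker: the tokenizer, the parameters k1 = 1.2 and b = 0.75, and the function names are all choices made for this example.

```typescript
// Lowercase word tokenizer; identifiers with underscores stay whole.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9_]+/g) ?? [];
}

// Okapi BM25 score of each document against the query.
function bm25Scores(query: string, documents: string[], k1 = 1.2, b = 0.75): number[] {
  const docs = documents.map(tokenize);
  const avgLength = docs.reduce((sum, d) => sum + d.length, 0) / Math.max(1, docs.length);
  const queryTerms = Array.from(new Set(tokenize(query)));

  // Number of documents containing each query term.
  const docFreq = new Map<string, number>();
  for (const term of queryTerms) {
    docFreq.set(term, docs.filter((d) => d.indexOf(term) !== -1).length);
  }

  return docs.map((doc) => {
    let score = 0;
    for (const term of queryTerms) {
      const tf = doc.filter((t) => t === term).length;
      if (tf === 0) continue;
      const n = docFreq.get(term) ?? 0;
      const idf = Math.log(1 + (docs.length - n + 0.5) / (n + 0.5));
      const lengthNorm = 1 - b + b * (doc.length / avgLength);
      score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
    }
    return score;
  });
}

// Reorders documents by descending BM25 score.
function rerank(query: string, documents: string[]): string[] {
  const scores = bm25Scores(query, documents);
  return documents
    .map((text, i) => ({ text, score: scores[i] }))
    .sort((x, y) => y.score - x.score)
    .map((r) => r.text);
}
```

Because BM25 needs only the query and the candidate texts, it runs locally, which matches the bm25 option requiring no external API.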
Reranker Types:
- bm25 - Keyword-based reranking (fast, no external API needed)
- qwen3 - Semantic reranking using Qwen3 model (requires RERANK_API_URL)

| Variable | Default | Description |
|---|---|---|
| REDIS_URL | - | Full Redis connection URL (e.g., redis://localhost:6379). Takes precedence over individual settings |
| REDIS_HOST | localhost | Redis server host (used if REDIS_URL not set) |
| REDIS_PORT | 6379 | Redis server port (used if REDIS_URL not set) |
| REDIS_PASSWORD | - | Redis password for authentication (optional) |
| REDIS_DB | 0 | Redis database number (0-15) |

| Variable | Default | Description |
|---|---|---|
| CONSUMER_CONCURRENCY | 1 | Number of concurrent jobs to process |
| PROCESSING_TIMEOUT | 300000 | Job timeout in milliseconds (default: 5 minutes) |
| STARTUP_BATCH_ENABLED | true | Enable batch processing of queued jobs on startup |
| STARTUP_BATCH_LIMIT | 50 | Maximum jobs to process in startup batch |

Note: Background processing requires a running Redis server. Use REDIS_URL for simple setups or individual settings (REDIS_HOST, REDIS_PORT, REDIS_PASSWORD, REDIS_DB) for more control.
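
The precedence rule described here (REDIS_URL wins over the individual settings) can be sketched as below. This is an illustrative reading of the table, not the server's actual configuration loader, and `resolveRedis` and `RedisConnection` are hypothetical names.

```typescript
type RedisConnection =
  | { url: string }
  | { host: string; port: number; db: number; password?: string };

// REDIS_URL takes precedence; otherwise fall back to the individual
// settings with their documented defaults.
function resolveRedis(env: Record<string, string | undefined>): RedisConnection {
  const url = env["REDIS_URL"];
  if (url) return { url };

  const connection: { host: string; port: number; db: number; password?: string } = {
    host: env["REDIS_HOST"] ?? "localhost",
    port: Number(env["REDIS_PORT"] ?? "6379"),
    db: Number(env["REDIS_DB"] ?? "0"),
  };
  const password = env["REDIS_PASSWORD"];
  if (password) connection.password = password;
  return connection;
}
```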

| Variable | Default | Description |
|---|---|---|
| LOG_LEVEL | info | Log level: debug, info, warn, error |
| LOG_FORMAT | json | Log format: json or text |

┌─────────────────────────────────────────────────────────┐
│ Claude Code / MCP Client │
└──────────────────────┬──────────────────────────────────┘
│ MCP Protocol (JSON-RPC)
│
┌──────────────────────▼──────────────────────────────────┐
│ MCP Git Codebase Server │
│ ┌─────────────┬──────────────┬────────────────────┐ │
│ │ Tools │ Indexing │ Background Jobs │ │
│ │ (4 tools) │ Pipeline │ (Bull + Redis) │ │
│ └─────────────┴──────────────┴────────────────────┘ │
└──────────────────────┬──────────────────────────────────┘
│
┌──────────────┼──────────────┐
│ │ │
▼ ▼ ▼
Git Repo Vector Database Embedding Model
(local) (Qdrant/etc) (OpenAI/Ollama)
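
The "MCP Protocol (JSON-RPC)" arrow in the diagram carries tool invocations as JSON-RPC 2.0 messages. The sketch below builds one such message for query_codebase using the MCP `tools/call` method; the request id and the helper name `buildToolCall` are illustrative, and in practice an MCP client such as Claude Code constructs these messages for you.

```typescript
// Shape of an MCP tool invocation sent as a JSON-RPC 2.0 request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Example request for the query_codebase tool documented above.
const request = buildToolCall(1, "query_codebase", {
  query_sentence: "function to authenticate users with JWT tokens",
  project_path: "/workspace/myapp",
  limit: 5,
});
console.log(JSON.stringify(request, null, 2));
```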
Indexing Pipeline
Query Pipeline
Background Processing
# Start Qdrant (requires Docker)
docker run -p 6333:6333 qdrant/qdrant
# Set environment variables
export VECTOR_DB_PROVIDER=qdrant
export QDRANT_URL=http://localhost:6333
export EMBEDDING_PROVIDER=ollama
export OLLAMA_BASE_URL=http://localhost:11434
export OLLAMA_EMBEDDING_MODEL=nomic-embed-text
# Start server
npx @devpuccino/mcp-git-codebase
# Set environment variables
export VECTOR_DB_PROVIDER=pinecone
export PINECONE_API_KEY=your-production-key
export PINECONE_ENVIRONMENT=your-environment
export PINECONE_INDEX=prod-codebase
export EMBEDDING_PROVIDER=openai
export OPENAI_API_KEY=sk-...
export OPENAI_EMBEDDING_MODEL=text-embedding-3-small
# Start server
npx @devpuccino/mcp-git-codebase
# Set environment variables
export VECTOR_DB_PROVIDER=postgres
export DATABASE_URL=postgresql://user:password@host:5432/codebase_db
export EMBEDDING_PROVIDER=openai
export OPENAI_API_KEY=sk-...
export OPENAI_EMBEDDING_MODEL=text-embedding-3-small
# Start server
npx @devpuccino/mcp-git-codebase
Optimization Tips:
- Use background=true for large codebases
- Tune CONSUMER_CONCURRENCY based on resources
- Use update_codebase to index changes after code edits
- Use file_extensions to reduce scope
- Raise similarity_threshold if too many results

# Verify vector database is running
curl http://localhost:6333/healthz # Qdrant
curl http://localhost:8000/api/v1/heartbeat # Chroma
# Check logs
export LOG_LEVEL=debug
npx @devpuccino/mcp-git-codebase
# Verify Ollama is running
curl http://localhost:11434/api/tags
# Or verify OpenAI API key
echo $OPENAI_API_KEY
- Adjust CONSUMER_CONCURRENCY
- Use background=true for large syncs

# Clone repository
git clone https://github.com/devpuccino/mcp-git-codebase.git
cd mcp-git-codebase
# Install dependencies
npm install
# Build
npm run build
# Run tests
npm test
# Start in development mode
npm run dev
MIT
For issues, questions, or feature requests, please visit the GitHub repository.
FAQs
MCP server providing semantic code search and indexing for git repositories
We found that @devpuccino/mcp-git-codebase demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.