Raggo - Retrieval Augmented Generation Library
A flexible RAG (Retrieval Augmented Generation) library for Go, designed to make document processing and context-aware AI interactions simple and efficient.
🔍 Smart Document Search • 💬 Context-Aware Responses • 🤖 Intelligent RAG

Quick Start
```go
package main

import (
    "context"
    "fmt"

    "github.com/teilomillet/raggo"
)

func main() {
    rag, err := raggo.NewSimpleRAG(raggo.DefaultConfig())
    if err != nil {
        fmt.Printf("Error: %v\n", err)
        return
    }
    defer rag.Close()

    err = rag.AddDocuments(context.Background(), "./docs")
    if err != nil {
        fmt.Printf("Error: %v\n", err)
        return
    }

    response, err := rag.Search(context.Background(), "What are the key features?")
    if err != nil {
        fmt.Printf("Error: %v\n", err)
        return
    }
    fmt.Printf("Answer: %s\n", response)
}
```
Configuration
Raggo provides a flexible configuration system that can be loaded from multiple sources (environment variables, JSON files, or programmatic defaults):
```go
// Load configuration from environment variables or a JSON file.
cfg, err := config.LoadConfig()
if err != nil {
    log.Fatal(err)
}

// Alternatively, build the configuration in code:
cfg = &config.Config{
    Provider:            "milvus",
    Model:               "text-embedding-3-small",
    Collection:          "my_documents",
    DefaultTopK:         5,
    DefaultMinScore:     0.7,
    DefaultChunkSize:    300,
    DefaultChunkOverlap: 50,
}

rag, err := raggo.NewSimpleRAG(cfg)
```
Configuration can be saved for reuse:
```go
err := cfg.Save("~/.raggo/config.json")
```
Environment variables (take precedence over config files):

- `RAGGO_PROVIDER`: Service provider
- `RAGGO_MODEL`: Model identifier
- `RAGGO_COLLECTION`: Collection name
- `RAGGO_API_KEY`: Default API key
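Because environment values win over file values, overrides can be applied at runtime before loading. A minimal sketch, assuming `config.LoadConfig` reads the RAGGO_* variables as described above:

```go
// Environment variables take precedence over ~/.raggo/config.json.
os.Setenv("RAGGO_PROVIDER", "milvus")
os.Setenv("RAGGO_MODEL", "text-embedding-3-small")

cfg, err := config.LoadConfig() // picks up the RAGGO_* overrides
if err != nil {
    log.Fatal(err)
}
```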
Table of Contents

- Part 1: Core Components
- Part 2: RAG Implementations
Part 1: Core Components
Quick Start
Prerequisites
```bash
export OPENAI_API_KEY=your-api-key
go get github.com/teilomillet/raggo
```
Building Blocks
Document Loading
```go
loader := raggo.NewLoader(raggo.SetTimeout(1 * time.Minute))
doc, err := loader.LoadURL(context.Background(), "https://example.com/doc.pdf")
```
Text Parsing
```go
parser := raggo.NewParser()
doc, err := parser.Parse("document.pdf")
```
Text Chunking
```go
chunker := raggo.NewChunker(raggo.ChunkSize(100))
chunks := chunker.Chunk(doc.Content)
```
Embeddings
```go
embedder := raggo.NewEmbedder(
    raggo.SetProvider("openai"),
    raggo.SetModel("text-embedding-3-small"),
)
```
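The embedder turns chunks into vectors. Continuing from the chunking step above (CreateEmbeddings is the same call used in the full pipeline example below):

```go
embeddings, err := embedder.CreateEmbeddings(chunks)
if err != nil {
    log.Fatal(err)
}
```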
Vector Storage
```go
db := raggo.NewVectorDB(raggo.WithMilvus("collection"))
```
Part 2: RAG Implementations
Simple RAG
Best for straightforward document Q&A:
```go
package main

import (
    "context"
    "log"

    "github.com/teilomillet/raggo"
)

func main() {
    rag, err := raggo.NewSimpleRAG(raggo.SimpleRAGConfig{
        Collection: "docs",
        Model:      "text-embedding-3-small",
        ChunkSize:  300,
        TopK:       3,
    })
    if err != nil {
        log.Fatal(err)
    }
    defer rag.Close()

    err = rag.AddDocuments(context.Background(), "./documents")
    if err != nil {
        log.Fatal(err)
    }

    // Plain semantic search.
    basicResponse, _ := rag.Search(context.Background(), "What is the main feature?")
    // Hybrid search blends semantic and keyword matching.
    hybridResponse, _ := rag.SearchHybrid(context.Background(), "How does it work?", 0.7)

    log.Printf("Basic Search: %s\n", basicResponse)
    log.Printf("Hybrid Search: %s\n", hybridResponse)
}
```
Contextual RAG
For complex document understanding and context-aware responses:
```go
package main

import (
    "context"
    "fmt"
    "os"
    "path/filepath"

    "github.com/teilomillet/raggo"
)

func main() {
    rag, err := raggo.NewDefaultContextualRAG("basic_contextual_docs")
    if err != nil {
        fmt.Printf("Failed to initialize RAG: %v\n", err)
        os.Exit(1)
    }
    defer rag.Close()

    docsPath := filepath.Join("examples", "docs")
    if err := rag.AddDocuments(context.Background(), docsPath); err != nil {
        fmt.Printf("Failed to add documents: %v\n", err)
        os.Exit(1)
    }

    query := "What are the key features of the product?"
    response, err := rag.Search(context.Background(), query)
    if err != nil {
        fmt.Printf("Failed to search: %v\n", err)
        os.Exit(1)
    }
    fmt.Printf("\nQuery: %s\nResponse: %s\n", query, response)
}
```
Advanced Configuration
```go
config := &raggo.ContextualRAGConfig{
    Collection:   "advanced_contextual_docs",
    Model:        "text-embedding-3-small", // embedding model
    LLMModel:     "gpt-4o-mini",            // LLM used alongside the embedder
    ChunkSize:    300,
    ChunkOverlap: 75,
    TopK:         5,
    MinScore:     0.7,
}

rag, err := raggo.NewContextualRAG(config)
if err != nil {
    log.Fatalf("Failed to initialize RAG: %v", err)
}
defer rag.Close()
```
Memory Context
For chat applications and long-term context retention:
```go
package main

import (
    "context"
    "log"
    "os"

    "github.com/teilomillet/gollm"
    "github.com/teilomillet/raggo"
)

func main() {
    memoryCtx, err := raggo.NewMemoryContext(
        os.Getenv("OPENAI_API_KEY"),
        raggo.MemoryTopK(5),
        raggo.MemoryCollection("chat"),
        raggo.MemoryStoreLastN(100),
        raggo.MemoryMinScore(0.7),
    )
    if err != nil {
        log.Fatal(err)
    }
    defer memoryCtx.Close()

    rag, err := raggo.NewContextualRAG(&raggo.ContextualRAGConfig{
        Collection: "docs",
        Model:      "text-embedding-3-small",
    })
    if err != nil {
        log.Fatal(err)
    }
    defer rag.Close()

    // Store the conversation so later prompts can draw on it.
    messages := []gollm.MemoryMessage{
        {Role: "user", Content: "How does the authentication system work?"},
    }
    err = memoryCtx.StoreMemory(context.Background(), messages)
    if err != nil {
        log.Fatal(err)
    }

    // Enrich the prompt with relevant memories before searching.
    prompt := &gollm.Prompt{Messages: messages}
    enhanced, _ := memoryCtx.EnhancePrompt(context.Background(), prompt, messages)
    response, _ := rag.Search(context.Background(), enhanced.Messages[0].Content)
    log.Printf("Response: %s\n", response)
}
```
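To keep the memory current across turns, the model's reply can be stored with the same StoreMemory call. A sketch continuing the example above (the "assistant" role is assumed to follow gollm's message conventions):

```go
reply := []gollm.MemoryMessage{
    {Role: "assistant", Content: response},
}
if err := memoryCtx.StoreMemory(context.Background(), reply); err != nil {
    log.Fatal(err)
}
```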
Advanced Use Cases
Full Processing Pipeline
Process large document sets with rate limiting and concurrent processing:
```go
package main

import (
    "context"
    "log"
    "path/filepath"
    "sync"

    "github.com/teilomillet/raggo"
    "golang.org/x/time/rate"
)

const (
    GPT_RPM_LIMIT  = 5000    // requests per minute
    GPT_TPM_LIMIT  = 4000000 // tokens per minute (see the token limiter sketch below)
    MAX_CONCURRENT = 10
)

func main() {
    parser := raggo.NewParser()
    chunker := raggo.NewChunker(raggo.ChunkSize(500))
    embedder := raggo.NewEmbedder(
        raggo.SetProvider("openai"),
        raggo.SetModel("text-embedding-3-small"),
    )

    // Convert the per-minute request budget to a per-second rate.
    limiter := rate.NewLimiter(rate.Limit(GPT_RPM_LIMIT/60), GPT_RPM_LIMIT)
    var wg sync.WaitGroup
    semaphore := make(chan struct{}, MAX_CONCURRENT) // caps in-flight goroutines

    files, _ := filepath.Glob("./documents/*.pdf")
    for _, file := range files {
        wg.Add(1)
        semaphore <- struct{}{}
        go func(file string) {
            defer wg.Done()
            defer func() { <-semaphore }()

            limiter.Wait(context.Background())
            doc, _ := parser.Parse(file)
            chunks := chunker.Chunk(doc.Content)
            embeddings, _ := embedder.CreateEmbeddings(chunks)
            _ = embeddings // store or index the embeddings here
            log.Printf("Processed %s: %d chunks\n", file, len(chunks))
        }(file)
    }
    wg.Wait()
}
```
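GPT_TPM_LIMIT is declared but not wired up above; a second limiter can enforce the token budget. A minimal sketch continuing the example (estimateTokens is a hypothetical helper you would supply, e.g. roughly len(text)/4):

```go
tokenLimiter := rate.NewLimiter(rate.Limit(GPT_TPM_LIMIT/60), GPT_TPM_LIMIT)

// Before each embedding call, reserve the estimated token cost.
cost := estimateTokens(chunks) // hypothetical helper, not part of raggo
if err := tokenLimiter.WaitN(context.Background(), cost); err != nil {
    log.Fatal(err)
}
```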
Best Practices
Resource Management
- Always use defer Close()
- Monitor memory usage
- Clean up old data

Performance
- Use concurrent processing for large datasets
- Configure appropriate chunk sizes (see the sketch after this list)
- Enable hybrid search when needed

Context Management
- Use Memory Context for chat applications
- Configure context window size
- Clean up old memories periodically
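Chunk size and overlap are the most common tuning knobs. A minimal sketch using the ContextualRAGConfig fields shown earlier (the values are illustrative starting points, not recommendations):

```go
rag, err := raggo.NewContextualRAG(&raggo.ContextualRAGConfig{
    Collection:   "tuned_docs",
    Model:        "text-embedding-3-small",
    ChunkSize:    300, // larger chunks carry more context per embedding
    ChunkOverlap: 75,  // overlap preserves continuity across chunk boundaries
    TopK:         5,
    MinScore:     0.7,
})
if err != nil {
    log.Fatal(err)
}
defer rag.Close()
```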
Examples
Check /examples for more:

- Basic usage: /examples/simple/
- Context-aware: /examples/contextual/
- Chat applications: /examples/chat/
- Memory usage: /examples/memory_enhancer_example.go
- Full pipeline: /examples/full_process.go
- Benchmarks: /examples/process_embedding_benchmark.go
License
MIT License - see LICENSE file