# LightRAG MCP Integration

**Status:** ✅ Production Ready
**Version:** 1.0.0
**Date:** 2025-09-30

## Overview

CHORUS now includes optional LightRAG integration for Retrieval-Augmented Generation (RAG). LightRAG provides graph-based knowledge retrieval to enrich AI reasoning and context resolution.

## Architecture

### Components

1. **LightRAG Client** (`pkg/mcp/lightrag_client.go`)
   - HTTP client for the LightRAG MCP server
   - Supports 4 query modes: naive, local, global, hybrid
   - Health checking and document insertion
   - Configurable timeouts and API authentication

2. **Reasoning Engine Integration** (`reasoning/reasoning.go`)
   - `GenerateResponseWithRAG()` - RAG-enriched response generation
   - `GenerateResponseSmartWithRAG()` - Combines model selection + RAG
   - `SetLightRAGClient()` - Configure RAG client
   - Non-fatal error handling (degrades gracefully)

3. **SLURP Context Enrichment** (`pkg/slurp/context/lightrag.go`)
   - `LightRAGEnricher` - Enriches context nodes with RAG data
   - `EnrichContextNode()` - Add insights to individual nodes
   - `EnrichResolvedContext()` - Enrich resolved context chains
   - `InsertContextNode()` - Build knowledge base over time

4. **Configuration** (`pkg/config/config.go`)
   - `LightRAGConfig` struct with 5 configuration options
   - Environment variable support
   - Automatic initialization in runtime

## Configuration

### Environment Variables

```bash
# Enable LightRAG integration
CHORUS_LIGHTRAG_ENABLED=true

# LightRAG server endpoint
CHORUS_LIGHTRAG_BASE_URL=http://127.0.0.1:9621

# Query timeout
CHORUS_LIGHTRAG_TIMEOUT=30s

# Optional API key
CHORUS_LIGHTRAG_API_KEY=your-api-key

# Default query mode (naive, local, global, hybrid)
CHORUS_LIGHTRAG_DEFAULT_MODE=hybrid
```

### Docker Configuration

```yaml
services:
  chorus-agent:
    environment:
      - CHORUS_LIGHTRAG_ENABLED=true
      - CHORUS_LIGHTRAG_BASE_URL=http://lightrag:9621
      - CHORUS_LIGHTRAG_DEFAULT_MODE=hybrid
    depends_on:
      - lightrag

  lightrag:
    image: lightrag/lightrag:latest
    ports:
      - "9621:9621"
    volumes:
      - lightrag-data:/app/data

# Named volumes must be declared at the top level
volumes:
  lightrag-data:
```

## Query Modes

LightRAG supports 4 query modes with different retrieval strategies:

1. **Naive Mode** (`QueryModeNaive`)
   - Simple semantic search
   - Fastest, least context
   - Use for: Quick lookups

2. **Local Mode** (`QueryModeLocal`)
   - Local graph traversal
   - Context from immediate neighbors
   - Use for: Related information

3. **Global Mode** (`QueryModeGlobal`)
   - Global graph analysis
   - Broad context from entire knowledge base
   - Use for: High-level questions

4. **Hybrid Mode** (`QueryModeHybrid`) ⭐ **Recommended**
   - Combined approach
   - Balances breadth and depth
   - Use for: General purpose RAG

## Usage Examples

### Reasoning Engine with RAG

```go
import (
    "context"
    "time"

    "chorus/pkg/mcp"
    "chorus/reasoning"
)

// Initialize LightRAG client
config := mcp.LightRAGConfig{
    BaseURL: "http://127.0.0.1:9621",
    Timeout: 30 * time.Second,
}
client := mcp.NewLightRAGClient(config)

// Configure reasoning engine
reasoning.SetLightRAGClient(client)

// Generate RAG-enriched response
ctx := context.Background()
response, err := reasoning.GenerateResponseWithRAG(
    ctx,
    "meta/llama-3.1-8b-instruct",
    "How does CHORUS handle P2P networking?",
    mcp.QueryModeHybrid,
)
```

### SLURP Context Enrichment

```go
import (
    "context"

    "chorus/pkg/mcp"
    // Aliased so it does not collide with the standard library "context" package
    slurpcontext "chorus/pkg/slurp/context"
)

// Create enricher (client and ctx are set up as in the previous example)
enricher := slurpcontext.NewLightRAGEnricher(client, "hybrid")

// Enrich a context node
node := &slurpcontext.ContextNode{
    Path:    "/pkg/p2p",
    Summary: "P2P networking implementation",
    Purpose: "Provides libp2p networking layer",
}

err := enricher.EnrichContextNode(ctx, node)
// node.Insights now contains RAG-retrieved information

// Insert for future retrieval
err = enricher.InsertContextNode(ctx, node)
```

### Direct LightRAG Client

```go
import (
    "context"

    "chorus/pkg/mcp"
)

// config and ctx are set up as in the first example
client := mcp.NewLightRAGClient(config)

// Health check
healthy := client.IsHealthy(ctx)

// Query with response
response, err := client.Query(ctx, "query", mcp.QueryModeHybrid)

// Get context only (named ragContext to avoid shadowing the context package)
ragContext, err := client.GetContext(ctx, "query", mcp.QueryModeHybrid)

// Insert document (plain assignment: err is already declared above)
err = client.Insert(ctx, "text content", "description")
```

## Integration Points

### Runtime Initialization

LightRAG is initialized automatically in `internal/runtime/shared.go`:

```go
// Line 685-704
if cfg.LightRAG.Enabled {
    lightragConfig := mcp.LightRAGConfig{
        BaseURL: cfg.LightRAG.BaseURL,
        Timeout: cfg.LightRAG.Timeout,
        APIKey:  cfg.LightRAG.APIKey,
    }
    lightragClient := mcp.NewLightRAGClient(lightragConfig)
    if lightragClient.IsHealthy(ctx) {
        reasoning.SetLightRAGClient(lightragClient)
        logger.Info("📚 LightRAG RAG system enabled")
    }
}
```

### Graceful Degradation

LightRAG integration is **completely optional** and **non-blocking**:

- If `CHORUS_LIGHTRAG_ENABLED=false`, no LightRAG calls are made
- If the LightRAG server is unavailable, the health check fails gracefully
- If RAG queries fail, the reasoning engine falls back to non-RAG generation
- SLURP enrichment failures are logged but don't block context resolution

## Testing

### Unit Tests

```bash
# Run all LightRAG tests (requires running server)
go test -v ./pkg/mcp/

# Run only unit tests (no server required)
go test -v -short ./pkg/mcp/
```

### Integration Tests

```bash
# Start LightRAG server (keep it running, e.g. in a separate terminal)
cd ~/chorus/mcp-include/LightRAG
python main.py

# Run integration tests
cd ~/chorus/project-queues/active/CHORUS
go test -v ./pkg/mcp/ -run TestLightRAGClient
```

## Performance Considerations

### Query Timeouts

- Default: 30 seconds
- Hybrid mode is typically the slowest (it combines local and global retrieval)
- Naive mode is fastest (simple semantic search)

### Caching

LightRAG includes internal caching:

- Repeated queries return cached results
- Cache TTL is managed by the LightRAG server
- No CHORUS-side caching required

### Resource Usage

- Memory: Proportional to knowledge base size
- CPU: Query modes have different compute requirements
- Network: HTTP requests to the LightRAG server

## Troubleshooting

### Server Not Healthy

**Symptom:** `LightRAG enabled but server not healthy`

**Solutions:**

1. Check if the LightRAG server is running: `curl http://127.0.0.1:9621/health`
2. Verify the port in `CHORUS_LIGHTRAG_BASE_URL`
3. Check LightRAG logs for errors
4. Ensure network connectivity between CHORUS and LightRAG

### Empty Responses

**Symptom:** RAG queries return empty results

**Solutions:**

1. The knowledge base may be empty - insert documents first
2. The query may not match indexed content
3. Try a different query mode (hybrid recommended)
4. Check LightRAG indexing logs

### Timeout Errors

**Symptom:** `context deadline exceeded`

**Solutions:**

1. Increase `CHORUS_LIGHTRAG_TIMEOUT`
2. Use a faster query mode (naive or local)
3. Optimize LightRAG server performance
4. Check network latency

## Security Considerations

### API Authentication

Optional API key support:

```bash
CHORUS_LIGHTRAG_API_KEY=your-secret-key
```

Keys are sent as Bearer tokens in the `Authorization` header.

### Network Security

- Run LightRAG on an internal network only
- Use HTTPS for production deployments
- Consider firewall rules to restrict access
- LightRAG doesn't include built-in encryption

### Data Privacy

- All queries and documents are stored in LightRAG
- Consider what data is being indexed
- Implement data retention policies
- Use access control on the LightRAG server

## Monitoring

### Health Checks

```go
// Check LightRAG availability
if client.IsHealthy(ctx) {
    // Server is healthy
}

// Get detailed health info
health, err := client.Health(ctx)
// Returns: Status, CoreVersion, APIVersion, etc.
```

### Metrics

Consider adding:

- RAG query latency
- Cache hit rates
- Enrichment success/failure rates
- Knowledge base size

## Future Enhancements

Potential improvements:

1. **Batch Query Optimization**
   - Batch multiple RAG queries together
   - Reduce HTTP overhead

2. **Adaptive Query Mode Selection**
   - Automatically choose query mode based on question type
   - Learn from past query performance

3. **Knowledge Base Management**
   - Automated document insertion from SLURP contexts
   - Background indexing of code repositories
   - Scheduled knowledge base updates

4. **Advanced Caching**
   - CHORUS-side caching with TTL
   - Semantic cache (similar queries share cache)
   - Persistent cache across restarts

5. **Multi-tenant Support**
   - Per-agent knowledge bases
   - Role-based access to documents
   - Encrypted knowledge storage

## Files Changed

1. `pkg/mcp/lightrag_client.go` - NEW (277 lines)
2. `pkg/mcp/lightrag_client_test.go` - NEW (239 lines)
3. `pkg/config/config.go` - Modified (added LightRAGConfig)
4. `reasoning/reasoning.go` - Modified (added RAG functions)
5. `internal/runtime/shared.go` - Modified (added initialization)
6. `pkg/slurp/context/lightrag.go` - NEW (203 lines)

**Total:** 3 new files, 3 modified files, ~750 lines of code

## References

- LightRAG Documentation: https://github.com/HKUDS/LightRAG
- MCP Protocol Spec: https://spec.modelcontextprotocol.io
- CHORUS Documentation: `docs/comprehensive/`

---

**Maintainer:** CHORUS Project Team
**Last Updated:** 2025-09-30
**Status:** Production Ready