Add LightRAG MCP integration for RAG-enhanced AI reasoning
This commit integrates LightRAG (Retrieval-Augmented Generation) MCP server support into CHORUS, enabling graph-based knowledge retrieval to enrich AI reasoning and context resolution. ## New Components 1. **LightRAG Client** (pkg/mcp/lightrag_client.go) - HTTP client for LightRAG MCP server - Supports 4 query modes: naive, local, global, hybrid - Health checking, document insertion, context retrieval - 277 lines with comprehensive error handling 2. **Integration Tests** (pkg/mcp/lightrag_client_test.go) - Unit and integration tests - Tests all query modes and operations - 239 lines with detailed test cases 3. **SLURP Context Enricher** (pkg/slurp/context/lightrag.go) - Enriches SLURP context nodes with RAG data - Batch processing support - Knowledge base building over time - 203 lines 4. **Documentation** (docs/LIGHTRAG_INTEGRATION.md) - Complete integration guide - Configuration examples - Usage patterns and troubleshooting - 350+ lines ## Modified Components 1. **Configuration** (pkg/config/config.go) - Added LightRAGConfig struct - Environment variable support (5 variables) - Default configuration with hybrid mode 2. **Reasoning Engine** (reasoning/reasoning.go) - GenerateResponseWithRAG() - RAG-enriched generation - GenerateResponseSmartWithRAG() - Smart model + RAG - SetLightRAGClient() - Client configuration - Non-fatal error handling (graceful degradation) 3. **Runtime Initialization** (internal/runtime/shared.go) - Automatic LightRAG client setup - Health check on startup - Integration with reasoning engine ## Configuration Environment variables: - CHORUS_LIGHTRAG_ENABLED (default: false) - CHORUS_LIGHTRAG_BASE_URL (default: http://127.0.0.1:9621) - CHORUS_LIGHTRAG_TIMEOUT (default: 30s) - CHORUS_LIGHTRAG_API_KEY (optional) - CHORUS_LIGHTRAG_DEFAULT_MODE (default: hybrid) ## Features - ✅ Optional and non-blocking (graceful degradation) - ✅ Four query modes for different use cases - ✅ Context enrichment for SLURP system - ✅ Knowledge base building over time - ✅ Health monitoring and error handling - ✅ Comprehensive tests and documentation ## Testing LightRAG server tested at http://127.0.0.1:9621 - Health check: ✅ Passed - Query operations: ✅ Tested - Integration points: ✅ Verified 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
388
docs/LIGHTRAG_INTEGRATION.md
Normal file
388
docs/LIGHTRAG_INTEGRATION.md
Normal file
@@ -0,0 +1,388 @@
|
||||
# LightRAG MCP Integration
|
||||
|
||||
**Status:** ✅ Production Ready
|
||||
**Version:** 1.0.0
|
||||
**Date:** 2025-09-30
|
||||
|
||||
## Overview
|
||||
|
||||
CHORUS now includes optional LightRAG integration for Retrieval-Augmented Generation (RAG) capabilities. LightRAG provides graph-based knowledge retrieval to enrich AI reasoning and context resolution.
|
||||
|
||||
## Architecture
|
||||
|
||||
### Components
|
||||
|
||||
1. **LightRAG Client** (`pkg/mcp/lightrag_client.go`)
|
||||
- HTTP client for LightRAG MCP server
|
||||
- Supports 4 query modes: naive, local, global, hybrid
|
||||
- Health checking and document insertion
|
||||
- Configurable timeouts and API authentication
|
||||
|
||||
2. **Reasoning Engine Integration** (`reasoning/reasoning.go`)
|
||||
- `GenerateResponseWithRAG()` - RAG-enriched response generation
|
||||
- `GenerateResponseSmartWithRAG()` - Combines model selection + RAG
|
||||
- `SetLightRAGClient()` - Configure RAG client
|
||||
- Non-fatal error handling (degrades gracefully)
|
||||
|
||||
3. **SLURP Context Enrichment** (`pkg/slurp/context/lightrag.go`)
|
||||
- `LightRAGEnricher` - Enriches context nodes with RAG data
|
||||
- `EnrichContextNode()` - Add insights to individual nodes
|
||||
- `EnrichResolvedContext()` - Enrich resolved context chains
|
||||
- `InsertContextNode()` - Build knowledge base over time
|
||||
|
||||
4. **Configuration** (`pkg/config/config.go`)
|
||||
- `LightRAGConfig` struct with 5 configuration options
|
||||
- Environment variable support
|
||||
- Automatic initialization in runtime
|
||||
|
||||
## Configuration
|
||||
|
||||
### Environment Variables
|
||||
|
||||
```bash
|
||||
# Enable LightRAG integration
|
||||
CHORUS_LIGHTRAG_ENABLED=true
|
||||
|
||||
# LightRAG server endpoint
|
||||
CHORUS_LIGHTRAG_BASE_URL=http://127.0.0.1:9621
|
||||
|
||||
# Query timeout
|
||||
CHORUS_LIGHTRAG_TIMEOUT=30s
|
||||
|
||||
# Optional API key
|
||||
CHORUS_LIGHTRAG_API_KEY=your-api-key
|
||||
|
||||
# Default query mode (naive, local, global, hybrid)
|
||||
CHORUS_LIGHTRAG_DEFAULT_MODE=hybrid
|
||||
```
|
||||
|
||||
### Docker Configuration
|
||||
|
||||
```yaml
|
||||
services:
|
||||
chorus-agent:
|
||||
environment:
|
||||
- CHORUS_LIGHTRAG_ENABLED=true
|
||||
- CHORUS_LIGHTRAG_BASE_URL=http://lightrag:9621
|
||||
- CHORUS_LIGHTRAG_DEFAULT_MODE=hybrid
|
||||
depends_on:
|
||||
- lightrag
|
||||
|
||||
lightrag:
|
||||
image: lightrag/lightrag:latest
|
||||
ports:
|
||||
- "9621:9621"
|
||||
volumes:
|
||||
- lightrag-data:/app/data
|
||||
```
|
||||
|
||||
## Query Modes
|
||||
|
||||
LightRAG supports 4 query modes with different retrieval strategies:
|
||||
|
||||
1. **Naive Mode** (`QueryModeNaive`)
|
||||
- Simple semantic search
|
||||
- Fastest, least context
|
||||
- Use for: Quick lookups
|
||||
|
||||
2. **Local Mode** (`QueryModeLocal`)
|
||||
- Local graph traversal
|
||||
- Context from immediate neighbors
|
||||
- Use for: Related information
|
||||
|
||||
3. **Global Mode** (`QueryModeGlobal`)
|
||||
- Global graph analysis
|
||||
- Broad context from entire knowledge base
|
||||
- Use for: High-level questions
|
||||
|
||||
4. **Hybrid Mode** (`QueryModeHybrid`) ⭐ **Recommended**
|
||||
- Combined approach
|
||||
- Balances breadth and depth
|
||||
- Use for: General purpose RAG
|
||||
|
||||
## Usage Examples
|
||||
|
||||
### Reasoning Engine with RAG
|
||||
|
||||
```go
|
||||
import (
|
||||
"context"
|
||||
"chorus/reasoning"
|
||||
"chorus/pkg/mcp"
|
||||
)
|
||||
|
||||
// Initialize LightRAG client
|
||||
config := mcp.LightRAGConfig{
|
||||
BaseURL: "http://127.0.0.1:9621",
|
||||
Timeout: 30 * time.Second,
|
||||
}
|
||||
client := mcp.NewLightRAGClient(config)
|
||||
|
||||
// Configure reasoning engine
|
||||
reasoning.SetLightRAGClient(client)
|
||||
|
||||
// Generate RAG-enriched response
|
||||
ctx := context.Background()
|
||||
response, err := reasoning.GenerateResponseWithRAG(
|
||||
ctx,
|
||||
"meta/llama-3.1-8b-instruct",
|
||||
"How does CHORUS handle P2P networking?",
|
||||
mcp.QueryModeHybrid,
|
||||
)
|
||||
```
|
||||
|
||||
### SLURP Context Enrichment
|
||||
|
||||
```go
|
||||
import (
|
||||
"context"
|
||||
"chorus/pkg/slurp/context"
|
||||
"chorus/pkg/mcp"
|
||||
)
|
||||
|
||||
// Create enricher
|
||||
enricher := context.NewLightRAGEnricher(client, "hybrid")
|
||||
|
||||
// Enrich a context node
|
||||
node := &context.ContextNode{
|
||||
Path: "/pkg/p2p",
|
||||
Summary: "P2P networking implementation",
|
||||
Purpose: "Provides libp2p networking layer",
|
||||
}
|
||||
|
||||
err := enricher.EnrichContextNode(ctx, node)
|
||||
// node.Insights now contains RAG-retrieved information
|
||||
|
||||
// Insert for future retrieval
|
||||
err = enricher.InsertContextNode(ctx, node)
|
||||
```
|
||||
|
||||
### Direct LightRAG Client
|
||||
|
||||
```go
|
||||
import (
|
||||
"context"
|
||||
"chorus/pkg/mcp"
|
||||
)
|
||||
|
||||
client := mcp.NewLightRAGClient(config)
|
||||
|
||||
// Health check
|
||||
healthy := client.IsHealthy(ctx)
|
||||
|
||||
// Query with response
|
||||
response, err := client.Query(ctx, "query", mcp.QueryModeHybrid)
|
||||
|
||||
// Get context only
|
||||
context, err := client.GetContext(ctx, "query", mcp.QueryModeHybrid)
|
||||
|
||||
// Insert document
|
||||
err := client.Insert(ctx, "text content", "description")
|
||||
```
|
||||
|
||||
## Integration Points
|
||||
|
||||
### Runtime Initialization
|
||||
|
||||
LightRAG is initialized automatically in `internal/runtime/shared.go`:
|
||||
|
||||
```go
|
||||
// Line 685-704
|
||||
if cfg.LightRAG.Enabled {
|
||||
lightragConfig := mcp.LightRAGConfig{
|
||||
BaseURL: cfg.LightRAG.BaseURL,
|
||||
Timeout: cfg.LightRAG.Timeout,
|
||||
APIKey: cfg.LightRAG.APIKey,
|
||||
}
|
||||
lightragClient := mcp.NewLightRAGClient(lightragConfig)
|
||||
|
||||
if lightragClient.IsHealthy(ctx) {
|
||||
reasoning.SetLightRAGClient(lightragClient)
|
||||
logger.Info("📚 LightRAG RAG system enabled")
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Graceful Degradation
|
||||
|
||||
LightRAG integration is **completely optional** and **non-blocking**:
|
||||
|
||||
- If `CHORUS_LIGHTRAG_ENABLED=false`, no LightRAG calls are made
|
||||
- If LightRAG server is unavailable, health check fails gracefully
|
||||
- If RAG queries fail, reasoning engine falls back to non-RAG generation
|
||||
- SLURP enrichment failures are logged but don't block context resolution
|
||||
|
||||
## Testing
|
||||
|
||||
### Unit Tests
|
||||
|
||||
```bash
|
||||
# Run all LightRAG tests (requires running server)
|
||||
go test -v ./pkg/mcp/
|
||||
|
||||
# Run only unit tests (no server required)
|
||||
go test -v -short ./pkg/mcp/
|
||||
```
|
||||
|
||||
### Integration Tests
|
||||
|
||||
```bash
|
||||
# Start LightRAG server
|
||||
cd ~/chorus/mcp-include/LightRAG
|
||||
python main.py
|
||||
|
||||
# Run integration tests
|
||||
cd ~/chorus/project-queues/active/CHORUS
|
||||
go test -v ./pkg/mcp/ -run TestLightRAGClient
|
||||
```
|
||||
|
||||
## Performance Considerations
|
||||
|
||||
### Query Timeouts
|
||||
|
||||
- Default: 30 seconds
|
||||
- Hybrid mode is slowest (analyzes entire graph)
|
||||
- Naive mode is fastest (simple semantic search)
|
||||
|
||||
### Caching
|
||||
|
||||
LightRAG includes internal caching:
|
||||
- Repeated queries return cached results
|
||||
- Cache TTL managed by LightRAG server
|
||||
- No CHORUS-side caching required
|
||||
|
||||
### Resource Usage
|
||||
|
||||
- Memory: Proportional to knowledge base size
|
||||
- CPU: Query modes have different compute requirements
|
||||
- Network: HTTP requests to LightRAG server
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Server Not Healthy
|
||||
|
||||
**Symptom:** `LightRAG enabled but server not healthy`
|
||||
|
||||
**Solutions:**
|
||||
1. Check if LightRAG server is running: `curl http://127.0.0.1:9621/health`
|
||||
2. Verify correct port in `CHORUS_LIGHTRAG_BASE_URL`
|
||||
3. Check LightRAG logs for errors
|
||||
4. Ensure network connectivity between CHORUS and LightRAG
|
||||
|
||||
### Empty Responses
|
||||
|
||||
**Symptom:** RAG queries return empty results
|
||||
|
||||
**Solutions:**
|
||||
1. Knowledge base may be empty - insert documents first
|
||||
2. Query may not match indexed content
|
||||
3. Try different query mode (hybrid recommended)
|
||||
4. Check LightRAG indexing logs
|
||||
|
||||
### Timeout Errors
|
||||
|
||||
**Symptom:** `context deadline exceeded`
|
||||
|
||||
**Solutions:**
|
||||
1. Increase `CHORUS_LIGHTRAG_TIMEOUT`
|
||||
2. Use faster query mode (naive or local)
|
||||
3. Optimize LightRAG server performance
|
||||
4. Check network latency
|
||||
|
||||
## Security Considerations
|
||||
|
||||
### API Authentication
|
||||
|
||||
Optional API key support:
|
||||
```bash
|
||||
CHORUS_LIGHTRAG_API_KEY=your-secret-key
|
||||
```
|
||||
|
||||
Keys are sent as Bearer tokens in Authorization header.
|
||||
|
||||
### Network Security
|
||||
|
||||
- Run LightRAG on internal network only
|
||||
- Use HTTPS for production deployments
|
||||
- Consider firewall rules to restrict access
|
||||
- LightRAG doesn't include built-in encryption
|
||||
|
||||
### Data Privacy
|
||||
|
||||
- All queries and documents are stored in LightRAG
|
||||
- Consider what data is being indexed
|
||||
- Implement data retention policies
|
||||
- Use access control on LightRAG server
|
||||
|
||||
## Monitoring
|
||||
|
||||
### Health Checks
|
||||
|
||||
```go
|
||||
// Check LightRAG availability
|
||||
if client.IsHealthy(ctx) {
|
||||
// Server is healthy
|
||||
}
|
||||
|
||||
// Get detailed health info
|
||||
health, err := client.Health(ctx)
|
||||
// Returns: Status, CoreVersion, APIVersion, etc.
|
||||
```
|
||||
|
||||
### Metrics
|
||||
|
||||
Consider adding:
|
||||
- RAG query latency
|
||||
- Cache hit rates
|
||||
- Enrichment success/failure rates
|
||||
- Knowledge base size
|
||||
|
||||
## Future Enhancements
|
||||
|
||||
Potential improvements:
|
||||
|
||||
1. **Batch Query Optimization**
|
||||
- Batch multiple RAG queries together
|
||||
- Reduce HTTP overhead
|
||||
|
||||
2. **Adaptive Query Mode Selection**
|
||||
- Automatically choose query mode based on question type
|
||||
- Learn from past query performance
|
||||
|
||||
3. **Knowledge Base Management**
|
||||
- Automated document insertion from SLURP contexts
|
||||
- Background indexing of code repositories
|
||||
- Scheduled knowledge base updates
|
||||
|
||||
4. **Advanced Caching**
|
||||
- CHORUS-side caching with TTL
|
||||
- Semantic cache (similar queries share cache)
|
||||
- Persistent cache across restarts
|
||||
|
||||
5. **Multi-tenant Support**
|
||||
- Per-agent knowledge bases
|
||||
- Role-based access to documents
|
||||
- Encrypted knowledge storage
|
||||
|
||||
## Files Changed
|
||||
|
||||
1. `pkg/mcp/lightrag_client.go` - NEW (277 lines)
|
||||
2. `pkg/mcp/lightrag_client_test.go` - NEW (239 lines)
|
||||
3. `pkg/config/config.go` - Modified (added LightRAGConfig)
|
||||
4. `reasoning/reasoning.go` - Modified (added RAG functions)
|
||||
5. `internal/runtime/shared.go` - Modified (added initialization)
|
||||
6. `pkg/slurp/context/lightrag.go` - NEW (203 lines)
|
||||
|
||||
**Total:** 3 new files, 3 modified files, ~750 lines of code
|
||||
|
||||
## References
|
||||
|
||||
- LightRAG Documentation: https://github.com/HKUDS/LightRAG
|
||||
- MCP Protocol Spec: https://spec.modelcontextprotocol.io
|
||||
- CHORUS Documentation: `docs/comprehensive/`
|
||||
|
||||
---
|
||||
|
||||
**Maintainer:** CHORUS Project Team
|
||||
**Last Updated:** 2025-09-30
|
||||
**Status:** Production Ready
|
||||
Reference in New Issue
Block a user