Files
CHORUS/docs/LIGHTRAG_INTEGRATION.md
anthonyrawlins 63dab5c4d4 Add LightRAG MCP integration for RAG-enhanced AI reasoning
This commit integrates LightRAG (Retrieval-Augmented Generation) MCP server
support into CHORUS, enabling graph-based knowledge retrieval to enrich AI
reasoning and context resolution.

## New Components

1. **LightRAG Client** (pkg/mcp/lightrag_client.go)
   - HTTP client for LightRAG MCP server
   - Supports 4 query modes: naive, local, global, hybrid
   - Health checking, document insertion, context retrieval
   - 277 lines with comprehensive error handling

2. **Integration Tests** (pkg/mcp/lightrag_client_test.go)
   - Unit and integration tests
   - Tests all query modes and operations
   - 239 lines with detailed test cases

3. **SLURP Context Enricher** (pkg/slurp/context/lightrag.go)
   - Enriches SLURP context nodes with RAG data
   - Batch processing support
   - Knowledge base building over time
   - 203 lines

4. **Documentation** (docs/LIGHTRAG_INTEGRATION.md)
   - Complete integration guide
   - Configuration examples
   - Usage patterns and troubleshooting
   - 350+ lines

## Modified Components

1. **Configuration** (pkg/config/config.go)
   - Added LightRAGConfig struct
   - Environment variable support (5 variables)
   - Default configuration with hybrid mode

2. **Reasoning Engine** (reasoning/reasoning.go)
   - GenerateResponseWithRAG() - RAG-enriched generation
   - GenerateResponseSmartWithRAG() - Smart model + RAG
   - SetLightRAGClient() - Client configuration
   - Non-fatal error handling (graceful degradation)

3. **Runtime Initialization** (internal/runtime/shared.go)
   - Automatic LightRAG client setup
   - Health check on startup
   - Integration with reasoning engine

## Configuration

Environment variables:
- CHORUS_LIGHTRAG_ENABLED (default: false)
- CHORUS_LIGHTRAG_BASE_URL (default: http://127.0.0.1:9621)
- CHORUS_LIGHTRAG_TIMEOUT (default: 30s)
- CHORUS_LIGHTRAG_API_KEY (optional)
- CHORUS_LIGHTRAG_DEFAULT_MODE (default: hybrid)

## Features

-  Optional and non-blocking (graceful degradation)
-  Four query modes for different use cases
-  Context enrichment for SLURP system
-  Knowledge base building over time
-  Health monitoring and error handling
-  Comprehensive tests and documentation

## Testing

LightRAG server tested at http://127.0.0.1:9621
- Health check:  Passed
- Query operations:  Tested
- Integration points:  Verified

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-30 23:56:09 +10:00

388 lines
9.2 KiB
Markdown

# LightRAG MCP Integration
**Status:** ✅ Production Ready
**Version:** 1.0.0
**Date:** 2025-09-30
## Overview
CHORUS now includes optional LightRAG integration for Retrieval-Augmented Generation (RAG) capabilities. LightRAG provides graph-based knowledge retrieval to enrich AI reasoning and context resolution.
## Architecture
### Components
1. **LightRAG Client** (`pkg/mcp/lightrag_client.go`)
- HTTP client for LightRAG MCP server
- Supports 4 query modes: naive, local, global, hybrid
- Health checking and document insertion
- Configurable timeouts and API authentication
2. **Reasoning Engine Integration** (`reasoning/reasoning.go`)
- `GenerateResponseWithRAG()` - RAG-enriched response generation
- `GenerateResponseSmartWithRAG()` - Combines model selection + RAG
- `SetLightRAGClient()` - Configure RAG client
- Non-fatal error handling (degrades gracefully)
3. **SLURP Context Enrichment** (`pkg/slurp/context/lightrag.go`)
- `LightRAGEnricher` - Enriches context nodes with RAG data
- `EnrichContextNode()` - Add insights to individual nodes
- `EnrichResolvedContext()` - Enrich resolved context chains
- `InsertContextNode()` - Build knowledge base over time
4. **Configuration** (`pkg/config/config.go`)
- `LightRAGConfig` struct with 5 configuration options
- Environment variable support
- Automatic initialization in runtime
## Configuration
### Environment Variables
```bash
# Enable LightRAG integration
CHORUS_LIGHTRAG_ENABLED=true
# LightRAG server endpoint
CHORUS_LIGHTRAG_BASE_URL=http://127.0.0.1:9621
# Query timeout
CHORUS_LIGHTRAG_TIMEOUT=30s
# Optional API key
CHORUS_LIGHTRAG_API_KEY=your-api-key
# Default query mode (naive, local, global, hybrid)
CHORUS_LIGHTRAG_DEFAULT_MODE=hybrid
```
### Docker Configuration
```yaml
services:
chorus-agent:
environment:
- CHORUS_LIGHTRAG_ENABLED=true
- CHORUS_LIGHTRAG_BASE_URL=http://lightrag:9621
- CHORUS_LIGHTRAG_DEFAULT_MODE=hybrid
depends_on:
- lightrag
lightrag:
image: lightrag/lightrag:latest
ports:
- "9621:9621"
volumes:
- lightrag-data:/app/data
```
## Query Modes
LightRAG supports 4 query modes with different retrieval strategies:
1. **Naive Mode** (`QueryModeNaive`)
- Simple semantic search
- Fastest, least context
- Use for: Quick lookups
2. **Local Mode** (`QueryModeLocal`)
- Local graph traversal
- Context from immediate neighbors
- Use for: Related information
3. **Global Mode** (`QueryModeGlobal`)
- Global graph analysis
- Broad context from entire knowledge base
- Use for: High-level questions
4. **Hybrid Mode** (`QueryModeHybrid`) ⭐ **Recommended**
- Combined approach
- Balances breadth and depth
- Use for: General purpose RAG
## Usage Examples
### Reasoning Engine with RAG
```go
import (
"context"
"chorus/reasoning"
"chorus/pkg/mcp"
)
// Initialize LightRAG client
config := mcp.LightRAGConfig{
BaseURL: "http://127.0.0.1:9621",
Timeout: 30 * time.Second,
}
client := mcp.NewLightRAGClient(config)
// Configure reasoning engine
reasoning.SetLightRAGClient(client)
// Generate RAG-enriched response
ctx := context.Background()
response, err := reasoning.GenerateResponseWithRAG(
ctx,
"meta/llama-3.1-8b-instruct",
"How does CHORUS handle P2P networking?",
mcp.QueryModeHybrid,
)
```
### SLURP Context Enrichment
```go
import (
"context"
"chorus/pkg/slurp/context"
"chorus/pkg/mcp"
)
// Create enricher
enricher := context.NewLightRAGEnricher(client, "hybrid")
// Enrich a context node
node := &context.ContextNode{
Path: "/pkg/p2p",
Summary: "P2P networking implementation",
Purpose: "Provides libp2p networking layer",
}
err := enricher.EnrichContextNode(ctx, node)
// node.Insights now contains RAG-retrieved information
// Insert for future retrieval
err = enricher.InsertContextNode(ctx, node)
```
### Direct LightRAG Client
```go
import (
"context"
"chorus/pkg/mcp"
)
client := mcp.NewLightRAGClient(config)
// Health check
healthy := client.IsHealthy(ctx)
// Query with response
response, err := client.Query(ctx, "query", mcp.QueryModeHybrid)
// Get context only
context, err := client.GetContext(ctx, "query", mcp.QueryModeHybrid)
// Insert document
err := client.Insert(ctx, "text content", "description")
```
## Integration Points
### Runtime Initialization
LightRAG is initialized automatically in `internal/runtime/shared.go`:
```go
// Line 685-704
if cfg.LightRAG.Enabled {
lightragConfig := mcp.LightRAGConfig{
BaseURL: cfg.LightRAG.BaseURL,
Timeout: cfg.LightRAG.Timeout,
APIKey: cfg.LightRAG.APIKey,
}
lightragClient := mcp.NewLightRAGClient(lightragConfig)
if lightragClient.IsHealthy(ctx) {
reasoning.SetLightRAGClient(lightragClient)
logger.Info("📚 LightRAG RAG system enabled")
}
}
```
### Graceful Degradation
LightRAG integration is **completely optional** and **non-blocking**:
- If `CHORUS_LIGHTRAG_ENABLED=false`, no LightRAG calls are made
- If LightRAG server is unavailable, health check fails gracefully
- If RAG queries fail, reasoning engine falls back to non-RAG generation
- SLURP enrichment failures are logged but don't block context resolution
## Testing
### Unit Tests
```bash
# Run all LightRAG tests (requires running server)
go test -v ./pkg/mcp/
# Run only unit tests (no server required)
go test -v -short ./pkg/mcp/
```
### Integration Tests
```bash
# Start LightRAG server
cd ~/chorus/mcp-include/LightRAG
python main.py
# Run integration tests
cd ~/chorus/project-queues/active/CHORUS
go test -v ./pkg/mcp/ -run TestLightRAGClient
```
## Performance Considerations
### Query Timeouts
- Default: 30 seconds
- Hybrid mode is slowest (analyzes entire graph)
- Naive mode is fastest (simple semantic search)
### Caching
LightRAG includes internal caching:
- Repeated queries return cached results
- Cache TTL managed by LightRAG server
- No CHORUS-side caching required
### Resource Usage
- Memory: Proportional to knowledge base size
- CPU: Query modes have different compute requirements
- Network: HTTP requests to LightRAG server
## Troubleshooting
### Server Not Healthy
**Symptom:** `LightRAG enabled but server not healthy`
**Solutions:**
1. Check if LightRAG server is running: `curl http://127.0.0.1:9621/health`
2. Verify correct port in `CHORUS_LIGHTRAG_BASE_URL`
3. Check LightRAG logs for errors
4. Ensure network connectivity between CHORUS and LightRAG
### Empty Responses
**Symptom:** RAG queries return empty results
**Solutions:**
1. Knowledge base may be empty - insert documents first
2. Query may not match indexed content
3. Try different query mode (hybrid recommended)
4. Check LightRAG indexing logs
### Timeout Errors
**Symptom:** `context deadline exceeded`
**Solutions:**
1. Increase `CHORUS_LIGHTRAG_TIMEOUT`
2. Use faster query mode (naive or local)
3. Optimize LightRAG server performance
4. Check network latency
## Security Considerations
### API Authentication
Optional API key support:
```bash
CHORUS_LIGHTRAG_API_KEY=your-secret-key
```
Keys are sent as Bearer tokens in Authorization header.
### Network Security
- Run LightRAG on internal network only
- Use HTTPS for production deployments
- Consider firewall rules to restrict access
- LightRAG doesn't include built-in encryption
### Data Privacy
- All queries and documents are stored in LightRAG
- Consider what data is being indexed
- Implement data retention policies
- Use access control on LightRAG server
## Monitoring
### Health Checks
```go
// Check LightRAG availability
if client.IsHealthy(ctx) {
// Server is healthy
}
// Get detailed health info
health, err := client.Health(ctx)
// Returns: Status, CoreVersion, APIVersion, etc.
```
### Metrics
Consider adding:
- RAG query latency
- Cache hit rates
- Enrichment success/failure rates
- Knowledge base size
## Future Enhancements
Potential improvements:
1. **Batch Query Optimization**
- Batch multiple RAG queries together
- Reduce HTTP overhead
2. **Adaptive Query Mode Selection**
- Automatically choose query mode based on question type
- Learn from past query performance
3. **Knowledge Base Management**
- Automated document insertion from SLURP contexts
- Background indexing of code repositories
- Scheduled knowledge base updates
4. **Advanced Caching**
- CHORUS-side caching with TTL
- Semantic cache (similar queries share cache)
- Persistent cache across restarts
5. **Multi-tenant Support**
- Per-agent knowledge bases
- Role-based access to documents
- Encrypted knowledge storage
## Files Changed
1. `pkg/mcp/lightrag_client.go` - NEW (277 lines)
2. `pkg/mcp/lightrag_client_test.go` - NEW (239 lines)
3. `pkg/config/config.go` - Modified (added LightRAGConfig)
4. `reasoning/reasoning.go` - Modified (added RAG functions)
5. `internal/runtime/shared.go` - Modified (added initialization)
6. `pkg/slurp/context/lightrag.go` - NEW (203 lines)
**Total:** 3 new files, 3 modified files, ~750 lines of code
## References
- LightRAG Documentation: https://github.com/HKUDS/LightRAG
- MCP Protocol Spec: https://spec.modelcontextprotocol.io
- CHORUS Documentation: `docs/comprehensive/`
---
**Maintainer:** CHORUS Project Team
**Last Updated:** 2025-09-30
**Status:** Production Ready