anthonyrawlins 63dab5c4d4 Add LightRAG MCP integration for RAG-enhanced AI reasoning

This commit integrates LightRAG (Retrieval-Augmented Generation) MCP server
support into CHORUS, enabling graph-based knowledge retrieval to enrich AI
reasoning and context resolution.

## New Components

1. **LightRAG Client** (pkg/mcp/lightrag_client.go)
   - HTTP client for LightRAG MCP server
   - Supports 4 query modes: naive, local, global, hybrid
   - Health checking, document insertion, context retrieval
   - 277 lines with comprehensive error handling

2. **Integration Tests** (pkg/mcp/lightrag_client_test.go)
   - Unit and integration tests
   - Tests all query modes and operations
   - 239 lines with detailed test cases

3. **SLURP Context Enricher** (pkg/slurp/context/lightrag.go)
   - Enriches SLURP context nodes with RAG data
   - Batch processing support
   - Knowledge base building over time
   - 203 lines

4. **Documentation** (docs/LIGHTRAG_INTEGRATION.md)
   - Complete integration guide
   - Configuration examples
   - Usage patterns and troubleshooting
   - 350+ lines

## Modified Components

1. **Configuration** (pkg/config/config.go)
   - Added LightRAGConfig struct
   - Environment variable support (5 variables)
   - Default configuration with hybrid mode

2. **Reasoning Engine** (reasoning/reasoning.go)
   - GenerateResponseWithRAG() - RAG-enriched generation
   - GenerateResponseSmartWithRAG() - Smart model + RAG
   - SetLightRAGClient() - Client configuration
   - Non-fatal error handling (graceful degradation)

3. **Runtime Initialization** (internal/runtime/shared.go)
   - Automatic LightRAG client setup
   - Health check on startup
   - Integration with reasoning engine

## Configuration

Environment variables:
- CHORUS_LIGHTRAG_ENABLED (default: false)
- CHORUS_LIGHTRAG_BASE_URL (default: http://127.0.0.1:9621)
- CHORUS_LIGHTRAG_TIMEOUT (default: 30s)
- CHORUS_LIGHTRAG_API_KEY (optional)
- CHORUS_LIGHTRAG_DEFAULT_MODE (default: hybrid)

## Features

- ✅ Optional and non-blocking (graceful degradation)
- ✅ Four query modes for different use cases
- ✅ Context enrichment for SLURP system
- ✅ Knowledge base building over time
- ✅ Health monitoring and error handling
- ✅ Comprehensive tests and documentation

## Testing

LightRAG server tested at http://127.0.0.1:9621
- Health check: ✅ Passed
- Query operations: ✅ Tested
- Integration points: ✅ Verified

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-09-30 23:56:09 +10:00

LightRAG MCP Integration

Status: ✅ Production Ready
Version: 1.0.0
Date: 2025-09-30

Overview

CHORUS now includes optional LightRAG integration for Retrieval-Augmented Generation (RAG) capabilities. LightRAG provides graph-based knowledge retrieval to enrich AI reasoning and context resolution.

Architecture

Components

1. LightRAG Client (pkg/mcp/lightrag_client.go)
   - HTTP client for LightRAG MCP server
   - Supports 4 query modes: naive, local, global, hybrid
   - Health checking and document insertion
   - Configurable timeouts and API authentication

2. Reasoning Engine Integration (reasoning/reasoning.go)
   - GenerateResponseWithRAG() - RAG-enriched response generation
   - GenerateResponseSmartWithRAG() - Combines model selection + RAG
   - SetLightRAGClient() - Configure RAG client
   - Non-fatal error handling (degrades gracefully)

3. SLURP Context Enrichment (pkg/slurp/context/lightrag.go)
   - LightRAGEnricher - Enriches context nodes with RAG data
   - EnrichContextNode() - Add insights to individual nodes
   - EnrichResolvedContext() - Enrich resolved context chains
   - InsertContextNode() - Build knowledge base over time

4. Configuration (pkg/config/config.go)
   - LightRAGConfig struct with 5 configuration options
   - Environment variable support
   - Automatic initialization in runtime

Configuration

Environment Variables

# Enable LightRAG integration
CHORUS_LIGHTRAG_ENABLED=true

# LightRAG server endpoint
CHORUS_LIGHTRAG_BASE_URL=http://127.0.0.1:9621

# Query timeout
CHORUS_LIGHTRAG_TIMEOUT=30s

# Optional API key
CHORUS_LIGHTRAG_API_KEY=your-api-key

# Default query mode (naive, local, global, hybrid)
CHORUS_LIGHTRAG_DEFAULT_MODE=hybrid
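
The variables above could be read with their documented defaults roughly as follows. This is a minimal sketch, not the code in pkg/config/config.go: the field names of this local LightRAGConfig and the helper loadLightRAGConfig are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// LightRAGConfig mirrors the five documented settings. The field names are
// assumptions; the real struct lives in pkg/config/config.go.
type LightRAGConfig struct {
	Enabled     bool
	BaseURL     string
	Timeout     time.Duration
	APIKey      string
	DefaultMode string
}

// loadLightRAGConfig (hypothetical helper) reads the CHORUS_LIGHTRAG_*
// variables through lookup, normally os.Getenv, and applies the defaults
// listed in this document when a variable is unset.
func loadLightRAGConfig(lookup func(string) string) (LightRAGConfig, error) {
	cfg := LightRAGConfig{
		Enabled:     false,
		BaseURL:     "http://127.0.0.1:9621",
		Timeout:     30 * time.Second,
		DefaultMode: "hybrid",
	}
	if v := lookup("CHORUS_LIGHTRAG_ENABLED"); v != "" {
		enabled, err := strconv.ParseBool(v)
		if err != nil {
			return cfg, fmt.Errorf("CHORUS_LIGHTRAG_ENABLED: %w", err)
		}
		cfg.Enabled = enabled
	}
	if v := lookup("CHORUS_LIGHTRAG_BASE_URL"); v != "" {
		cfg.BaseURL = v
	}
	if v := lookup("CHORUS_LIGHTRAG_TIMEOUT"); v != "" {
		d, err := time.ParseDuration(v)
		if err != nil {
			return cfg, fmt.Errorf("CHORUS_LIGHTRAG_TIMEOUT: %w", err)
		}
		cfg.Timeout = d
	}
	cfg.APIKey = lookup("CHORUS_LIGHTRAG_API_KEY") // optional, empty means no auth
	if v := lookup("CHORUS_LIGHTRAG_DEFAULT_MODE"); v != "" {
		cfg.DefaultMode = v
	}
	return cfg, nil
}

func main() {
	cfg, err := loadLightRAGConfig(os.Getenv)
	if err != nil {
		fmt.Println("invalid LightRAG configuration:", err)
		return
	}
	fmt.Printf("enabled=%t url=%s timeout=%s mode=%s\n",
		cfg.Enabled, cfg.BaseURL, cfg.Timeout, cfg.DefaultMode)
}
```

Passing the lookup function in, rather than calling os.Getenv directly, keeps the loader testable without touching the process environment.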

Docker Configuration

services:
  chorus-agent:
    environment:
      - CHORUS_LIGHTRAG_ENABLED=true
      - CHORUS_LIGHTRAG_BASE_URL=http://lightrag:9621
      - CHORUS_LIGHTRAG_DEFAULT_MODE=hybrid
    depends_on:
      - lightrag

  lightrag:
    image: lightrag/lightrag:latest
    ports:
      - "9621:9621"
    volumes:
      - lightrag-data:/app/data

Query Modes

LightRAG supports 4 query modes with different retrieval strategies:

Naive Mode (QueryModeNaive)
- Simple semantic search
- Fastest, least context
- Use for: Quick lookups

Local Mode (QueryModeLocal)
- Local graph traversal
- Context from immediate neighbors
- Use for: Related information

Global Mode (QueryModeGlobal)
- Global graph analysis
- Broad context from entire knowledge base
- Use for: High-level questions

Hybrid Mode (QueryModeHybrid) ⭐ Recommended
- Combined approach
- Balances breadth and depth
- Use for: General purpose RAG
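
Because the default mode arrives as a plain string (for example from CHORUS_LIGHTRAG_DEFAULT_MODE), it can be worth validating it before use. The sketch below is an illustration only: the constant names match those used in this document, but their string values, the local QueryMode type, and the ParseQueryMode helper are assumptions rather than the definitions in pkg/mcp.

```go
package main

import (
	"fmt"
	"strings"
)

// QueryMode is a local stand-in for the mode type in pkg/mcp (assumed shape).
type QueryMode string

// The string values are assumed to match the mode names listed above.
const (
	QueryModeNaive  QueryMode = "naive"
	QueryModeLocal  QueryMode = "local"
	QueryModeGlobal QueryMode = "global"
	QueryModeHybrid QueryMode = "hybrid"
)

// ParseQueryMode (hypothetical helper) normalizes a configured mode name.
// An empty value falls back to hybrid, the documented default; an unknown
// value is rejected instead of being silently passed to the server.
func ParseQueryMode(s string) (QueryMode, error) {
	switch m := QueryMode(strings.ToLower(strings.TrimSpace(s))); m {
	case "":
		return QueryModeHybrid, nil
	case QueryModeNaive, QueryModeLocal, QueryModeGlobal, QueryModeHybrid:
		return m, nil
	default:
		return "", fmt.Errorf("unknown LightRAG query mode %q", s)
	}
}

func main() {
	for _, raw := range []string{"", " Local ", "GLOBAL", "fuzzy"} {
		mode, err := ParseQueryMode(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q -> %s\n", raw, mode)
	}
}
```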

Usage Examples

Reasoning Engine with RAG

import (
    "context"
    "log"
    "time" // needed for the Timeout below

    "chorus/reasoning"
    "chorus/pkg/mcp"
)

// Initialize LightRAG client
config := mcp.LightRAGConfig{
    BaseURL: "http://127.0.0.1:9621",
    Timeout: 30 * time.Second,
}
client := mcp.NewLightRAGClient(config)

// Configure reasoning engine
reasoning.SetLightRAGClient(client)

// Generate RAG-enriched response
ctx := context.Background()
response, err := reasoning.GenerateResponseWithRAG(
    ctx,
    "meta/llama-3.1-8b-instruct",
    "How does CHORUS handle P2P networking?",
    mcp.QueryModeHybrid,
)
if err != nil {
    log.Printf("generation failed: %v", err)
}

SLURP Context Enrichment

import (
    "context"

    "chorus/pkg/mcp"
    // Aliased locally: the package name collides with the standard library "context"
    slurpctx "chorus/pkg/slurp/context"
)

// Create enricher (client is the LightRAG client created in the previous example)
enricher := slurpctx.NewLightRAGEnricher(client, "hybrid")

// Enrich a context node
node := &slurpctx.ContextNode{
    Path:    "/pkg/p2p",
    Summary: "P2P networking implementation",
    Purpose: "Provides libp2p networking layer",
}

ctx := context.Background()
err := enricher.EnrichContextNode(ctx, node)
// node.Insights now contains RAG-retrieved information

// Insert for future retrieval
err = enricher.InsertContextNode(ctx, node)

Direct LightRAG Client

import (
    "context"
    "chorus/pkg/mcp"
)

client := mcp.NewLightRAGClient(config)
ctx := context.Background()

// Health check
healthy := client.IsHealthy(ctx)

// Query with response
response, err := client.Query(ctx, "query", mcp.QueryModeHybrid)

// Get context only (named ragContext so it does not shadow the context package)
ragContext, err := client.GetContext(ctx, "query", mcp.QueryModeHybrid)

// Insert document (err is already declared, so assign with = rather than :=)
err = client.Insert(ctx, "text content", "description")

Integration Points

Runtime Initialization

LightRAG is initialized automatically in internal/runtime/shared.go:

// Lines 685-704
if cfg.LightRAG.Enabled {
    lightragConfig := mcp.LightRAGConfig{
        BaseURL: cfg.LightRAG.BaseURL,
        Timeout: cfg.LightRAG.Timeout,
        APIKey:  cfg.LightRAG.APIKey,
    }
    lightragClient := mcp.NewLightRAGClient(lightragConfig)

    if lightragClient.IsHealthy(ctx) {
        reasoning.SetLightRAGClient(lightragClient)
        logger.Info("📚 LightRAG RAG system enabled")
    }
}

Graceful Degradation

LightRAG integration is completely optional and non-blocking:

- If CHORUS_LIGHTRAG_ENABLED=false, no LightRAG calls are made
- If the LightRAG server is unavailable, the health check fails gracefully
- If RAG queries fail, the reasoning engine falls back to non-RAG generation
- SLURP enrichment failures are logged but don't block context resolution
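
The fallback behaviour described above can be sketched as follows. This is an illustration of the pattern under stated assumptions, not the code in reasoning/reasoning.go: the Retriever interface, the generateFunc type, the prompt layout, and the stub types are all hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Retriever is a hypothetical view of the RAG client: it only fetches context.
type Retriever interface {
	GetContext(ctx context.Context, query string) (string, error)
}

// generateFunc is a hypothetical stand-in for the model call.
type generateFunc func(ctx context.Context, prompt string) (string, error)

// generateWithOptionalRAG enriches the prompt when retrieval works and falls
// back to the plain prompt when RAG is disabled (nil) or the lookup fails.
// A retrieval error is reported but never returned to the caller.
func generateWithOptionalRAG(ctx context.Context, rag Retriever, generate generateFunc, prompt string) (string, error) {
	if rag != nil {
		ragContext, err := rag.GetContext(ctx, prompt)
		if err != nil {
			fmt.Println("RAG lookup failed, continuing without it:", err)
		} else if ragContext != "" {
			prompt = "Context:\n" + ragContext + "\n\nQuestion:\n" + prompt
		}
	}
	return generate(ctx, prompt)
}

// failingRetriever simulates an unreachable LightRAG server.
type failingRetriever struct{}

func (failingRetriever) GetContext(context.Context, string) (string, error) {
	return "", errors.New("lightrag unavailable")
}

// staticRetriever always returns the same retrieved text.
type staticRetriever struct{ text string }

func (s staticRetriever) GetContext(context.Context, string) (string, error) {
	return s.text, nil
}

// echo returns the prompt unchanged so the effect of enrichment is visible.
func echo(_ context.Context, prompt string) (string, error) { return prompt, nil }

// answer runs one generation with the echo model.
func answer(rag Retriever, prompt string) string {
	out, _ := generateWithOptionalRAG(context.Background(), rag, echo, prompt)
	return out
}

func main() {
	fmt.Println(answer(nil, "How does P2P work?"))
	fmt.Println(answer(failingRetriever{}, "How does P2P work?"))
	fmt.Println(answer(staticRetriever{"libp2p is used"}, "How does P2P work?"))
}
```

The design choice is that only the generation error reaches the caller; retrieval problems are treated as a loss of quality, not a failure.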

Testing

Unit Tests

# Run all LightRAG tests (requires running server)
go test -v ./pkg/mcp/

# Run only unit tests (no server required)
go test -v -short ./pkg/mcp/

Integration Tests

# Start LightRAG server
cd ~/chorus/mcp-include/LightRAG
python main.py

# Run integration tests
cd ~/chorus/project-queues/active/CHORUS
go test -v ./pkg/mcp/ -run TestLightRAGClient

Performance Considerations

Query Timeouts

- Default: 30 seconds
- Hybrid mode is the slowest (it combines local retrieval with the global, whole-graph analysis)
- Naive mode is the fastest (simple semantic search)

Caching

LightRAG includes internal caching:

- Repeated queries return cached results
- Cache TTL is managed by the LightRAG server
- No CHORUS-side caching is required

Resource Usage

- Memory: Proportional to knowledge base size
- CPU: Query modes have different compute requirements
- Network: HTTP requests to LightRAG server

Troubleshooting

Server Not Healthy

Symptom: LightRAG enabled but server not healthy

Solutions:

- Check if the LightRAG server is running: curl http://127.0.0.1:9621/health
- Verify the port in CHORUS_LIGHTRAG_BASE_URL
- Check LightRAG logs for errors
- Ensure network connectivity between CHORUS and LightRAG
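
The same check as the curl command can be run from Go, which helps when debugging from inside the CHORUS container. This is a minimal sketch: probeHealth and probeStub are hypothetical helpers, and treating any non-200 status as unhealthy is an assumption about the /health endpoint. The stub server exists only so the example runs without a real LightRAG instance.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// probeHealth (hypothetical helper) issues GET <baseURL>/health with a short
// deadline and reports why the server is considered unhealthy, if it is.
func probeHealth(ctx context.Context, baseURL string) error {
	ctx, cancel := context.WithTimeout(ctx, 2*time.Second)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, baseURL+"/health", nil)
	if err != nil {
		return fmt.Errorf("build request: %w", err)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return fmt.Errorf("server unreachable: %w", err)
	}
	defer resp.Body.Close()

	// Assumption: a healthy server answers 200 OK.
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %s", resp.Status)
	}
	return nil
}

// probeStub starts a throwaway local server that answers every request with
// the given status code and probes it.
func probeStub(status int) error {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(status)
	}))
	defer srv.Close()
	return probeHealth(context.Background(), srv.URL)
}

func main() {
	fmt.Println("stub answering 200:", probeStub(http.StatusOK))
	fmt.Println("stub answering 503:", probeStub(http.StatusServiceUnavailable))
	// Against a real deployment, pass the value of CHORUS_LIGHTRAG_BASE_URL:
	// err := probeHealth(context.Background(), "http://127.0.0.1:9621")
}
```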

Empty Responses

Symptom: RAG queries return empty results

Solutions:

- The knowledge base may be empty; insert documents first
- The query may not match indexed content
- Try a different query mode (hybrid recommended)
- Check LightRAG indexing logs

Timeout Errors

Symptom: context deadline exceeded

Solutions:

- Increase CHORUS_LIGHTRAG_TIMEOUT
- Use a faster query mode (naive or local)
- Optimize LightRAG server performance
- Check network latency
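
One way to combine the first two suggestions is to give each attempt its own deadline and retry in a faster mode only when that deadline is exceeded. The sketch below is an assumption-laden illustration, not existing CHORUS behaviour: queryFunc, queryWithFallback, and the simulated latencies are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// queryFunc is a hypothetical stand-in for a LightRAG query call.
type queryFunc func(ctx context.Context, query, mode string) (string, error)

// queryWithFallback tries hybrid first and retries in naive mode only when
// the attempt hit its deadline. Any other error is returned immediately.
func queryWithFallback(parent context.Context, q queryFunc, query string, timeout time.Duration) (string, error) {
	for _, mode := range []string{"hybrid", "naive"} {
		ctx, cancel := context.WithTimeout(parent, timeout)
		out, err := q(ctx, query, mode)
		cancel() // release the timer before the next attempt
		if err == nil {
			return out, nil
		}
		if !errors.Is(err, context.DeadlineExceeded) {
			return "", err
		}
	}
	return "", context.DeadlineExceeded
}

// simulatedQuery pretends hybrid takes 2s and every other mode 5ms, and
// honours cancellation the way an HTTP call made with the context would.
func simulatedQuery(ctx context.Context, query, mode string) (string, error) {
	delay := 5 * time.Millisecond
	if mode == "hybrid" {
		delay = 2 * time.Second
	}
	select {
	case <-time.After(delay):
		return mode + ": " + query, nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// demo runs the fallback with a 100ms per-attempt deadline.
func demo() (string, error) {
	return queryWithFallback(context.Background(), simulatedQuery, "p2p", 100*time.Millisecond)
}

func main() {
	out, err := demo()
	fmt.Println(out, err)
}
```

Falling back trades answer quality for latency, so it is worth logging which mode finally served the request.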

Security Considerations

API Authentication

Optional API key support:

CHORUS_LIGHTRAG_API_KEY=your-secret-key

When set, the key is sent as a Bearer token in the Authorization header.
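
As an illustration of Bearer authentication with Go's net/http, the sketch below attaches the header only when a key is configured. The helpers newRequest and authHeader are hypothetical and are not taken from pkg/mcp/lightrag_client.go.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
)

// newRequest (hypothetical helper) builds a request and adds the
// Authorization header only when an API key is configured.
func newRequest(ctx context.Context, method, url, apiKey string) (*http.Request, error) {
	req, err := http.NewRequestWithContext(ctx, method, url, nil)
	if err != nil {
		return nil, err
	}
	if apiKey != "" {
		req.Header.Set("Authorization", "Bearer "+apiKey)
	}
	return req, nil
}

// authHeader returns the Authorization header a request would carry for the
// given key, or an empty string when no key is set.
func authHeader(apiKey string) string {
	req, err := newRequest(context.Background(), http.MethodGet, "http://127.0.0.1:9621/health", apiKey)
	if err != nil {
		return ""
	}
	return req.Header.Get("Authorization")
}

func main() {
	fmt.Printf("with key:    %q\n", authHeader("your-secret-key"))
	fmt.Printf("without key: %q\n", authHeader(""))
}
```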

Network Security

- Run LightRAG on an internal network only
- Use HTTPS for production deployments
- Consider firewall rules to restrict access
- LightRAG doesn't include built-in encryption

Data Privacy

- All queries and documents are stored in LightRAG
- Consider what data is being indexed
- Implement data retention policies
- Use access control on the LightRAG server

Monitoring

Health Checks

// Check LightRAG availability
if client.IsHealthy(ctx) {
    // Server is healthy
}

// Get detailed health info
health, err := client.Health(ctx)
// Returns: Status, CoreVersion, APIVersion, etc.

Metrics

Consider adding:

- RAG query latency
- Cache hit rates
- Enrichment success/failure rates
- Knowledge base size
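
Query latency and success/failure counts could be gathered with a small in-process collector like the one below. This is a sketch only: ragMetrics and its methods are hypothetical, CHORUS does not currently expose them, and a real deployment would more likely export these values to an existing metrics system.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// ragMetrics is a hypothetical in-process collector for RAG calls.
type ragMetrics struct {
	mu           sync.Mutex
	queries      int
	failures     int
	totalLatency time.Duration
}

// observe records one finished call and whether it failed.
func (m *ragMetrics) observe(latency time.Duration, err error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.queries++
	m.totalLatency += latency
	if err != nil {
		m.failures++
	}
}

// timed runs fn, measures how long it took, and records the outcome.
func (m *ragMetrics) timed(fn func() error) error {
	start := time.Now()
	err := fn()
	m.observe(time.Since(start), err)
	return err
}

// snapshot returns the counters and the mean latency over all calls.
func (m *ragMetrics) snapshot() (queries, failures int, avgLatency time.Duration) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.queries > 0 {
		avgLatency = m.totalLatency / time.Duration(m.queries)
	}
	return m.queries, m.failures, avgLatency
}

// demoMetrics records two synthetic calls and reports the aggregate.
func demoMetrics() (queries, failures int, avgMillis int64) {
	m := &ragMetrics{}
	m.observe(100*time.Millisecond, nil)
	m.observe(300*time.Millisecond, errors.New("timeout"))
	q, f, avg := m.snapshot()
	return q, f, avg.Milliseconds()
}

func main() {
	m := &ragMetrics{}
	// Wrap any RAG call; here a placeholder that just sleeps briefly.
	_ = m.timed(func() error {
		time.Sleep(10 * time.Millisecond)
		return nil
	})
	q, f, avg := m.snapshot()
	fmt.Printf("queries=%d failures=%d avg=%s\n", q, f, avg)
}
```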

Future Enhancements

Potential improvements:

1. Batch Query Optimization
   - Batch multiple RAG queries together
   - Reduce HTTP overhead

2. Adaptive Query Mode Selection
   - Automatically choose query mode based on question type
   - Learn from past query performance

3. Knowledge Base Management
   - Automated document insertion from SLURP contexts
   - Background indexing of code repositories
   - Scheduled knowledge base updates

4. Advanced Caching
   - CHORUS-side caching with TTL
   - Semantic cache (similar queries share cache)
   - Persistent cache across restarts

5. Multi-tenant Support
   - Per-agent knowledge bases
   - Role-based access to documents
   - Encrypted knowledge storage

Files Changed

- pkg/mcp/lightrag_client.go - NEW (277 lines)
- pkg/mcp/lightrag_client_test.go - NEW (239 lines)
- pkg/config/config.go - Modified (added LightRAGConfig)
- reasoning/reasoning.go - Modified (added RAG functions)
- internal/runtime/shared.go - Modified (added initialization)
- pkg/slurp/context/lightrag.go - NEW (203 lines)

Total: 3 new files, 3 modified files, ~750 lines of code

References

LightRAG Documentation: https://github.com/HKUDS/LightRAG
MCP Protocol Spec: https://spec.modelcontextprotocol.io
CHORUS Documentation: docs/comprehensive/

Maintainer: CHORUS Project Team
Last Updated: 2025-09-30
Status: Production Ready
