CHORUS/docs/LIGHTRAG_INTEGRATION.md
anthonyrawlins 63dab5c4d4 Add LightRAG MCP integration for RAG-enhanced AI reasoning
This commit integrates LightRAG (Retrieval-Augmented Generation) MCP server
support into CHORUS, enabling graph-based knowledge retrieval to enrich AI
reasoning and context resolution.

## New Components

1. **LightRAG Client** (pkg/mcp/lightrag_client.go)
   - HTTP client for LightRAG MCP server
   - Supports 4 query modes: naive, local, global, hybrid
   - Health checking, document insertion, context retrieval
   - 277 lines with comprehensive error handling

2. **Integration Tests** (pkg/mcp/lightrag_client_test.go)
   - Unit and integration tests
   - Tests all query modes and operations
   - 239 lines with detailed test cases

3. **SLURP Context Enricher** (pkg/slurp/context/lightrag.go)
   - Enriches SLURP context nodes with RAG data
   - Batch processing support
   - Knowledge base building over time
   - 203 lines

4. **Documentation** (docs/LIGHTRAG_INTEGRATION.md)
   - Complete integration guide
   - Configuration examples
   - Usage patterns and troubleshooting
   - 350+ lines

## Modified Components

1. **Configuration** (pkg/config/config.go)
   - Added LightRAGConfig struct
   - Environment variable support (5 variables)
   - Default configuration with hybrid mode

2. **Reasoning Engine** (reasoning/reasoning.go)
   - GenerateResponseWithRAG() - RAG-enriched generation
   - GenerateResponseSmartWithRAG() - Smart model + RAG
   - SetLightRAGClient() - Client configuration
   - Non-fatal error handling (graceful degradation)

3. **Runtime Initialization** (internal/runtime/shared.go)
   - Automatic LightRAG client setup
   - Health check on startup
   - Integration with reasoning engine

## Configuration

Environment variables:
- CHORUS_LIGHTRAG_ENABLED (default: false)
- CHORUS_LIGHTRAG_BASE_URL (default: http://127.0.0.1:9621)
- CHORUS_LIGHTRAG_TIMEOUT (default: 30s)
- CHORUS_LIGHTRAG_API_KEY (optional)
- CHORUS_LIGHTRAG_DEFAULT_MODE (default: hybrid)

## Features

-  Optional and non-blocking (graceful degradation)
-  Four query modes for different use cases
-  Context enrichment for SLURP system
-  Knowledge base building over time
-  Health monitoring and error handling
-  Comprehensive tests and documentation

## Testing

LightRAG server tested at http://127.0.0.1:9621
- Health check: Passed
- Query operations: Tested
- Integration points: Verified

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-30 23:56:09 +10:00


LightRAG MCP Integration

Status: Production Ready
Version: 1.0.0
Date: 2025-09-30

Overview

CHORUS now includes optional LightRAG integration for Retrieval-Augmented Generation (RAG) capabilities. LightRAG provides graph-based knowledge retrieval to enrich AI reasoning and context resolution.

Architecture

Components

  1. LightRAG Client (pkg/mcp/lightrag_client.go)

    • HTTP client for LightRAG MCP server
    • Supports 4 query modes: naive, local, global, hybrid
    • Health checking and document insertion
    • Configurable timeouts and API authentication
  2. Reasoning Engine Integration (reasoning/reasoning.go)

    • GenerateResponseWithRAG() - RAG-enriched response generation
    • GenerateResponseSmartWithRAG() - Combines model selection + RAG
    • SetLightRAGClient() - Configure RAG client
    • Non-fatal error handling (degrades gracefully)
  3. SLURP Context Enrichment (pkg/slurp/context/lightrag.go)

    • LightRAGEnricher - Enriches context nodes with RAG data
    • EnrichContextNode() - Add insights to individual nodes
    • EnrichResolvedContext() - Enrich resolved context chains
    • InsertContextNode() - Build knowledge base over time
  4. Configuration (pkg/config/config.go)

    • LightRAGConfig struct with 5 configuration options
    • Environment variable support
    • Automatic initialization in runtime

Configuration

Environment Variables

# Enable LightRAG integration
CHORUS_LIGHTRAG_ENABLED=true

# LightRAG server endpoint
CHORUS_LIGHTRAG_BASE_URL=http://127.0.0.1:9621

# Query timeout
CHORUS_LIGHTRAG_TIMEOUT=30s

# Optional API key
CHORUS_LIGHTRAG_API_KEY=your-api-key

# Default query mode (naive, local, global, hybrid)
CHORUS_LIGHTRAG_DEFAULT_MODE=hybrid
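
As a rough illustration of how these five variables could map onto a config struct, here is a minimal sketch. It is not the code in pkg/config/config.go: the loader name loadLightRAGConfig, the DefaultMode field name, and the parsing rules are assumptions. Only the variable names and defaults come from the list above.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// LightRAGConfig mirrors the five documented settings.
// The DefaultMode field name is an assumption made for this sketch.
type LightRAGConfig struct {
	Enabled     bool
	BaseURL     string
	Timeout     time.Duration
	APIKey      string
	DefaultMode string
}

// loadLightRAGConfig is a hypothetical loader. lookup has the same shape as
// os.LookupEnv, so a map-backed function can be substituted in tests.
func loadLightRAGConfig(lookup func(string) (string, bool)) (LightRAGConfig, error) {
	// Start from the documented defaults.
	cfg := LightRAGConfig{
		Enabled:     false,
		BaseURL:     "http://127.0.0.1:9621",
		Timeout:     30 * time.Second,
		DefaultMode: "hybrid",
	}
	if v, ok := lookup("CHORUS_LIGHTRAG_ENABLED"); ok {
		b, err := strconv.ParseBool(v)
		if err != nil {
			return cfg, fmt.Errorf("CHORUS_LIGHTRAG_ENABLED: %w", err)
		}
		cfg.Enabled = b
	}
	if v, ok := lookup("CHORUS_LIGHTRAG_BASE_URL"); ok && v != "" {
		cfg.BaseURL = v
	}
	if v, ok := lookup("CHORUS_LIGHTRAG_TIMEOUT"); ok && v != "" {
		d, err := time.ParseDuration(v)
		if err != nil {
			return cfg, fmt.Errorf("CHORUS_LIGHTRAG_TIMEOUT: %w", err)
		}
		cfg.Timeout = d
	}
	if v, ok := lookup("CHORUS_LIGHTRAG_API_KEY"); ok {
		cfg.APIKey = v
	}
	if v, ok := lookup("CHORUS_LIGHTRAG_DEFAULT_MODE"); ok && v != "" {
		cfg.DefaultMode = v
	}
	return cfg, nil
}

func main() {
	cfg, err := loadLightRAGConfig(os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid LightRAG configuration:", err)
		os.Exit(1)
	}
	fmt.Printf("enabled=%v url=%s timeout=%s mode=%s\n",
		cfg.Enabled, cfg.BaseURL, cfg.Timeout, cfg.DefaultMode)
}
```

Passing the lookup function in, rather than calling os.LookupEnv directly, keeps the loader easy to test without touching the process environment.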

Docker Configuration

services:
  chorus-agent:
    environment:
      - CHORUS_LIGHTRAG_ENABLED=true
      - CHORUS_LIGHTRAG_BASE_URL=http://lightrag:9621
      - CHORUS_LIGHTRAG_DEFAULT_MODE=hybrid
    depends_on:
      - lightrag

  lightrag:
    image: lightrag/lightrag:latest
    ports:
      - "9621:9621"
    volumes:
      - lightrag-data:/app/data

# Named volumes must be declared at the top level
volumes:
  lightrag-data:

Query Modes

LightRAG supports 4 query modes with different retrieval strategies:

  1. Naive Mode (QueryModeNaive)

    • Simple semantic search
    • Fastest, least context
    • Use for: Quick lookups
  2. Local Mode (QueryModeLocal)

    • Local graph traversal
    • Context from immediate neighbors
    • Use for: Related information
  3. Global Mode (QueryModeGlobal)

    • Global graph analysis
    • Broad context from entire knowledge base
    • Use for: High-level questions
  4. Hybrid Mode (QueryModeHybrid, recommended)

    • Combined approach
    • Balances breadth and depth
    • Use for: General purpose RAG
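
The four modes above could be modelled as a small string type, which is useful when validating CHORUS_LIGHTRAG_DEFAULT_MODE. The sketch below is an assumption about shape, not the definition in pkg/mcp: the constant names and the values naive, local, global and hybrid come from this guide, while the underlying string type and the parseQueryMode helper are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// QueryMode is assumed here to be a string type; the real definition in
// pkg/mcp may differ.
type QueryMode string

const (
	QueryModeNaive  QueryMode = "naive"  // simple semantic search
	QueryModeLocal  QueryMode = "local"  // immediate graph neighbours
	QueryModeGlobal QueryMode = "global" // whole knowledge base
	QueryModeHybrid QueryMode = "hybrid" // combined approach
)

// parseQueryMode is a hypothetical helper. It normalises case and whitespace,
// treats an empty value as the documented default (hybrid), and rejects
// anything else.
func parseQueryMode(s string) (QueryMode, error) {
	switch m := QueryMode(strings.ToLower(strings.TrimSpace(s))); m {
	case QueryModeNaive, QueryModeLocal, QueryModeGlobal, QueryModeHybrid:
		return m, nil
	case "":
		return QueryModeHybrid, nil
	default:
		return "", fmt.Errorf("unknown LightRAG query mode %q", s)
	}
}

func main() {
	for _, raw := range []string{"hybrid", " Local ", "", "graph"} {
		mode, err := parseQueryMode(raw)
		fmt.Printf("%q -> %q (err: %v)\n", raw, mode, err)
	}
}
```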

Usage Examples

Reasoning Engine with RAG

import (
    "context"
    "log"
    "time"

    "chorus/reasoning"
    "chorus/pkg/mcp"
)

// Initialize LightRAG client
config := mcp.LightRAGConfig{
    BaseURL: "http://127.0.0.1:9621",
    Timeout: 30 * time.Second,
}
client := mcp.NewLightRAGClient(config)

// Configure reasoning engine
reasoning.SetLightRAGClient(client)

// Generate RAG-enriched response
ctx := context.Background()
response, err := reasoning.GenerateResponseWithRAG(
    ctx,
    "meta/llama-3.1-8b-instruct",
    "How does CHORUS handle P2P networking?",
    mcp.QueryModeHybrid,
)
if err != nil {
    log.Printf("RAG-enriched generation failed: %v", err)
}

SLURP Context Enrichment

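import (
    "context"
    "log"

    // Aliased so it does not collide with the standard library "context" package
    slurpcontext "chorus/pkg/slurp/context"
    "chorus/pkg/mcp"
)

// Create enricher (client is the mcp LightRAG client created earlier)
enricher := slurpcontext.NewLightRAGEnricher(client, "hybrid")

// Enrich a context node
node := &slurpcontext.ContextNode{
    Path:    "/pkg/p2p",
    Summary: "P2P networking implementation",
    Purpose: "Provides libp2p networking layer",
}

ctx := context.Background()
if err := enricher.EnrichContextNode(ctx, node); err != nil {
    log.Printf("enrichment failed: %v", err)
}
// node.Insights now contains RAG-retrieved information

// Insert for future retrieval
if err := enricher.InsertContextNode(ctx, node); err != nil {
    log.Printf("insert failed: %v", err)
}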

Direct LightRAG Client

import (
    "context"
    "chorus/pkg/mcp"
)

client := mcp.NewLightRAGClient(config)

// Health check
healthy := client.IsHealthy(ctx)

// Query with response
response, err := client.Query(ctx, "query", mcp.QueryModeHybrid)

// Get context only (named ragContext to avoid shadowing the context package)
ragContext, err := client.GetContext(ctx, "query", mcp.QueryModeHybrid)

// Insert document (err is already declared, so assign with = rather than :=)
err = client.Insert(ctx, "text content", "description")

Integration Points

Runtime Initialization

LightRAG is initialized automatically in internal/runtime/shared.go:

// Line 685-704
if cfg.LightRAG.Enabled {
    lightragConfig := mcp.LightRAGConfig{
        BaseURL: cfg.LightRAG.BaseURL,
        Timeout: cfg.LightRAG.Timeout,
        APIKey:  cfg.LightRAG.APIKey,
    }
    lightragClient := mcp.NewLightRAGClient(lightragConfig)

    if lightragClient.IsHealthy(ctx) {
        reasoning.SetLightRAGClient(lightragClient)
        logger.Info("📚 LightRAG RAG system enabled")
    }
}

Graceful Degradation

LightRAG integration is completely optional and non-blocking:

  • If CHORUS_LIGHTRAG_ENABLED=false, no LightRAG calls are made
  • If LightRAG server is unavailable, health check fails gracefully
  • If RAG queries fail, reasoning engine falls back to non-RAG generation
  • SLURP enrichment failures are logged but don't block context resolution

Testing

Unit Tests

# Run all LightRAG tests (requires running server)
go test -v ./pkg/mcp/

# Run only unit tests (no server required)
go test -v -short ./pkg/mcp/

Integration Tests

# Start LightRAG server
cd ~/chorus/mcp-include/LightRAG
python main.py

# Run integration tests
cd ~/chorus/project-queues/active/CHORUS
go test -v ./pkg/mcp/ -run TestLightRAGClient

Performance Considerations

Query Timeouts

  • Default: 30 seconds
  • Hybrid mode is slowest (combines local and global retrieval)
  • Naive mode is fastest (simple semantic search)

Caching

LightRAG includes internal caching:

  • Repeated queries return cached results
  • Cache TTL managed by LightRAG server
  • No CHORUS-side caching required

Resource Usage

  • Memory: Proportional to knowledge base size
  • CPU: Query modes have different compute requirements
  • Network: HTTP requests to LightRAG server

Troubleshooting

Server Not Healthy

Symptom: LightRAG enabled but server not healthy

Solutions:

  1. Check if LightRAG server is running: curl http://127.0.0.1:9621/health
  2. Verify correct port in CHORUS_LIGHTRAG_BASE_URL
  3. Check LightRAG logs for errors
  4. Ensure network connectivity between CHORUS and LightRAG

Empty Responses

Symptom: RAG queries return empty results

Solutions:

  1. Knowledge base may be empty - insert documents first
  2. Query may not match indexed content
  3. Try a different query mode (hybrid recommended)
  4. Check LightRAG indexing logs

Timeout Errors

Symptom: context deadline exceeded

Solutions:

  1. Increase CHORUS_LIGHTRAG_TIMEOUT
  2. Use a faster query mode (naive or local)
  3. Optimize LightRAG server performance
  4. Check network latency

Security Considerations

API Authentication

Optional API key support:

CHORUS_LIGHTRAG_API_KEY=your-secret-key

Keys are sent as Bearer tokens in Authorization header.

Network Security

  • Run LightRAG on internal network only
  • Use HTTPS for production deployments
  • Consider firewall rules to restrict access
  • LightRAG doesn't include built-in encryption

Data Privacy

  • All queries and documents are stored in LightRAG
  • Consider what data is being indexed
  • Implement data retention policies
  • Use access control on LightRAG server

Monitoring

Health Checks

// Check LightRAG availability
if client.IsHealthy(ctx) {
    // Server is healthy
}

// Get detailed health info
health, err := client.Health(ctx)
// Returns: Status, CoreVersion, APIVersion, etc.

Metrics

Consider adding:

  • RAG query latency
  • Cache hit rates
  • Enrichment success/failure rates
  • Knowledge base size

Future Enhancements

Potential improvements:

  1. Batch Query Optimization

    • Batch multiple RAG queries together
    • Reduce HTTP overhead
  2. Adaptive Query Mode Selection

    • Automatically choose query mode based on question type
    • Learn from past query performance
  3. Knowledge Base Management

    • Automated document insertion from SLURP contexts
    • Background indexing of code repositories
    • Scheduled knowledge base updates
  4. Advanced Caching

    • CHORUS-side caching with TTL
    • Semantic cache (similar queries share cache)
    • Persistent cache across restarts
  5. Multi-tenant Support

    • Per-agent knowledge bases
    • Role-based access to documents
    • Encrypted knowledge storage

Files Changed

  1. pkg/mcp/lightrag_client.go - NEW (277 lines)
  2. pkg/mcp/lightrag_client_test.go - NEW (239 lines)
  3. pkg/config/config.go - Modified (added LightRAGConfig)
  4. reasoning/reasoning.go - Modified (added RAG functions)
  5. internal/runtime/shared.go - Modified (added initialization)
  6. pkg/slurp/context/lightrag.go - NEW (203 lines)

Total: 3 new files, 3 modified files, ~750 lines of code

Maintainer: CHORUS Project Team
Last Updated: 2025-09-30
Status: Production Ready