Implement Beat 1: Sequential Thinking Age-Encrypted Wrapper (Skeleton)

This commit completes Beat 1 of the SequentialThinkingForCHORUS implementation, providing a functional plaintext skeleton for the age-encrypted wrapper.

## Deliverables

### 1. Main Wrapper Entry Point
- `cmd/seqthink-wrapper/main.go`: HTTP server on :8443
- Configuration loading from environment variables
- Graceful shutdown handling
- MCP server readiness checking with timeout

### 2. MCP Client Package
- `pkg/seqthink/mcpclient/client.go`: HTTP client for MCP server
- Communicates with MCP server on localhost:8000
- Health check endpoint
- Tool call endpoint with 120s timeout

### 3. Proxy Server Package
- `pkg/seqthink/proxy/server.go`: HTTP handlers for wrapper
- Health and readiness endpoints
- Tool call proxy (plaintext for Beat 1)
- SSE endpoint placeholder
- Metrics endpoint integration

### 4. Observability Package
- `pkg/seqthink/observability/logger.go`: Structured logging with zerolog
- `pkg/seqthink/observability/metrics.go`: Prometheus metrics
- Counters for requests, errors, decrypt/encrypt failures, policy denials
- Request duration histogram

### 5. Docker Infrastructure
- `deploy/seqthink/Dockerfile`: Multi-stage build
- `deploy/seqthink/entrypoint.sh`: Startup orchestration
- `deploy/seqthink/mcp_stub.py`: Minimal MCP server for testing

### 6. Build System Integration
- Updated `Makefile` with `build-seqthink` target
- Uses GOWORK=off and -mod=mod for clean builds
- `docker-seqthink` target for container builds

## Testing

Successfully builds with:
```
make build-seqthink
```
Binary successfully starts and waits for MCP server connection.

## Next Steps

Beat 2 will add:
- Age encryption/decryption (pkg/seqthink/ageio)
- Content-Type: application/age enforcement
- SSE streaming with encrypted frames
- Golden tests for crypto round-trips

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-13 08:35:43 +11:00
parent dd8be05e9c
commit 3ce9811826
11 changed files with 2424 additions and 9 deletions
--- a/docs/SEQUENTIAL-THINKING-INTEGRATION-PLAN.md
+++ b/docs/SEQUENTIAL-THINKING-INTEGRATION-PLAN.md
@@ -0,0 +1,579 @@
+# Sequential Thinking Integration Plan for CHORUS Agents
+
+**Date**: 2025-10-13
+**Status**: Design Complete - Ready for Implementation
+**Priority**: High - Blocking further intelligence improvements
+
+---
+
+## Executive Summary
+
+This document outlines the integration of the Sequential Thinking MCP server into CHORUS agents to enable **structured, multi-step reasoning** before task execution. It addresses the limitation tracked in the SequentialThinkingForCHORUS repository issue and is a prerequisite for more advanced agent decision-making.
+
+**Problem Statement**: CHORUS agents currently use simple prompt-response cycles without structured reasoning, limiting their ability to handle complex tasks requiring multi-step analysis, hypothesis generation, and iterative refinement.
+
+**Solution**: Integrate the `mcp__sequential-thinking__sequentialthinking` MCP tool into the AI provider layer to enable chain-of-thought reasoning for complex tasks.
+
+---
+
+## Current Architecture Analysis
+
+### 1. Existing AI Provider Flow
+
+```
+TaskRequest → ModelProvider.ExecuteTask() → TaskResponse
+                    ↓
+          [Single LLM Call]
+                    ↓
+            Response String
+```
+
+**Current Providers**:
+- **OllamaProvider**: Local model execution
+- **ResetDataProvider**: ResetData LaaS API
+- **OpenAIProvider**: OpenAI API
+
+**Current Limitations**:
+- ✗ No structured reasoning process
+- ✗ No ability to revise initial thoughts
+- ✗ No hypothesis generation and verification
+- ✗ No branching for alternative approaches
+- ✗ Simple string reasoning field (not structured)
+
+### 2. TaskResponse Structure
+
+**Location**: `/home/tony/chorus/project-queues/active/CHORUS/pkg/ai/provider.go:53-78`
+
+```go
+type TaskResponse struct {
+    Success      bool                   `json:"success"`
+    TaskID       string                 `json:"task_id"`
+    Response     string                 `json:"response"`
+    Reasoning    string                 `json:"reasoning,omitempty"`  // ← Simple string
+    Actions      []TaskAction           `json:"actions,omitempty"`
+    Artifacts    []Artifact             `json:"artifacts,omitempty"`
+    TokensUsed   TokenUsage             `json:"tokens_used,omitempty"`
+    // ... other fields
+}
+```
+
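+**Opportunity**: The `Reasoning` field already carries a free-text explanation, so structured thinking output can be added alongside it while the existing string stays in place for current consumers.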
+
+---
+
+## Sequential Thinking MCP Tool
+
+### Tool Signature
+
+```
+mcp__sequential-thinking__sequentialthinking(
+    thought: string,
+    nextThoughtNeeded: bool,
+    thoughtNumber: int,
+    totalThoughts: int,
+    isRevision: bool = false,
+    revisesThought: int = null,
+    branchFromThought: int = null,
+    branchId: string = null,
+    needsMoreThoughts: bool = false
+)
+```
+
+### Capabilities
+
+1. **Adaptive Thinking**: Adjust `totalThoughts` up or down as understanding deepens
+2. **Revision Support**: Question and revise previous thoughts (`isRevision`, `revisesThought`)
+3. **Branching**: Explore alternative approaches (`branchFromThought`, `branchId`)
+4. **Hypothesis Testing**: Generate and verify hypotheses in chain-of-thought
+5. **Uncertainty Expression**: Express and work through unclear aspects
+6. **Context Maintenance**: Keep track of all previous thoughts
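
A minimal Go sketch of how the arguments for a single tool call could be modelled and serialized. The JSON field names come from the signature above; the type name `ThoughtArgs`, the helpers `NewRevision` and `Encode`, and the choice of pointers plus `omitempty` for the optional integers are assumptions of this sketch, not part of the plan.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ThoughtArgs mirrors the tool parameters from the signature above.
// Optional integers are pointers so that "unset" is distinguishable from 0
// (an assumption of this sketch).
type ThoughtArgs struct {
	Thought           string `json:"thought"`
	NextThoughtNeeded bool   `json:"nextThoughtNeeded"`
	ThoughtNumber     int    `json:"thoughtNumber"`
	TotalThoughts     int    `json:"totalThoughts"`
	IsRevision        bool   `json:"isRevision,omitempty"`
	RevisesThought    *int   `json:"revisesThought,omitempty"`
	BranchFromThought *int   `json:"branchFromThought,omitempty"`
	BranchID          string `json:"branchId,omitempty"`
	NeedsMoreThoughts bool   `json:"needsMoreThoughts,omitempty"`
}

// NewRevision builds the arguments for a thought that revises an earlier one.
func NewRevision(number, total, revises int, text string) ThoughtArgs {
	return ThoughtArgs{
		Thought:           text,
		NextThoughtNeeded: true,
		ThoughtNumber:     number,
		TotalThoughts:     total,
		IsRevision:        true,
		RevisesThought:    &revises,
	}
}

// Encode serializes the arguments to JSON, omitting unset optional fields.
func Encode(a ThoughtArgs) (string, error) {
	b, err := json.Marshal(a)
	return string(b), err
}

func main() {
	s, err := Encode(NewRevision(3, 5, 2, "Migration strategy needs consideration"))
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

Keeping the optional fields out of the payload unless they are set means a plain, non-revising thought serializes to just the four required parameters.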
+
+### When to Use
+
+- **Complex problem decomposition**
+- **Multi-step solution planning**
+- **Problems requiring course correction**
+- **Unclear scope requiring exploration**
+- **Tasks needing context over multiple steps**
+- **Filtering irrelevant information**
+
+---
+
+## Proposed Integration Architecture
+
+### Phase 1: Enhanced TaskResponse Structure
+
+**File**: `pkg/ai/provider.go`
+
+```go
+// StructuredReasoning represents chain-of-thought reasoning process
+type StructuredReasoning struct {
+    Thoughts         []ThoughtStep     `json:"thoughts"`
+    FinalHypothesis  string            `json:"final_hypothesis,omitempty"`
+    VerificationSteps []string         `json:"verification_steps,omitempty"`
+    Confidence       float32           `json:"confidence"`        // 0.0-1.0
+    TotalRevisions   int               `json:"total_revisions"`
+    BranchesExplored int               `json:"branches_explored"`
+}
+
+// ThoughtStep represents a single step in the reasoning process
+type ThoughtStep struct {
+    Number          int               `json:"number"`
+    Content         string            `json:"content"`
+    IsRevision      bool              `json:"is_revision"`
+    RevisesThought  int               `json:"revises_thought,omitempty"`
+    BranchID        string            `json:"branch_id,omitempty"`
+    BranchFrom      int               `json:"branch_from,omitempty"`
+    Timestamp       time.Time         `json:"timestamp"`
+}
+
+// TaskResponse update
+type TaskResponse struct {
+    // ... existing fields ...
+    Reasoning           string               `json:"reasoning,omitempty"` // Legacy simple string
+    StructuredReasoning *StructuredReasoning `json:"structured_reasoning,omitempty"` // NEW
+    // ... rest of fields ...
+}
+```
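
One way the `TotalRevisions` and `BranchesExplored` counters could be derived from the recorded thoughts, sketched in Go. `Step` is a trimmed, hypothetical stand-in for `ThoughtStep`, and `Summarize` is an illustrative helper; counting distinct branch IDs (rather than branch thoughts) is an assumption about what "branches explored" should mean.

```go
package main

import "fmt"

// Step is a trimmed stand-in for ThoughtStep (hypothetical; only the fields used here).
type Step struct {
	Number     int
	IsRevision bool
	BranchID   string
}

// Summarize counts revising thoughts and distinct branch IDs in a thought list.
func Summarize(steps []Step) (revisions, branches int) {
	seen := make(map[string]struct{})
	for _, s := range steps {
		if s.IsRevision {
			revisions++
		}
		if s.BranchID != "" {
			seen[s.BranchID] = struct{}{}
		}
	}
	return revisions, len(seen)
}

func main() {
	steps := []Step{
		{Number: 1},
		{Number: 2},
		{Number: 3, IsRevision: true},
		{Number: 4, BranchID: "event-sourcing"},
		{Number: 5, BranchID: "event-sourcing"},
	}
	r, b := Summarize(steps)
	fmt.Printf("revisions=%d branches=%d\n", r, b)
}
```

Deriving the counters from the thought list avoids keeping two sources of truth in the response.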
+
+### Phase 2: Sequential Thinking Wrapper
+
+**New File**: `pkg/ai/sequential_thinking.go`
+
+```go
+package ai
+
+import (
+    "context"
+    "errors"
+)
+
+// SequentialThinkingEngine wraps MCP sequential thinking tool
+type SequentialThinkingEngine struct {
+    mcpClient MCPClient // Interface to MCP tool (to be defined in pkg/ai/mcp_client.go)
+}
+
+// ThinkingRequest represents input for sequential thinking
+type ThinkingRequest struct {
+    Problem        string
+    Context        map[string]interface{}
+    MaxThoughts    int
+    AllowRevisions bool
+    AllowBranching bool
+}
+
+// ThinkingResult represents output from sequential thinking
+type ThinkingResult struct {
+    Thoughts        []ThoughtStep
+    FinalConclusion string
+    Confidence      float32
+    ReasoningPath   string // Markdown summary of thinking process
+}
+
+// Think executes sequential thinking process
+func (e *SequentialThinkingEngine) Think(ctx context.Context, req *ThinkingRequest) (*ThinkingResult, error) {
+    // Planned implementation (Phase 2):
+    // 1. Initialize thinking with problem statement
+    // 2. Iteratively call MCP tool until nextThoughtNeeded = false
+    // 3. Track all thoughts, revisions, branches
+    // 4. Generate final conclusion and reasoning summary
+    // 5. Return structured result
+    return nil, errors.New("sequential thinking: Think not implemented yet")
+}
+```
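
The iteration described in the `Think` comments can be sketched as a bounded loop. This is only an illustration under stated assumptions: `ThoughtCaller`, `Thought`, `RunThinking`, `scriptedCaller` and `runScripted` are hypothetical names, and the real `MCPClient` interface is not specified in this plan.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Thought is a hypothetical, trimmed reply from one tool call.
type Thought struct {
	Content           string
	NextThoughtNeeded bool
}

// ThoughtCaller is an assumed interface standing in for the MCP client.
type ThoughtCaller interface {
	NextThought(ctx context.Context, problem string, number int) (Thought, error)
}

// RunThinking calls the tool until it reports completion or maxThoughts is reached.
func RunThinking(ctx context.Context, c ThoughtCaller, problem string, maxThoughts int) ([]Thought, error) {
	if maxThoughts <= 0 {
		return nil, errors.New("maxThoughts must be positive")
	}
	var thoughts []Thought
	for n := 1; n <= maxThoughts; n++ {
		if err := ctx.Err(); err != nil {
			return thoughts, err // cancelled or timed out; return what was collected
		}
		t, err := c.NextThought(ctx, problem, n)
		if err != nil {
			return thoughts, fmt.Errorf("thought %d: %w", n, err)
		}
		thoughts = append(thoughts, t)
		if !t.NextThoughtNeeded {
			break
		}
	}
	return thoughts, nil
}

// scriptedCaller is a test double that finishes after a fixed number of thoughts.
type scriptedCaller struct{ finishAt int }

func (s scriptedCaller) NextThought(_ context.Context, _ string, n int) (Thought, error) {
	return Thought{Content: fmt.Sprintf("step %d", n), NextThoughtNeeded: n < s.finishAt}, nil
}

// runScripted runs the loop against the test double and returns the thought count.
func runScripted(finishAt, maxThoughts int) (int, error) {
	ts, err := RunThinking(context.Background(), scriptedCaller{finishAt: finishAt}, "design a cache", maxThoughts)
	return len(ts), err
}

func main() {
	n, err := runScripted(3, 10)
	fmt.Println(n, err)
}
```

The hard cap matters because the tool may keep requesting more thoughts; the loop stops at `maxThoughts` even if `NextThoughtNeeded` is still true.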
+
+### Phase 3: Provider Integration
+
+**Modified File**: `pkg/ai/resetdata.go`
+
+```go
+// ExecuteTask with sequential thinking
+func (p *ResetDataProvider) ExecuteTask(ctx context.Context, request *TaskRequest) (*TaskResponse, error) {
+    startTime := time.Now()
+
+    // Determine if task requires sequential thinking
+    useSequentialThinking := p.shouldUseSequentialThinking(request)
+
+    var structuredReasoning *StructuredReasoning
+    var enhancedPrompt string
+
+    if useSequentialThinking {
+        // Use sequential thinking engine to analyze task first
+        thinkingEngine := NewSequentialThinkingEngine(p.mcpClient)
+
+        thinkingResult, err := thinkingEngine.Think(ctx, &ThinkingRequest{
+            Problem:        p.formatTaskAsProblem(request),
+            Context:        request.Context,
+            MaxThoughts:    10,
+            AllowRevisions: true,
+            AllowBranching: true,
+        })
+
+        if err != nil {
+            // Fall back to direct execution if thinking fails
+            log.Warn().Err(err).Msg("Sequential thinking failed, falling back to direct execution")
+        } else {
+            // Use thinking result to enhance prompt
+            enhancedPrompt = p.buildPromptWithThinking(request, thinkingResult)
+            structuredReasoning = convertToStructuredReasoning(thinkingResult)
+        }
+    }
+
+    // Execute with enhanced prompt (if available) or standard prompt
+    messages, err := p.buildChatMessages(request, enhancedPrompt)
+    if err != nil {
+        return nil, fmt.Errorf("build chat messages: %w", err)
+    }
+
+    // ... rest of execution ...
+
+    return &TaskResponse{
+        Success:             true,
+        Response:            responseText,
+        Reasoning:           legacyReasoningString,
+        StructuredReasoning: structuredReasoning, // NEW
+        // ... rest of response ...
+    }, nil
+}
+
+// shouldUseSequentialThinking determines if task warrants sequential thinking
+func (p *ResetDataProvider) shouldUseSequentialThinking(request *TaskRequest) bool {
+    // Use sequential thinking for:
+    // - High complexity tasks (complexity >= 7)
+    // - Architect and senior-developer roles (system design work)
+    // - Tasks with design or planning keywords in the title or description
+
+    if request.Complexity >= 7 {
+        return true
+    }
+
+    role := strings.ToLower(request.AgentRole)
+    if role == "architect" || role == "senior-developer" {
+        return true
+    }
+
+    keywords := []string{"design", "architecture", "refactor", "plan", "strategy"}
+    taskText := strings.ToLower(request.TaskTitle + " " + request.TaskDescription)
+    for _, keyword := range keywords {
+        if strings.Contains(taskText, keyword) {
+            return true
+        }
+    }
+
+    return false
+}
+```
+
+---
+
+## Implementation Phases
+
+### Phase 1: Foundation (Days 1-2)
+
+**Tasks**:
+1. Define `StructuredReasoning` and `ThoughtStep` types
+2. Add `StructuredReasoning` field to `TaskResponse`
+3. Create `SequentialThinkingEngine` skeleton
+4. Add MCP client interface for sequential-thinking tool
+
+**Files to Create/Modify**:
+- `pkg/ai/provider.go` - Add new types
+- `pkg/ai/sequential_thinking.go` - New file
+- `pkg/ai/mcp_client.go` - New file for MCP integration
+
+**Success Criteria**:
+- Code compiles without errors
+- Types are properly defined
+- MCP client interface is clear
+
+### Phase 2: Sequential Thinking Engine (Days 3-5)
+
+**Tasks**:
+1. Implement `SequentialThinkingEngine.Think()` method
+2. Implement MCP tool call wrapper
+3. Add thought tracking and revision detection
+4. Implement branch management
+5. Generate reasoning summaries
+6. Write unit tests
+
+**Files**:
+- `pkg/ai/sequential_thinking.go` - Full implementation
+- `pkg/ai/sequential_thinking_test.go` - Unit tests
+
+**Success Criteria**:
+- Can execute complete thinking cycles
+- Properly tracks revisions and branches
+- Generates clear reasoning summaries
+- All unit tests pass
+
+### Phase 3: Provider Integration (Days 6-8)
+
+**Tasks**:
+1. Modify `ResetDataProvider.ExecuteTask()` for sequential thinking
+2. Implement `shouldUseSequentialThinking()` heuristics
+3. Add prompt enhancement with thinking results
+4. Implement fallback for thinking failures
+5. Add configuration options
+6. Write integration tests
+
+**Files**:
+- `pkg/ai/resetdata.go` - Modify ExecuteTask
+- `pkg/ai/ollama.go` - Same modifications
+- `config/agent.yaml` - Add sequential thinking config
+
+**Success Criteria**:
+- Complex tasks trigger sequential thinking
+- Thinking results enhance task execution
+- Graceful fallback on failures
+- Integration tests pass
+
+### Phase 4: Testing & Validation (Days 9-10)
+
+**Tasks**:
+1. End-to-end testing with real councils
+2. Test with various complexity levels
+3. Validate reasoning quality improvements
+4. Performance benchmarking
+5. Documentation updates
+
+**Test Cases**:
+- Simple task (complexity=3) → No sequential thinking
+- Complex task (complexity=8) → Sequential thinking enabled
+- Architect role → Always uses sequential thinking
+- Design task → Sequential thinking with branching
+- Fallback scenario → Graceful degradation
+
+**Success Criteria**:
+- Demonstrable improvement in task quality
+- Acceptable performance overhead (<30% increase in latency)
+- Clear reasoning traces in artifacts
+- Documentation complete
+
+---
+
+## Configuration
+
+### Agent Configuration
+
+**File**: `config/agent.yaml`
+
+```yaml
+ai_providers:
+  resetdata:
+    type: "resetdata"
+    endpoint: "${RESETDATA_API_ENDPOINT}"
+    api_key: "${RESETDATA_API_KEY}"
+    default_model: "llama3.1:70b"
+
+    # Sequential thinking configuration
+    enable_sequential_thinking: true
+    sequential_thinking:
+      min_complexity: 7           # Minimum complexity to trigger
+      force_for_roles:            # Always use for these roles
+        - architect
+        - senior-developer
+      max_thoughts: 15            # Maximum thinking iterations
+      enable_revisions: true      # Allow thought revisions
+      enable_branching: true      # Allow exploring alternatives
+      confidence_threshold: 0.7   # Minimum confidence for final answer
+```
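
A sketch of how the trigger decision could read its thresholds from this configuration instead of hard-coding them. The Go type `ThinkingConfig`, its field names, and the method `ShouldThink` are assumptions; loading the YAML itself is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// ThinkingConfig mirrors the sequential_thinking keys above (Go names are assumptions).
type ThinkingConfig struct {
	Enabled       bool     // enable_sequential_thinking
	MinComplexity int      // min_complexity
	ForceForRoles []string // force_for_roles
	MaxThoughts   int      // max_thoughts
}

// ShouldThink applies the configured threshold and role list.
func (c ThinkingConfig) ShouldThink(complexity int, role string) bool {
	if !c.Enabled {
		return false
	}
	if complexity >= c.MinComplexity {
		return true
	}
	for _, r := range c.ForceForRoles {
		if strings.EqualFold(r, role) {
			return true
		}
	}
	return false
}

func main() {
	cfg := ThinkingConfig{
		Enabled:       true,
		MinComplexity: 7,
		ForceForRoles: []string{"architect", "senior-developer"},
		MaxThoughts:   15,
	}
	fmt.Println(cfg.ShouldThink(3, "developer")) // below threshold, role not forced
	fmt.Println(cfg.ShouldThink(8, "developer")) // meets the complexity threshold
	fmt.Println(cfg.ShouldThink(3, "Architect")) // forced role, case-insensitive match
}
```

With the values shown, a complexity-3 developer task skips sequential thinking, while a complexity-8 task or any architect task triggers it.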
+
+### Runtime Toggle
+
+Allow runtime control via the council brief. Setting `use_sequential_thinking` explicitly overrides the configured heuristics:
+
+```json
+{
+  "task_id": "task-123",
+  "complexity": 8,
+  "use_sequential_thinking": true,
+  "thinking_config": {
+    "max_thoughts": 20,
+    "allow_branching": true
+  }
+}
+```
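
A possible way to honour the override in Go, assuming the brief arrives as JSON. `Brief` and `Decide` are hypothetical names; the pointer field is what lets the code tell "explicitly false" apart from "not specified".

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Brief holds the brief fields used here; a nil pointer means "not specified".
type Brief struct {
	TaskID                string `json:"task_id"`
	Complexity            int    `json:"complexity"`
	UseSequentialThinking *bool  `json:"use_sequential_thinking"`
}

// Decide lets an explicit override win and otherwise falls back to the
// complexity threshold.
func Decide(raw string, minComplexity int) (bool, error) {
	var b Brief
	if err := json.Unmarshal([]byte(raw), &b); err != nil {
		return false, fmt.Errorf("parse brief: %w", err)
	}
	if b.UseSequentialThinking != nil {
		return *b.UseSequentialThinking, nil
	}
	return b.Complexity >= minComplexity, nil
}

func main() {
	forcedOff := `{"task_id":"task-123","complexity":8,"use_sequential_thinking":false}`
	unset := `{"task_id":"task-124","complexity":8}`

	off, err := Decide(forcedOff, 7)
	fmt.Println(off, err)
	on, err := Decide(unset, 7)
	fmt.Println(on, err)
}
```

A plain `bool` field would default to `false` when the key is absent and silently disable thinking for every brief that omits it.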
+
+---
+
+## Benefits & Expected Improvements
+
+### 1. Better Problem Decomposition
+
+**Before**:
+```
+Agent: Here's my solution [immediately provides implementation]
+```
+
+**After**:
+```
+Thought 1: Breaking down the task into 3 main components...
+Thought 2: Component A requires database schema changes...
+Thought 3: Wait, revising thought 2 - migration strategy needs consideration...
+Thought 4: Exploring alternative: event sourcing vs direct updates...
+Thought 5: Event sourcing better for audit trail requirements...
+Final: Implementation plan with 5 concrete steps...
+```
+
+### 2. Improved Architecture Decisions
+
+Architect agents can:
+- Explore multiple design alternatives
+- Revise decisions based on discovered constraints
+- Build and verify hypotheses about scalability
+- Document reasoning trail for future reference
+
+### 3. Higher Quality Code
+
+Developer agents can:
+- Think through edge cases before coding
+- Consider multiple implementation approaches
+- Revise initial assumptions
+- Plan testing strategy upfront
+
+### 4. Debugging Enhancement
+
+When tasks fail:
+- Reasoning traces show where agent went wrong
+- Can identify flawed assumptions
+- Easier to improve prompts and heuristics
+
+---
+
+## Performance Considerations
+
+### 1. Latency Impact
+
+**Estimated Overhead**:
+- Sequential thinking: 5-15 LLM calls (vs 1 direct call)
+- Expected latency increase: 10-30 seconds for complex tasks
+- **Mitigation**: Only use for high-complexity tasks (complexity >= 7)
+
+### 2. Token Usage
+
+**Estimated Increase**:
+- Each thought: ~200-500 tokens
+- 10 thoughts: ~2000-5000 additional tokens
+- **Mitigation**: Set reasonable `max_thoughts` limits
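
The per-thought range above gives a simple linear estimate, sketched here in Go. `EstimateTokens` is an illustrative helper; it covers only the tokens of the thoughts themselves, so if each call resends earlier thoughts as context, actual usage may grow faster than this.

```go
package main

import "fmt"

// EstimateTokens returns the low and high bound of extra tokens for a number
// of thoughts, using the 200-500 tokens-per-thought range quoted above.
func EstimateTokens(thoughts int) (low, high int) {
	const perThoughtLow, perThoughtHigh = 200, 500
	return thoughts * perThoughtLow, thoughts * perThoughtHigh
}

func main() {
	for _, n := range []int{5, 10, 15} {
		lo, hi := EstimateTokens(n)
		fmt.Printf("%2d thoughts: %d-%d extra tokens\n", n, lo, hi)
	}
}
```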
+
+### 3. Resource Requirements
+
+**MCP Server**:
+- Sequential thinking MCP server must be available
+- Requires proper error handling and fallback
+
+---
+
+## Risks & Mitigations
+
+| Risk | Impact | Mitigation |
+|------|--------|------------|
+| MCP server unavailable | High | Graceful fallback to direct execution |
+| Increased latency unacceptable | Medium | Make sequential thinking opt-in per task |
+| Token cost explosion | Medium | Set hard limits on max_thoughts |
+| Reasoning doesn't improve quality | High | A/B testing with metrics |
+| Complex implementation | Medium | Phased rollout with testing |
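
The first two mitigations (graceful fallback, bounded latency) could be combined by giving the thinking step its own time budget. A minimal sketch, assuming the thinking call accepts a `context.Context`; `thinkFunc`, `PlanWithFallback`, the sample functions and the 20 ms budget are all hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// thinkFunc stands in for the thinking engine call (hypothetical signature).
type thinkFunc func(ctx context.Context) (string, error)

// PlanWithFallback bounds the thinking step with a timeout. When thinking
// fails or times out it reports usedThinking=false so the caller can fall
// back to direct execution.
func PlanWithFallback(ctx context.Context, think thinkFunc, budget time.Duration) (plan string, usedThinking bool) {
	ctx, cancel := context.WithTimeout(ctx, budget)
	defer cancel()

	plan, err := think(ctx)
	if err != nil {
		return "", false
	}
	return plan, true
}

// slowThink simulates an MCP call that takes longer than the budget.
func slowThink(ctx context.Context) (string, error) {
	select {
	case <-time.After(time.Second):
		return "late plan", nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// failingThink simulates an unavailable MCP server.
func failingThink(context.Context) (string, error) {
	return "", errors.New("mcp server unavailable")
}

// quickThink simulates a thinking call that completes in time.
func quickThink(context.Context) (string, error) {
	return "1. split the task", nil
}

// demo reports whether thinking was used under a 20 ms budget.
func demo(think thinkFunc) bool {
	_, used := PlanWithFallback(context.Background(), think, 20*time.Millisecond)
	return used
}

func main() {
	fmt.Println("slow:", demo(slowThink))
	fmt.Println("failing:", demo(failingThink))
	fmt.Println("quick:", demo(quickThink))
}
```

Only the quick call reports that thinking was used; the slow and failing calls both signal a fallback, so the task still runs with the standard prompt.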
+
+---
+
+## Success Metrics
+
+### Quantitative
+
+1. **Task Success Rate**: Compare before/after for complexity >= 7 tasks
+   - Target: +15% improvement
+2. **Code Quality**: Static analysis scores for generated code
+   - Target: +20% improvement in complexity score
+3. **PR Acceptance Rate**: How many agent PRs get merged
+   - Target: +25% improvement
+4. **Latency**: Task execution time
+   - Acceptable: <30% increase for complex tasks
+
+### Qualitative
+
+1. **Reasoning Quality**: Human review of reasoning traces
+2. **Decision Clarity**: Can humans understand agent's thought process?
+3. **Developer Feedback**: Easier to debug failed tasks?
+
+---
+
+## Rollout Plan
+
+### Stage 1: Internal Testing (Week 1)
+
+- Deploy to development environment
+- Test with synthetic tasks
+- Gather performance metrics
+- Refine heuristics
+
+### Stage 2: Limited Production (Week 2)
+
+- Enable for architect role only
+- Enable for complexity >= 9 only
+- Monitor closely
+- Collect feedback
+
+### Stage 3: Expanded Rollout (Week 3-4)
+
+- Enable for all roles with complexity >= 7
+- Add complexity-based opt-in
+- Full production deployment
+- Continuous monitoring
+
+### Stage 4: Optimization (Week 5+)
+
+- Fine-tune heuristics based on data
+- Optimize thought limits
+- Improve reasoning summaries
+- Add advanced features (e.g., multi-agent reasoning)
+
+---
+
+## Future Enhancements
+
+### 1. Multi-Agent Reasoning
+
+Multiple agents can contribute thoughts to same reasoning chain:
+- Architect proposes design
+- Security agent reviews security implications
+- Performance agent analyzes scalability
+
+### 2. Reasoning Templates
+
+Pre-defined thinking patterns for common scenarios:
+- API design checklist
+- Security review framework
+- Performance optimization workflow
+
+### 3. Learning from Reasoning
+
+Store successful reasoning patterns:
+- Build knowledge base of good reasoning traces
+- Use as examples in future tasks
+- Identify common pitfalls
+
+### 4. Visualization
+
+Dashboard showing reasoning graphs:
+- Thought flow diagrams
+- Revision history
+- Branch exploration trees
+- Confidence evolution
+
+---
+
+## References
+
+- **SequentialThinkingForCHORUS Issue**: (Repository in GITEA)
+- **MCP Sequential Thinking Tool**: Available in Claude Code MCP servers
+- **CHORUS Task Execution**: `/home/tony/chorus/project-queues/active/CHORUS/pkg/execution/engine.go`
+- **AI Provider Interface**: `/home/tony/chorus/project-queues/active/CHORUS/pkg/ai/provider.go`
+- **ResetData Provider**: `/home/tony/chorus/project-queues/active/CHORUS/pkg/ai/resetdata.go`
+
+---
+
+## Document Info
+
+- **Created**: 2025-10-13
+- **Author**: Claude Code
+- **Status**: Design Complete - Ready for Implementation
+- **Next Steps**: Begin Phase 1 implementation
+