Implement Beat 1: Sequential Thinking Age-Encrypted Wrapper (Skeleton)

This commit completes Beat 1 of the SequentialThinkingForCHORUS implementation,
providing a functional plaintext skeleton for the age-encrypted wrapper.

## Deliverables

### 1. Main Wrapper Entry Point
- `cmd/seqthink-wrapper/main.go`: HTTP server on :8443
- Configuration loading from environment variables
- Graceful shutdown handling
- MCP server readiness checking with timeout

### 2. MCP Client Package
- `pkg/seqthink/mcpclient/client.go`: HTTP client for MCP server
- Communicates with MCP server on localhost:8000
- Health check endpoint
- Tool call endpoint with 120s timeout

### 3. Proxy Server Package
- `pkg/seqthink/proxy/server.go`: HTTP handlers for wrapper
- Health and readiness endpoints
- Tool call proxy (plaintext for Beat 1)
- SSE endpoint placeholder
- Metrics endpoint integration

### 4. Observability Package
- `pkg/seqthink/observability/logger.go`: Structured logging with zerolog
- `pkg/seqthink/observability/metrics.go`: Prometheus metrics
- Counters for requests, errors, decrypt/encrypt failures, policy denials
- Request duration histogram

### 5. Docker Infrastructure
- `deploy/seqthink/Dockerfile`: Multi-stage build
- `deploy/seqthink/entrypoint.sh`: Startup orchestration
- `deploy/seqthink/mcp_stub.py`: Minimal MCP server for testing

### 6. Build System Integration
- Updated `Makefile` with `build-seqthink` target
- Uses GOWORK=off and -mod=mod for clean builds
- `docker-seqthink` target for container builds

## Testing

Successfully builds with:
```
make build-seqthink
```

Binary successfully starts and waits for MCP server connection.

## Next Steps

Beat 2 will add:
- Age encryption/decryption (pkg/seqthink/ageio)
- Content-Type: application/age enforcement
- SSE streaming with encrypted frames
- Golden tests for crypto round-trips

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
anthonyrawlins
2025-10-13 08:35:43 +11:00
parent dd8be05e9c
commit 3ce9811826
11 changed files with 2424 additions and 9 deletions


@@ -0,0 +1,579 @@
# Sequential Thinking Integration Plan for CHORUS Agents
**Date**: 2025-10-13
**Status**: Design Phase
**Priority**: High - Blocking further intelligence improvements
---
## Executive Summary
This document outlines the integration of the Sequential Thinking MCP server into CHORUS agents to enable **structured, multi-step reasoning** before task execution. It addresses the limitation described in the SequentialThinkingForCHORUS repository issue and unlocks more advanced agent decision-making.
**Problem Statement**: CHORUS agents currently use simple prompt-response cycles without structured reasoning, limiting their ability to handle complex tasks requiring multi-step analysis, hypothesis generation, and iterative refinement.
**Solution**: Integrate the `mcp__sequential-thinking__sequentialthinking` MCP tool into the AI provider layer to enable chain-of-thought reasoning for complex tasks.
---
## Current Architecture Analysis
### 1. Existing AI Provider Flow
```
TaskRequest → ModelProvider.ExecuteTask() → TaskResponse
[Single LLM Call]
Response String
```
**Current Providers**:
- **OllamaProvider**: Local model execution
- **ResetDataProvider**: ResetData LaaS API
- **OpenAIProvider**: OpenAI API
**Current Limitations**:
- ✗ No structured reasoning process
- ✗ No ability to revise initial thoughts
- ✗ No hypothesis generation and verification
- ✗ No branching for alternative approaches
- ✗ Simple string reasoning field (not structured)
### 2. TaskResponse Structure
**Location**: `/home/tony/chorus/project-queues/active/CHORUS/pkg/ai/provider.go:53-78`
```go
type TaskResponse struct {
	Success    bool         `json:"success"`
	TaskID     string       `json:"task_id"`
	Response   string       `json:"response"`
	Reasoning  string       `json:"reasoning,omitempty"` // ← Simple string
	Actions    []TaskAction `json:"actions,omitempty"`
	Artifacts  []Artifact   `json:"artifacts,omitempty"`
	TokensUsed TokenUsage   `json:"tokens_used,omitempty"`
	// ... other fields
}
```
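**Opportunity**: The `Reasoning` field already carries reasoning text, so it can hold a readable summary of the thinking process, while a new structured field can preserve the individual thoughts.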
---
## Sequential Thinking MCP Tool
### Tool Signature
```
mcp__sequential-thinking__sequentialthinking(
    thought: string,
    nextThoughtNeeded: bool,
    thoughtNumber: int,
    totalThoughts: int,
    isRevision: bool = false,
    revisesThought: int = null,
    branchFromThought: int = null,
    branchId: string = null,
    needsMoreThoughts: bool = false
)
```
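The signature above is pseudocode. A minimal Go sketch of the same arguments as a struct follows, assuming the JSON field names match the parameter names exactly and that the nullable integers are best modelled as pointers. `SequentialThinkingArgs` and `encodeArgs` are hypothetical names, not part of the CHORUS codebase.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SequentialThinkingArgs mirrors the tool parameters listed above.
// Assumption: JSON names equal the parameter names in the signature.
type SequentialThinkingArgs struct {
	Thought           string `json:"thought"`
	NextThoughtNeeded bool   `json:"nextThoughtNeeded"`
	ThoughtNumber     int    `json:"thoughtNumber"`
	TotalThoughts     int    `json:"totalThoughts"`
	IsRevision        bool   `json:"isRevision,omitempty"`
	RevisesThought    *int   `json:"revisesThought,omitempty"`    // nil means "not set"
	BranchFromThought *int   `json:"branchFromThought,omitempty"` // nil means "not set"
	BranchID          string `json:"branchId,omitempty"`
	NeedsMoreThoughts bool   `json:"needsMoreThoughts,omitempty"`
}

// encodeArgs serializes the arguments for a tool call.
func encodeArgs(a SequentialThinkingArgs) (string, error) {
	b, err := json.Marshal(a)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	revises := 2
	out, err := encodeArgs(SequentialThinkingArgs{
		Thought:           "Migration strategy needs consideration",
		NextThoughtNeeded: true,
		ThoughtNumber:     3,
		TotalThoughts:     5,
		IsRevision:        true,
		RevisesThought:    &revises,
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Pointers with `omitempty` keep unset optional fields out of the payload, so a plain thought serializes to only the four required fields.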
### Capabilities
1. **Adaptive Thinking**: Adjust `totalThoughts` up or down as understanding deepens
2. **Revision Support**: Question and revise previous thoughts (`isRevision`, `revisesThought`)
3. **Branching**: Explore alternative approaches (`branchFromThought`, `branchId`)
4. **Hypothesis Testing**: Generate and verify hypotheses in chain-of-thought
5. **Uncertainty Expression**: Express and work through unclear aspects
6. **Context Maintenance**: Keep track of all previous thoughts
### When to Use
- **Complex problem decomposition**
- **Multi-step solution planning**
- **Problems requiring course correction**
- **Unclear scope requiring exploration**
- **Tasks needing context over multiple steps**
- **Filtering irrelevant information**
---
## Proposed Integration Architecture
### Phase 1: Enhanced TaskResponse Structure
**File**: `pkg/ai/provider.go`
```go
// StructuredReasoning represents chain-of-thought reasoning process
type StructuredReasoning struct {
	Thoughts          []ThoughtStep `json:"thoughts"`
	FinalHypothesis   string        `json:"final_hypothesis,omitempty"`
	VerificationSteps []string      `json:"verification_steps,omitempty"`
	Confidence        float32       `json:"confidence"` // 0.0-1.0
	TotalRevisions    int           `json:"total_revisions"`
	BranchesExplored  int           `json:"branches_explored"`
}

// ThoughtStep represents a single step in the reasoning process
type ThoughtStep struct {
	Number         int       `json:"number"`
	Content        string    `json:"content"`
	IsRevision     bool      `json:"is_revision"`
	RevisesThought int       `json:"revises_thought,omitempty"`
	BranchID       string    `json:"branch_id,omitempty"`
	BranchFrom     int       `json:"branch_from,omitempty"`
	Timestamp      time.Time `json:"timestamp"`
}

// TaskResponse update
type TaskResponse struct {
	// ... existing fields ...
	Reasoning           string               `json:"reasoning,omitempty"`            // Legacy simple string
	StructuredReasoning *StructuredReasoning `json:"structured_reasoning,omitempty"` // NEW
	// ... rest of fields ...
}
```
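The `TotalRevisions` and `BranchesExplored` counters can be derived from the thought list rather than tracked separately. The sketch below shows one way to do that, assuming a branch is identified by a non-empty `BranchID`. `countRevisionsAndBranches` is a hypothetical helper, and `ThoughtStep` is a trimmed copy of the proposed type.

```go
package main

import "fmt"

// ThoughtStep is a trimmed copy of the proposed type (timestamp omitted).
type ThoughtStep struct {
	Number         int
	Content        string
	IsRevision     bool
	RevisesThought int
	BranchID       string
	BranchFrom     int
}

// countRevisionsAndBranches returns the number of revision steps and the
// number of distinct branch IDs found in the thought list.
func countRevisionsAndBranches(thoughts []ThoughtStep) (revisions, branches int) {
	ids := make(map[string]struct{})
	for _, t := range thoughts {
		if t.IsRevision {
			revisions++
		}
		if t.BranchID != "" {
			ids[t.BranchID] = struct{}{}
		}
	}
	return revisions, len(ids)
}

func main() {
	thoughts := []ThoughtStep{
		{Number: 1, Content: "Break the task into components"},
		{Number: 2, Content: "Component A needs schema changes"},
		{Number: 3, Content: "Migration strategy matters", IsRevision: true, RevisesThought: 2},
		{Number: 4, Content: "Try event sourcing", BranchID: "event-sourcing", BranchFrom: 2},
		{Number: 5, Content: "Audit trail favours it", BranchID: "event-sourcing", BranchFrom: 2},
	}
	revisions, branches := countRevisionsAndBranches(thoughts)
	fmt.Printf("revisions=%d branches=%d\n", revisions, branches)
}
```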
### Phase 2: Sequential Thinking Wrapper
**New File**: `pkg/ai/sequential_thinking.go`
```go
package ai

import (
	"context"
	"errors"
)

// SequentialThinkingEngine wraps MCP sequential thinking tool
type SequentialThinkingEngine struct {
	mcpClient MCPClient // Interface to MCP tool
}

// NewSequentialThinkingEngine returns an engine backed by the given MCP client.
func NewSequentialThinkingEngine(mcpClient MCPClient) *SequentialThinkingEngine {
	return &SequentialThinkingEngine{mcpClient: mcpClient}
}

// ThinkingRequest represents input for sequential thinking
type ThinkingRequest struct {
	Problem        string
	Context        map[string]interface{}
	MaxThoughts    int
	AllowRevisions bool
	AllowBranching bool
}

// ThinkingResult represents output from sequential thinking
type ThinkingResult struct {
	Thoughts        []ThoughtStep
	FinalConclusion string
	Confidence      float32
	ReasoningPath   string // Markdown summary of thinking process
}

// Think executes sequential thinking process
func (e *SequentialThinkingEngine) Think(ctx context.Context, req *ThinkingRequest) (*ThinkingResult, error) {
	// Implementation:
	// 1. Initialize thinking with problem statement
	// 2. Iteratively call MCP tool until nextThoughtNeeded = false
	// 3. Track all thoughts, revisions, branches
	// 4. Generate final conclusion and reasoning summary
	// 5. Return structured result
	return nil, errors.New("sequential thinking: Think is not implemented yet")
}
```
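The five implementation steps listed in `Think` can be sketched as a bounded loop. This is a minimal sketch under stated assumptions: `ThoughtSource` is a hypothetical interface standing in for whatever combination of LLM call and MCP tool call yields the next thought, `Thought` is a trimmed stand-in for `ThoughtStep`, and revisions and branches are left out for brevity.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Thought is a trimmed-down stand-in for ThoughtStep.
type Thought struct {
	Number  int
	Content string
}

// ThoughtSource is a hypothetical interface (not part of the plan above).
// Next returns the next thought and whether another thought is needed.
type ThoughtSource interface {
	Next(ctx context.Context, problem string, history []Thought) (content string, more bool, err error)
}

// think collects thoughts until the source reports that no further thought
// is needed, the maxThoughts cap is reached, or the context is cancelled.
func think(ctx context.Context, src ThoughtSource, problem string, maxThoughts int) ([]Thought, error) {
	if maxThoughts <= 0 {
		return nil, errors.New("maxThoughts must be positive")
	}
	var history []Thought
	for n := 1; n <= maxThoughts; n++ {
		if err := ctx.Err(); err != nil {
			return history, err
		}
		content, more, err := src.Next(ctx, problem, history)
		if err != nil {
			return history, fmt.Errorf("thought %d: %w", n, err)
		}
		history = append(history, Thought{Number: n, Content: content})
		if !more {
			break
		}
	}
	return history, nil
}

// scripted is a fake source that replays fixed thoughts, for demonstration.
type scripted []string

func (s scripted) Next(_ context.Context, _ string, history []Thought) (string, bool, error) {
	i := len(history)
	if i >= len(s) {
		return "", false, errors.New("script exhausted")
	}
	return s[i], i < len(s)-1, nil
}

// demo runs a fixed three-step script under the given cap.
func demo(maxThoughts int) ([]Thought, error) {
	src := scripted{"decompose the task", "compare two designs", "pick a design"}
	return think(context.Background(), src, "design a cache", maxThoughts)
}

func main() {
	thoughts, err := demo(10)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range thoughts {
		fmt.Printf("Thought %d: %s\n", t.Number, t.Content)
	}
}
```

The hard cap matters because the loop otherwise depends entirely on the source eventually reporting that no further thought is needed.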
### Phase 3: Provider Integration
**Modified File**: `pkg/ai/resetdata.go`
```go
// ExecuteTask with sequential thinking
func (p *ResetDataProvider) ExecuteTask(ctx context.Context, request *TaskRequest) (*TaskResponse, error) {
	startTime := time.Now()

	// Determine if task requires sequential thinking
	useSequentialThinking := p.shouldUseSequentialThinking(request)

	var structuredReasoning *StructuredReasoning
	var enhancedPrompt string

	if useSequentialThinking {
		// Use sequential thinking engine to analyze task first
		thinkingEngine := NewSequentialThinkingEngine(p.mcpClient)
		thinkingResult, err := thinkingEngine.Think(ctx, &ThinkingRequest{
			Problem:        p.formatTaskAsProblem(request),
			Context:        request.Context,
			MaxThoughts:    10,
			AllowRevisions: true,
			AllowBranching: true,
		})
		if err != nil {
			// Fall back to direct execution if thinking fails
			log.Warn().Err(err).Msg("Sequential thinking failed, falling back to direct execution")
		} else {
			// Use thinking result to enhance prompt
			enhancedPrompt = p.buildPromptWithThinking(request, thinkingResult)
			structuredReasoning = convertToStructuredReasoning(thinkingResult)
		}
	}

	// Execute with enhanced prompt (if available) or standard prompt.
	// The second return value is assumed to be an error and is no longer discarded.
	messages, err := p.buildChatMessages(request, enhancedPrompt)
	if err != nil {
		return nil, fmt.Errorf("build chat messages: %w", err)
	}

	// ... rest of execution (uses messages and startTime) ...

	return &TaskResponse{
		Success:             true,
		Response:            responseText,
		Reasoning:           legacyReasoningString,
		StructuredReasoning: structuredReasoning, // NEW
		// ... rest of response ...
	}, nil
}

// shouldUseSequentialThinking determines if task warrants sequential thinking
func (p *ResetDataProvider) shouldUseSequentialThinking(request *TaskRequest) bool {
	// Use sequential thinking for:
	// - High complexity tasks (complexity >= 7)
	// - Architect and senior-developer roles (require system design)
	// - Tasks whose title or description contains a planning keyword
	//   such as "design" or "architecture"
	if request.Complexity >= 7 {
		return true
	}

	role := strings.ToLower(request.AgentRole)
	if role == "architect" || role == "senior-developer" {
		return true
	}

	keywords := []string{"design", "architecture", "refactor", "plan", "strategy"}
	taskText := strings.ToLower(request.TaskTitle + " " + request.TaskDescription)
	for _, keyword := range keywords {
		if strings.Contains(taskText, keyword) {
			return true
		}
	}

	return false
}
```
---
## Implementation Phases
### Phase 1: Foundation (Days 1-2)
**Tasks**:
1. ✅ Define `StructuredReasoning` and `ThoughtStep` types
2. ✅ Add `StructuredReasoning` field to `TaskResponse`
3. ✅ Create `SequentialThinkingEngine` skeleton
4. ✅ Add MCP client interface for sequential-thinking tool
**Files to Create/Modify**:
- `pkg/ai/provider.go` - Add new types
- `pkg/ai/sequential_thinking.go` - New file
- `pkg/ai/mcp_client.go` - New file for MCP integration
**Success Criteria**:
- Code compiles without errors
- Types are properly defined
- MCP client interface is clear
### Phase 2: Sequential Thinking Engine (Days 3-5)
**Tasks**:
1. Implement `SequentialThinkingEngine.Think()` method
2. Implement MCP tool call wrapper
3. Add thought tracking and revision detection
4. Implement branch management
5. Generate reasoning summaries
6. Write unit tests
**Files**:
- `pkg/ai/sequential_thinking.go` - Full implementation
- `pkg/ai/sequential_thinking_test.go` - Unit tests
**Success Criteria**:
- Can execute complete thinking cycles
- Properly tracks revisions and branches
- Generates clear reasoning summaries
- All unit tests pass
### Phase 3: Provider Integration (Days 6-8)
**Tasks**:
1. Modify `ResetDataProvider.ExecuteTask()` for sequential thinking
2. Implement `shouldUseSequentialThinking()` heuristics
3. Add prompt enhancement with thinking results
4. Implement fallback for thinking failures
5. Add configuration options
6. Write integration tests
**Files**:
- `pkg/ai/resetdata.go` - Modify ExecuteTask
- `pkg/ai/ollama.go` - Same modifications
- `config/agent.yaml` - Add sequential thinking config
**Success Criteria**:
- Complex tasks trigger sequential thinking
- Thinking results enhance task execution
- Graceful fallback on failures
- Integration tests pass
### Phase 4: Testing & Validation (Days 9-10)
**Tasks**:
1. End-to-end testing with real councils
2. Test with various complexity levels
3. Validate reasoning quality improvements
4. Performance benchmarking
5. Documentation updates
**Test Cases**:
- Simple task (complexity=3) → No sequential thinking
- Complex task (complexity=8) → Sequential thinking enabled
- Architect role → Always uses sequential thinking
- Design task → Sequential thinking with branching
- Fallback scenario → Graceful degradation
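The first four test cases above could be expressed as a table against a standalone copy of the heuristic. This is a sketch: the `TaskRequest` here contains only the four fields the heuristic reads, and the `developer` role and task titles are made-up examples.

```go
package main

import (
	"fmt"
	"strings"
)

// TaskRequest holds only the fields the heuristic reads.
type TaskRequest struct {
	Complexity      int
	AgentRole       string
	TaskTitle       string
	TaskDescription string
}

// shouldUseSequentialThinking is a standalone copy of the Phase 3 heuristic.
func shouldUseSequentialThinking(r TaskRequest) bool {
	if r.Complexity >= 7 {
		return true
	}
	role := strings.ToLower(r.AgentRole)
	if role == "architect" || role == "senior-developer" {
		return true
	}
	text := strings.ToLower(r.TaskTitle + " " + r.TaskDescription)
	for _, kw := range []string{"design", "architecture", "refactor", "plan", "strategy"} {
		if strings.Contains(text, kw) {
			return true
		}
	}
	return false
}

func main() {
	cases := []struct {
		name string
		req  TaskRequest
	}{
		{"simple task", TaskRequest{Complexity: 3, AgentRole: "developer", TaskTitle: "Fix typo in README"}},
		{"complex task", TaskRequest{Complexity: 8, AgentRole: "developer", TaskTitle: "Fix flaky test"}},
		{"architect role", TaskRequest{Complexity: 2, AgentRole: "Architect", TaskTitle: "Review PR"}},
		{"design task", TaskRequest{Complexity: 3, AgentRole: "developer", TaskTitle: "Design login API"}},
		{"substring match", TaskRequest{Complexity: 1, AgentRole: "developer", TaskTitle: "Update explanation text"}},
	}
	for _, c := range cases {
		fmt.Printf("%-16s -> %v\n", c.name, shouldUseSequentialThinking(c.req))
	}
}
```

The last case triggers sequential thinking only because "explanation" contains the substring "plan", which suggests matching on whole words if such false positives become costly.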
**Success Criteria**:
- Demonstrable improvement in task quality
- Acceptable performance overhead (<30% increase in latency)
- Clear reasoning traces in artifacts
- Documentation complete
---
## Configuration
### Agent Configuration
**File**: `config/agent.yaml`
```yaml
ai_providers:
  resetdata:
    type: "resetdata"
    endpoint: "${RESETDATA_API_ENDPOINT}"
    api_key: "${RESETDATA_API_KEY}"
    default_model: "llama3.1:70b"
    # Sequential thinking configuration
    enable_sequential_thinking: true
    sequential_thinking:
      min_complexity: 7         # Minimum complexity to trigger
      force_for_roles:          # Always use for these roles
        - architect
        - senior-developer
      max_thoughts: 15          # Maximum thinking iterations
      enable_revisions: true    # Allow thought revisions
      enable_branching: true    # Allow exploring alternatives
      confidence_threshold: 0.7 # Minimum confidence for final answer
```
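On the Go side, the `sequential_thinking` block could map to a struct with defaults and a validation step. The sketch below assumes a YAML decoder that honours `yaml` struct tags. `SequentialThinkingConfig`, `DefaultSequentialThinkingConfig`, and `Validate` are hypothetical names, and the validation bounds are assumptions rather than documented limits.

```go
package main

import (
	"errors"
	"fmt"
)

// SequentialThinkingConfig mirrors the sequential_thinking keys shown above.
type SequentialThinkingConfig struct {
	MinComplexity       int      `yaml:"min_complexity"`
	ForceForRoles       []string `yaml:"force_for_roles"`
	MaxThoughts         int      `yaml:"max_thoughts"`
	EnableRevisions     bool     `yaml:"enable_revisions"`
	EnableBranching     bool     `yaml:"enable_branching"`
	ConfidenceThreshold float64  `yaml:"confidence_threshold"`
}

// DefaultSequentialThinkingConfig returns the values from the example file.
func DefaultSequentialThinkingConfig() SequentialThinkingConfig {
	return SequentialThinkingConfig{
		MinComplexity:       7,
		ForceForRoles:       []string{"architect", "senior-developer"},
		MaxThoughts:         15,
		EnableRevisions:     true,
		EnableBranching:     true,
		ConfidenceThreshold: 0.7,
	}
}

// Validate rejects values that would make the thinking loop meaningless.
func (c SequentialThinkingConfig) Validate() error {
	if c.MinComplexity < 0 {
		return errors.New("min_complexity must not be negative")
	}
	if c.MaxThoughts < 1 {
		return errors.New("max_thoughts must be at least 1")
	}
	if c.ConfidenceThreshold < 0 || c.ConfidenceThreshold > 1 {
		return errors.New("confidence_threshold must be between 0.0 and 1.0")
	}
	return nil
}

func main() {
	cfg := DefaultSequentialThinkingConfig()
	fmt.Println("default config valid:", cfg.Validate() == nil)

	cfg.MaxThoughts = 0
	fmt.Println("zero max_thoughts:", cfg.Validate())
}
```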
### Runtime Toggle
Allow runtime control via the council brief. An explicit `use_sequential_thinking` value overrides the heuristics:
```json
{
  "task_id": "task-123",
  "complexity": 8,
  "use_sequential_thinking": true,
  "thinking_config": {
    "max_thoughts": 20,
    "allow_branching": true
  }
}
```
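One way to apply the brief at runtime is sketched below, assuming an explicit `use_sequential_thinking` value takes precedence over the complexity threshold and that a missing value falls back to it. `CouncilBrief`, `ThinkingOverrides`, and `decide` are hypothetical names. A pointer distinguishes "not set" from an explicit `false`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ThinkingOverrides holds the optional per-task settings.
type ThinkingOverrides struct {
	MaxThoughts    int  `json:"max_thoughts,omitempty"`
	AllowBranching bool `json:"allow_branching,omitempty"`
}

// CouncilBrief models the fields of the brief shown above.
type CouncilBrief struct {
	TaskID                string             `json:"task_id"`
	Complexity            int                `json:"complexity"`
	UseSequentialThinking *bool              `json:"use_sequential_thinking,omitempty"` // nil means "no override"
	ThinkingConfig        *ThinkingOverrides `json:"thinking_config,omitempty"`
}

// decide returns whether to use sequential thinking and the thought limit.
func decide(raw []byte, minComplexity, defaultMaxThoughts int) (bool, int, error) {
	var brief CouncilBrief
	if err := json.Unmarshal(raw, &brief); err != nil {
		return false, 0, fmt.Errorf("decode council brief: %w", err)
	}

	// Heuristic default, replaced by the explicit override when present.
	use := brief.Complexity >= minComplexity
	if brief.UseSequentialThinking != nil {
		use = *brief.UseSequentialThinking
	}

	limit := defaultMaxThoughts
	if brief.ThinkingConfig != nil && brief.ThinkingConfig.MaxThoughts > 0 {
		limit = brief.ThinkingConfig.MaxThoughts
	}
	return use, limit, nil
}

func main() {
	raw := []byte(`{"task_id":"task-123","complexity":8,"use_sequential_thinking":true,"thinking_config":{"max_thoughts":20,"allow_branching":true}}`)
	use, limit, err := decide(raw, 7, 15)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("use=%v max_thoughts=%d\n", use, limit)
}
```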
---
## Benefits & Expected Improvements
### 1. Better Problem Decomposition
**Before**:
```
Agent: Here's my solution [immediately provides implementation]
```
**After**:
```
Thought 1: Breaking down the task into 3 main components...
Thought 2: Component A requires database schema changes...
Thought 3: Wait, revising thought 2 - migration strategy needs consideration...
Thought 4: Exploring alternative: event sourcing vs direct updates...
Thought 5: Event sourcing better for audit trail requirements...
Final: Implementation plan with 5 concrete steps...
```
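A trace in this shape could be rendered from the recorded thoughts, for example to fill `ThinkingResult.ReasoningPath`. The sketch below assumes a simple line-per-thought layout modelled on the example. `renderTrace` is a hypothetical helper, and `ThoughtStep` is a trimmed copy of the proposed type.

```go
package main

import (
	"fmt"
	"strings"
)

// ThoughtStep is a trimmed copy of the proposed type.
type ThoughtStep struct {
	Number         int
	Content        string
	IsRevision     bool
	RevisesThought int
}

// renderTrace formats thoughts as "Thought N: ..." lines followed by a
// "Final: ..." line. Revisions are annotated with the thought they revise.
func renderTrace(thoughts []ThoughtStep, conclusion string) string {
	var b strings.Builder
	for _, t := range thoughts {
		fmt.Fprintf(&b, "Thought %d: ", t.Number)
		if t.IsRevision {
			fmt.Fprintf(&b, "(revises thought %d) ", t.RevisesThought)
		}
		b.WriteString(t.Content)
		b.WriteByte('\n')
	}
	fmt.Fprintf(&b, "Final: %s\n", conclusion)
	return b.String()
}

func main() {
	fmt.Print(renderTrace([]ThoughtStep{
		{Number: 1, Content: "Breaking down the task into 3 main components"},
		{Number: 2, Content: "Component A requires database schema changes"},
		{Number: 3, Content: "Migration strategy needs consideration", IsRevision: true, RevisesThought: 2},
	}, "Implementation plan with 5 concrete steps"))
}
```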
### 2. Improved Architecture Decisions
Architect agents can:
- Explore multiple design alternatives
- Revise decisions based on discovered constraints
- Build and verify hypotheses about scalability
- Document reasoning trail for future reference
### 3. Higher Quality Code
Developer agents can:
- Think through edge cases before coding
- Consider multiple implementation approaches
- Revise initial assumptions
- Plan testing strategy upfront
### 4. Debugging Enhancement
When tasks fail:
- Reasoning traces show where the agent went wrong
- Can identify flawed assumptions
- Easier to improve prompts and heuristics
---
## Performance Considerations
### 1. Latency Impact
**Estimated Overhead**:
- Sequential thinking: 5-15 LLM calls (vs 1 direct call)
- Expected latency increase: 10-30 seconds for complex tasks
- **Mitigation**: Only use for high-complexity tasks (complexity >= 7)
### 2. Token Usage
**Estimated Increase**:
- Each thought: ~200-500 tokens
- 10 thoughts: ~2000-5000 additional tokens
- **Mitigation**: Set reasonable `max_thoughts` limits
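The per-thought figure above can be turned into a rough budget for a given `max_thoughts` setting. This is a back-of-the-envelope sketch: `estimateExtraTokens` is a hypothetical helper, and it ignores prompt overhead and any earlier thoughts re-sent as context on later calls, so it likely underestimates real usage.

```go
package main

import "fmt"

// estimateExtraTokens multiplies the thought count by the per-thought range.
func estimateExtraTokens(thoughts, minPerThought, maxPerThought int) (low, high int) {
	return thoughts * minPerThought, thoughts * maxPerThought
}

func main() {
	// max_thoughts: 15 is the value used in the example agent configuration.
	low, high := estimateExtraTokens(15, 200, 500)
	fmt.Printf("max_thoughts=15: %d-%d additional tokens\n", low, high)
}
```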
### 3. Resource Requirements
**MCP Server**:
- Sequential thinking MCP server must be available
- Requires proper error handling and fallback
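The fallback requirement can be combined with a deadline so that a slow or unreachable MCP server cannot stall a task. The sketch below uses `context.WithTimeout` from the standard library. `ThinkFunc` and `thinkOrFallback` are hypothetical names, with `ThinkFunc` standing in for `SequentialThinkingEngine.Think`, and the timeout values are arbitrary.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// ThinkFunc is a hypothetical stand-in for SequentialThinkingEngine.Think.
type ThinkFunc func(ctx context.Context) (string, error)

// thinkOrFallback runs think under a deadline. The boolean is false when the
// caller should fall back to direct execution without a thinking phase.
func thinkOrFallback(ctx context.Context, timeout time.Duration, think ThinkFunc) (string, bool) {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	result, err := think(ctx)
	if err != nil {
		return "", false
	}
	return result, true
}

// Three fake thinking functions for demonstration.
func unavailable(context.Context) (string, error) {
	return "", errors.New("MCP server unavailable")
}

func slow(ctx context.Context) (string, error) {
	select {
	case <-time.After(2 * time.Second):
		return "late plan", nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

func quick(context.Context) (string, error) {
	return "plan with 5 steps", nil
}

// runDemo reports whether each fake produced a usable thinking result.
func runDemo() (unavailableOK, slowOK, quickOK bool) {
	ctx := context.Background()
	_, unavailableOK = thinkOrFallback(ctx, 50*time.Millisecond, unavailable)
	_, slowOK = thinkOrFallback(ctx, 50*time.Millisecond, slow)
	_, quickOK = thinkOrFallback(ctx, 50*time.Millisecond, quick)
	return
}

func main() {
	u, s, q := runDemo()
	fmt.Printf("unavailable=%v slow=%v quick=%v\n", u, s, q)
}
```

Only the `quick` case yields a thinking result; the other two signal the caller to proceed with the standard prompt.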
---
## Risks & Mitigations
| Risk | Impact | Mitigation |
|------|--------|------------|
| MCP server unavailable | High | Graceful fallback to direct execution |
| Increased latency unacceptable | Medium | Make sequential thinking opt-in per task |
| Token cost explosion | Medium | Set hard limits on max_thoughts |
| Reasoning doesn't improve quality | High | A/B testing with metrics |
| Complex implementation | Medium | Phased rollout with testing |
---
## Success Metrics
### Quantitative
1. **Task Success Rate**: Compare before/after for complexity >= 7 tasks
- Target: +15% improvement
2. **Code Quality**: Static analysis scores for generated code
- Target: +20% improvement in complexity score
3. **PR Acceptance Rate**: How many agent PRs get merged
- Target: +25% improvement
4. **Latency**: Task execution time
- Acceptable: <30% increase for complex tasks
### Qualitative
1. **Reasoning Quality**: Human review of reasoning traces
2. **Decision Clarity**: Can humans understand the agent's thought process?
3. **Developer Feedback**: Easier to debug failed tasks?
---
## Rollout Plan
### Stage 1: Internal Testing (Week 1)
- Deploy to development environment
- Test with synthetic tasks
- Gather performance metrics
- Refine heuristics
### Stage 2: Limited Production (Week 2)
- Enable for architect role only
- Enable for complexity >= 9 only
- Monitor closely
- Collect feedback
### Stage 3: Expanded Rollout (Week 3-4)
- Enable for all roles with complexity >= 7
- Add complexity-based opt-in
- Full production deployment
- Continuous monitoring
### Stage 4: Optimization (Week 5+)
- Fine-tune heuristics based on data
- Optimize thought limits
- Improve reasoning summaries
- Add advanced features (e.g., multi-agent reasoning)
---
## Future Enhancements
### 1. Multi-Agent Reasoning
Multiple agents can contribute thoughts to the same reasoning chain:
- Architect proposes design
- Security agent reviews security implications
- Performance agent analyzes scalability
### 2. Reasoning Templates
Pre-defined thinking patterns for common scenarios:
- API design checklist
- Security review framework
- Performance optimization workflow
### 3. Learning from Reasoning
Store successful reasoning patterns:
- Build knowledge base of good reasoning traces
- Use as examples in future tasks
- Identify common pitfalls
### 4. Visualization
Dashboard showing reasoning graphs:
- Thought flow diagrams
- Revision history
- Branch exploration trees
- Confidence evolution
---
## References
- **SequentialThinkingForCHORUS Issue**: (repository hosted in Gitea)
- **MCP Sequential Thinking Tool**: Available in Claude Code MCP servers
- **CHORUS Task Execution**: `/home/tony/chorus/project-queues/active/CHORUS/pkg/execution/engine.go`
- **AI Provider Interface**: `/home/tony/chorus/project-queues/active/CHORUS/pkg/ai/provider.go`
- **ResetData Provider**: `/home/tony/chorus/project-queues/active/CHORUS/pkg/ai/resetdata.go`
---
## Document Info
- **Created**: 2025-10-13
- **Author**: Claude Code
- **Status**: Design Complete - Ready for Implementation
- **Next Steps**: Begin Phase 1 implementation