# HCFS Architecture

## Overview

The Hierarchical Context File System (HCFS) is designed as a layered architecture that bridges filesystem navigation with semantic context storage and retrieval.

## System Components

### 1. Virtual Filesystem Layer (`src/hcfs/filesystem/`)

The virtual filesystem presents a standard POSIX-like directory structure backed by context blobs rather than traditional files.

**Key Components:**
- **HCFSFilesystem**: Main filesystem interface
- **HCFSFuseOperations**: FUSE-based filesystem operations (readdir, getattr, etc.)

**Responsibilities:**
- Present hierarchical path structure to agents
- Map filesystem operations to context queries
- Handle path-based navigation (`cd`, `ls`, etc.)
- Maintain current context scope per session

### 2. Storage Backend (`src/hcfs/storage/`)

The storage layer manages persistent context blob storage with versioning and metadata.

**Key Components:**
- **ContextStorage**: Abstract storage interface
- **SQLiteBackend**: SQLite-based implementation
- **StoredContextBlob**: Storage data models
- **ContextMetadata**: Metadata and versioning

**Responsibilities:**
- Persist context blobs with versioning
- Store path-to-context mappings
- Manage hierarchical inheritance relationships
- Provide ACID guarantees for context operations

### 3. Indexing & Semantic Search (`src/hcfs/indexing/`)

The indexing layer provides semantic search capabilities over context blobs.

**Key Components:**
- **EmbeddingEngine**: Generate embeddings for context content
- **SemanticSearch**: Vector similarity search
- **HybridRanker**: Combines BM25 + embedding scores

**Responsibilities:**
- Generate and store embeddings for context blobs
- Provide semantic similarity search
- Rank results by relevance (hybrid BM25 + vector)
- Support context folding and summarization

### 4. Agent API (`src/hcfs/api/`)

The API layer exposes syscall-style functions for agent interaction.

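
To illustrate what "syscall-style" could look like in practice, the following is a minimal in-memory sketch of a session that tracks a current path and answers `context_cd`, `context_pwd`, and `context_get`. The `InMemorySession` class and its dict-backed store are hypothetical stand-ins, not the HCFS implementation. Contexts are plain strings rather than `ContextBlob` objects, and the reading of `depth` (the current path plus `depth - 1` ancestors, nearest first) is an assumption based on the Context Retrieval example in this document.

```python
import posixpath
from typing import Dict, List


class InMemorySession:
    """Hypothetical stand-in for a per-session ContextAPI (illustration only)."""

    def __init__(self, store: Dict[str, List[str]]) -> None:
        # Maps a normalized absolute path to the context strings stored there.
        self._store = store
        self._cwd = "/"

    @staticmethod
    def _normalize(path: str) -> str:
        # Collapse ".", "..", and trailing slashes into an absolute path.
        return posixpath.normpath(posixpath.join("/", path))

    def context_cd(self, path: str) -> bool:
        # Relative paths are resolved against the current scope.
        target = self._normalize(posixpath.join(self._cwd, path))
        if target != "/" and target not in self._store:
            return False  # Unknown path: the current scope stays unchanged.
        self._cwd = target
        return True

    def context_pwd(self) -> str:
        return self._cwd

    def context_get(self, depth: int = 1) -> List[str]:
        # Assumption: depth=1 returns the current path only; each additional
        # level includes one more ancestor, nearest first.
        results: List[str] = []
        path = self._cwd
        for _ in range(depth):
            results.extend(self._store.get(path, []))
            if path == "/":
                break
            path = posixpath.dirname(path)
        return results
```

In this sketch, a session that has changed into `/project/src/` and calls `context_get(depth=2)` receives the context stored at `/project/src` followed by the context stored at `/project`. A real implementation would also need ranking, permissions, and persistence, all of which are omitted here.
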
**Key Components:**
- **ContextAPI**: Main agent-facing API
- **ContextBlob**: Context data models
- **ContextPath**: Path representation
- **ContextQuery**: Query models

**Core API Functions:**
```python
# Navigation
context_cd(path: str) -> bool
context_pwd() -> str

# Retrieval
context_get(depth: int = 1) -> List[ContextBlob]
context_list(path: str = None) -> List[str]

# Manipulation
context_push(path: str, blob: ContextBlob) -> str
context_delete(path: str, blob_id: str) -> bool

# Subscription
context_subscribe(path: str, callback: Callable) -> str
context_unsubscribe(subscription_id: str) -> bool
```

### 5. Utilities (`src/hcfs/utils/`)

Common utilities and configuration management.

**Key Components:**
- **HCFSConfig**: Configuration management
- **path_utils**: Path manipulation utilities
- **logging**: Structured logging

## Data Flow

```
Agent → ContextAPI → HCFSFilesystem → ContextStorage
            ↓                               ↓
     SemanticSearch        ←         EmbeddingEngine
```

### Example: Context Retrieval

1. Agent calls `context_cd("/project/src/")`
2. ContextAPI validates path and sets current scope
3. HCFSFilesystem updates virtual directory state
4. Agent calls `context_get(depth=2)`
5. ContextAPI queries ContextStorage for context at `/project/src/` and `/project/`
6. SemanticSearch ranks and filters results
7. Merged context returned to agent

### Example: Context Publishing

1. Agent calls `context_push("/project/src/module.py", blob)`
2. ContextAPI validates blob and path
3. EmbeddingEngine generates embeddings for blob content
4. ContextStorage persists blob with versioning
5. Subscription notifications sent to interested agents

## Hierarchical Inheritance

Context blobs inherit from parent paths using configurable strategies:

- **Append**: Child context appends to parent context
- **Override**: Child context overrides parent context
- **Merge**: Intelligent merging based on content type
- **Isolate**: No inheritance, child context standalone

## Concurrency & Consistency

- **Read Scalability**: Multiple agents can read simultaneously
- **Write Coordination**: Optimistic locking with conflict resolution
- **Versioning**: All context changes create new versions
- **Subscription**: Pub/sub notifications for context changes

## Performance Considerations

- **Caching**: LRU cache for frequently accessed contexts
- **Lazy Loading**: Context blobs loaded on-demand
- **Batch Operations**: Bulk context operations for efficiency
- **Index Optimization**: Separate indices for path, content, and metadata queries

## Security Model

- **Path Permissions**: ACL-based access control per path
- **Agent Authentication**: Token-based agent identification
- **Content Validation**: Schema validation for context blobs
- **Audit Logging**: All context operations logged for accountability
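
As a closing illustration of the strategies listed under Hierarchical Inheritance, here is a minimal sketch of how a resolver might combine parent and child context. The `Strategy` enum and `resolve` function are hypothetical names, and contexts are modeled as plain lists of strings. The Merge branch is only a placeholder: it uses an order-preserving union, which is one possible reading of "intelligent merging", not a description of what HCFS does.

```python
from enum import Enum
from typing import List


class Strategy(Enum):
    """Hypothetical names for the four inheritance strategies."""

    APPEND = "append"
    OVERRIDE = "override"
    MERGE = "merge"
    ISOLATE = "isolate"


def resolve(parent: List[str], child: List[str], strategy: Strategy) -> List[str]:
    """Combine parent and child context entries under the given strategy."""
    if strategy is Strategy.APPEND:
        # Parent entries first, then the child's entries.
        return parent + child
    if strategy is Strategy.OVERRIDE:
        # Assumption: the child replaces the parent only if it has content.
        return list(child) if child else list(parent)
    if strategy is Strategy.MERGE:
        # Placeholder for content-aware merging: order-preserving union.
        return list(dict.fromkeys(parent + child))
    if strategy is Strategy.ISOLATE:
        # The parent is ignored entirely.
        return list(child)
    raise ValueError(f"Unknown strategy: {strategy!r}")
```

For a path several levels deep, a resolver along these lines could be applied repeatedly from the root downward, feeding each level's result in as the `parent` of the next.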