HCFS Architecture
Overview
The Hierarchical Context File System (HCFS) is designed as a layered architecture that bridges filesystem navigation with semantic context storage and retrieval.
System Components
1. Virtual Filesystem Layer (src/hcfs/filesystem/)
The virtual filesystem presents a standard POSIX-like directory structure backed by context blobs rather than traditional files.
Key Components:
- HCFSFilesystem: Main filesystem interface
- HCFSFuseOperations: FUSE-based filesystem operations (readdir, getattr, etc.)
Responsibilities:
- Present hierarchical path structure to agents
- Map filesystem operations to context queries
- Handle path-based navigation (cd, ls, etc.)
- Maintain current context scope per session
2. Storage Backend (src/hcfs/storage/)
The storage layer manages persistent context blob storage with versioning and metadata.
Key Components:
- ContextStorage: Abstract storage interface
- SQLiteBackend: SQLite-based implementation
- StoredContextBlob: Storage data models
- ContextMetadata: Metadata and versioning
Responsibilities:
- Persist context blobs with versioning
- Store path-to-context mappings
- Manage hierarchical inheritance relationships
- Provide ACID guarantees for context operations
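As a rough illustration of how versioned persistence could look on SQLite, the sketch below stores one row per blob version and wraps each write in a transaction. The table layout and the function names (open_store, push_version, latest) are assumptions made for this example, not the actual SQLiteBackend schema or interface.

```python
import sqlite3
from typing import Optional, Tuple

# Hypothetical schema: one row per blob version, keyed by (path, version).
SCHEMA = """
CREATE TABLE IF NOT EXISTS context_blobs (
    path    TEXT    NOT NULL,
    version INTEGER NOT NULL,
    content TEXT    NOT NULL,
    PRIMARY KEY (path, version)
)
"""


def open_store(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store and make sure the table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    return conn


def push_version(conn: sqlite3.Connection, path: str, content: str) -> int:
    """Append a new version for `path` and return its version number."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT COALESCE(MAX(version), 0) FROM context_blobs WHERE path = ?",
            (path,),
        ).fetchone()
        version = row[0] + 1
        conn.execute(
            "INSERT INTO context_blobs (path, version, content) VALUES (?, ?, ?)",
            (path, version, content),
        )
    return version


def latest(conn: sqlite3.Connection, path: str) -> Optional[Tuple[int, str]]:
    """Return (version, content) of the newest blob at `path`, or None."""
    return conn.execute(
        "SELECT version, content FROM context_blobs "
        "WHERE path = ? ORDER BY version DESC LIMIT 1",
        (path,),
    ).fetchone()
```

Because old rows are never updated in place, every earlier version stays readable, and the primary key rejects two writers that try to claim the same version number.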
3. Indexing & Semantic Search (src/hcfs/indexing/)
The indexing layer provides semantic search capabilities over context blobs.
Key Components:
- EmbeddingEngine: Generate embeddings for context content
- SemanticSearch: Vector similarity search
- HybridRanker: Combines BM25 + embedding scores
Responsibilities:
- Generate and store embeddings for context blobs
- Provide semantic similarity search
- Rank results by relevance (hybrid BM25 + vector)
- Support context folding and summarization
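One plausible way to combine the two signals is a weighted sum of normalized scores. The sketch below assumes BM25 and embedding-similarity scores have already been computed per blob ID; the min-max normalization, the alpha weight, and the function name hybrid_rank are illustrative choices, not the HybridRanker's confirmed behavior.

```python
from typing import Dict, List, Tuple


def _min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so the two signals are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def hybrid_rank(
    bm25: Dict[str, float],
    vector: Dict[str, float],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Rank blob IDs by alpha * BM25 + (1 - alpha) * vector similarity.

    A blob missing from one score set contributes 0 for that signal.
    """
    b, v = _min_max(bm25), _min_max(vector)
    combined = {
        blob_id: alpha * b.get(blob_id, 0.0) + (1 - alpha) * v.get(blob_id, 0.0)
        for blob_id in set(b) | set(v)
    }
    # Highest score first; ties broken by blob ID for a stable order.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))
```

Setting alpha to 1.0 reduces this to pure keyword ranking and 0.0 to pure vector ranking, which makes the weight easy to tune per workload.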
4. Agent API (src/hcfs/api/)
The API layer exposes syscall-style functions for agent interaction.
Key Components:
- ContextAPI: Main agent-facing API
- ContextBlob: Context data models
- ContextPath: Path representation
- ContextQuery: Query models
Core API Functions:
# Navigation
context_cd(path: str) -> bool
context_pwd() -> str
# Retrieval
context_get(depth: int = 1) -> List[ContextBlob]
context_list(path: Optional[str] = None) -> List[str]
# Manipulation
context_push(path: str, blob: ContextBlob) -> str
context_delete(path: str, blob_id: str) -> bool
# Subscription
context_subscribe(path: str, callback: Callable) -> str
context_unsubscribe(subscription_id: str) -> bool
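To show how the navigation and manipulation calls might fit together, here is a toy in-memory stand-in for the API. The class name InMemoryContextAPI, the ContextBlob fields, and the rule that pushing to a path also creates its parent directories are all assumptions for illustration; retrieval and subscription are left out.

```python
import posixpath
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ContextBlob:
    # Hypothetical fields; the real data model may differ.
    content: str
    blob_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class InMemoryContextAPI:
    """Toy stand-in that keeps blobs in a dict keyed by normalized path."""

    def __init__(self) -> None:
        self._cwd = "/"
        self._blobs: Dict[str, List[ContextBlob]] = {"/": []}

    def _resolve(self, path: str) -> str:
        # Relative paths are resolved against the current scope.
        return posixpath.normpath(posixpath.join(self._cwd, path))

    def context_cd(self, path: str) -> bool:
        target = self._resolve(path)
        if target not in self._blobs:
            return False
        self._cwd = target
        return True

    def context_pwd(self) -> str:
        return self._cwd

    def context_push(self, path: str, blob: ContextBlob) -> str:
        target = self._resolve(path)
        # Register every ancestor so that cd and list can see them.
        current = target
        while True:
            self._blobs.setdefault(current, [])
            if current == "/":
                break
            current = posixpath.dirname(current)
        self._blobs[target].append(blob)
        return blob.blob_id

    def context_list(self, path: Optional[str] = None) -> List[str]:
        base = self._resolve(path or ".")
        return sorted(
            p for p in self._blobs
            if p != base and posixpath.dirname(p) == base
        )
```

A session would then look like: push a blob to /project/src, call context_cd("/project/src/"), and read the scope back with context_pwd().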
5. Utilities (src/hcfs/utils/)
Common utilities and configuration management.
Key Components:
- HCFSConfig: Configuration management
- path_utils: Path manipulation utilities
- logging: Structured logging
Data Flow
Agent → ContextAPI → HCFSFilesystem → ContextStorage
↓ ↓
SemanticSearch ← EmbeddingEngine
Example: Context Retrieval
- Agent calls context_cd("/project/src/")
- ContextAPI validates path and sets current scope
- HCFSFilesystem updates virtual directory state
- Agent calls context_get(depth=2)
- ContextAPI queries ContextStorage for context at /project/src/ and /project/
- SemanticSearch ranks and filters results
- Merged context returned to agent
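The depth parameter in the retrieval flow above can be read as "the current path plus depth - 1 ancestors". A minimal sketch of that path expansion follows; scope_paths is a hypothetical helper name, and stopping at the root is an assumed behavior.

```python
import posixpath
from typing import List


def scope_paths(path: str, depth: int = 1) -> List[str]:
    """Return `path` followed by its ancestors, `depth` entries at most."""
    current = posixpath.normpath(path)
    paths: List[str] = []
    for _ in range(max(depth, 0)):
        paths.append(current)
        if current == "/":
            break  # nothing above the root
        current = posixpath.dirname(current)
    return paths
```

For example, scope_paths("/project/src/", 2) yields the two paths queried in the flow above, /project/src and /project, in nearest-first order.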
Example: Context Publishing
- Agent calls context_push("/project/src/module.py", blob)
- ContextAPI validates blob and path
- EmbeddingEngine generates embeddings for blob content
- ContextStorage persists blob with versioning
- Subscription notifications sent to interested agents
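The notification step at the end of the publishing flow could be handled by a small registry like the one sketched here. It assumes that a subscription on a path also covers everything beneath it, which the design does not state explicitly; SubscriptionRegistry and its method names are hypothetical.

```python
import uuid
from typing import Callable, Dict, Tuple


class SubscriptionRegistry:
    """Maps subscription IDs to (path prefix, callback) pairs."""

    def __init__(self) -> None:
        self._subs: Dict[str, Tuple[str, Callable[[str], None]]] = {}

    def subscribe(self, path: str, callback: Callable[[str], None]) -> str:
        sub_id = uuid.uuid4().hex
        # Normalize "/project/" to "/project"; keep "/" as the root.
        self._subs[sub_id] = (path.rstrip("/") or "/", callback)
        return sub_id

    def unsubscribe(self, sub_id: str) -> bool:
        return self._subs.pop(sub_id, None) is not None

    def notify(self, changed_path: str) -> int:
        """Invoke matching callbacks; return how many were called."""
        delivered = 0
        for prefix, callback in list(self._subs.values()):
            matches = (
                prefix == "/"
                or changed_path == prefix
                or changed_path.startswith(prefix + "/")
            )
            if matches:
                callback(changed_path)
                delivered += 1
        return delivered
```

Matching on prefix plus a trailing slash keeps a subscriber on /project from being notified about an unrelated sibling such as /projectx.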
Hierarchical Inheritance
Context blobs inherit from parent paths using configurable strategies:
- Append: Child context appends to parent context
- Override: Child context overrides parent context
- Merge: Intelligent merging based on content type
- Isolate: No inheritance, child context standalone
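The four strategies can be sketched as a single dispatch function over parent and child context lists. This is one reading of the list above: Override is taken to fall back to the parent when the child has no context, and Merge is reduced to an order-preserving de-duplicated union as a placeholder for the content-type-aware merging the design describes. The names Inheritance and resolve are hypothetical.

```python
from enum import Enum
from typing import List


class Inheritance(Enum):
    APPEND = "append"
    OVERRIDE = "override"
    MERGE = "merge"
    ISOLATE = "isolate"


def resolve(parent: List[str], child: List[str], strategy: Inheritance) -> List[str]:
    """Combine parent and child context according to `strategy`."""
    if strategy is Inheritance.APPEND:
        # Parent context first, child context after it.
        return parent + child
    if strategy is Inheritance.OVERRIDE:
        # Assumed semantics: child wins when present, else inherit parent.
        return list(child) if child else list(parent)
    if strategy is Inheritance.ISOLATE:
        # Child stands alone, even when it is empty.
        return list(child)
    if strategy is Inheritance.MERGE:
        # Placeholder merge: union that keeps first-seen order.
        return list(dict.fromkeys(parent + child))
    raise ValueError(f"unknown strategy: {strategy!r}")
```

Under this reading, the practical difference between Override and Isolate shows up only for an empty child: Override still returns the parent's context, while Isolate returns nothing.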
Concurrency & Consistency
- Read Scalability: Multiple agents can read simultaneously
- Write Coordination: Optimistic locking with conflict resolution
- Versioning: All context changes create new versions
- Subscription: Pub/sub notifications for context changes
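Optimistic locking is commonly implemented as a compare-and-swap on a version number: a writer states which version it read, and the write is rejected if the stored version has moved on. The sketch below shows that pattern in memory; VersionedStore and VersionConflict are hypothetical names, and conflict resolution is left to the caller (for example, re-read and retry).

```python
import threading
from typing import Dict, Tuple


class VersionConflict(Exception):
    """Raised when a write is based on a stale version."""


class VersionedStore:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: Dict[str, Tuple[int, str]] = {}

    def read(self, path: str) -> Tuple[int, str]:
        """Return (version, content); version 0 means nothing stored yet."""
        with self._lock:
            return self._data.get(path, (0, ""))

    def write(self, path: str, content: str, expected_version: int) -> int:
        """Store `content` only if `path` is still at `expected_version`."""
        with self._lock:
            current, _ = self._data.get(path, (0, ""))
            if current != expected_version:
                raise VersionConflict(
                    f"{path}: expected v{expected_version}, found v{current}"
                )
            self._data[path] = (current + 1, content)
            return current + 1
```

The lock here only guards the short check-and-update step, so readers and writers never hold it while doing their own work between read and write.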
Performance Considerations
- Caching: LRU cache for frequently accessed contexts
- Lazy Loading: Context blobs loaded on-demand
- Batch Operations: Bulk context operations for efficiency
- Index Optimization: Separate indices for path, content, and metadata queries
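Caching and lazy loading combine naturally: a blob is fetched from storage only on first access and then kept until it becomes the least recently used entry. The sketch below assumes a loader callable that fetches context for a path; LRUContextCache and the default capacity are illustrative.

```python
from collections import OrderedDict
from typing import Callable


class LRUContextCache:
    """Loads context on demand and evicts the least recently used entry."""

    def __init__(self, loader: Callable[[str], str], capacity: int = 128) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._loader = loader
        self._capacity = capacity
        self._entries: "OrderedDict[str, str]" = OrderedDict()

    def get(self, path: str) -> str:
        if path in self._entries:
            # Cache hit: mark as most recently used.
            self._entries.move_to_end(path)
            return self._entries[path]
        # Cache miss: load lazily, then evict the oldest entry if over capacity.
        value = self._loader(path)
        self._entries[path] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)
        return value
```

A real cache would also need invalidation when a new version is pushed to a path, which this sketch omits.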
Security Model
- Path Permissions: ACL-based access control per path
- Agent Authentication: Token-based agent identification
- Content Validation: Schema validation for context blobs
- Audit Logging: All context operations logged for accountability
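Per-path ACLs raise the question of what happens for paths with no entry of their own. One option, sketched below, is to let the nearest ancestor that has an ACL decide and to deny by default. The ACL shape (path to agent to allowed actions), the action names, and the function is_allowed are assumptions for this example.

```python
import posixpath
from typing import Dict, Set

# Hypothetical ACL shape: path -> agent ID -> set of allowed actions.
ACL = Dict[str, Dict[str, Set[str]]]


def is_allowed(acl: ACL, agent: str, path: str, action: str) -> bool:
    """Check `action` against the nearest ancestor of `path` with an ACL."""
    current = posixpath.normpath(path)
    while True:
        if current in acl:
            # The most specific entry wins; it does not fall through.
            return action in acl[current].get(agent, set())
        if current == "/":
            return False  # no ACL anywhere on the path: deny by default
        current = posixpath.dirname(current)
```

With an entry on /project granting an agent read access and a stricter entry on /project/secret, the agent can read /project/src/module.py but not anything under /project/secret.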