# HCFS Architecture

## Overview

The Hierarchical Context File System (HCFS) is designed as a layered architecture that bridges filesystem navigation with semantic context storage and retrieval.

## System Components

### 1. Virtual Filesystem Layer (`src/hcfs/filesystem/`)

The virtual filesystem presents a POSIX-like directory structure backed by context blobs rather than traditional files.

**Key Components:**
- **HCFSFilesystem**: Main filesystem interface
- **HCFSFuseOperations**: FUSE-based filesystem operations (`readdir`, `getattr`, etc.)

**Responsibilities:**
- Present hierarchical path structure to agents
- Map filesystem operations to context queries
- Handle path-based navigation (`cd`, `ls`, etc.)
- Maintain current context scope per session

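A minimal sketch of how path-based navigation and a per-session scope could fit together. The `SessionScope` class and its methods are hypothetical names used for illustration, not part of HCFS; only the `cd`/`pwd`/`ls` semantics come from the responsibilities listed above.

```python
import posixpath


class SessionScope:
    """Hypothetical per-session cursor over a virtual path hierarchy."""

    def __init__(self, known_paths):
        # Normalized set of paths that exist in the virtual filesystem.
        self._known = {posixpath.normpath(p) for p in known_paths} | {"/"}
        self._cwd = "/"

    def cd(self, path: str) -> bool:
        # Resolve relative paths against the current scope, then normalize.
        target = posixpath.normpath(posixpath.join(self._cwd, path))
        if target not in self._known:
            return False  # unknown path: the scope is left unchanged
        self._cwd = target
        return True

    def pwd(self) -> str:
        return self._cwd

    def ls(self):
        # Names of the direct children of the current scope.
        prefix = self._cwd.rstrip("/") + "/"
        return sorted(
            p[len(prefix):]
            for p in self._known
            if p != self._cwd
            and p.startswith(prefix)
            and "/" not in p[len(prefix):]
        )
```

Keeping the cursor on a per-session object, rather than in global state, is one way to let several agents navigate the same hierarchy independently.
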
### 2. Storage Backend (`src/hcfs/storage/`)

The storage layer manages persistent context blob storage with versioning and metadata.

**Key Components:**
- **ContextStorage**: Abstract storage interface
- **SQLiteBackend**: SQLite-based implementation
- **StoredContextBlob**: Storage data models
- **ContextMetadata**: Metadata and versioning

**Responsibilities:**
- Persist context blobs with versioning
- Store path-to-context mappings
- Manage hierarchical inheritance relationships
- Provide ACID guarantees for context operations

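To make the versioning and transactional responsibilities concrete, here is a small sketch using Python's standard `sqlite3` module. The table layout and the `push`/`latest` helpers are assumptions made for illustration; the actual `SQLiteBackend` schema is not specified in this document.

```python
import sqlite3

# Hypothetical schema: one row per (path, version) pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS context_blobs (
    id      INTEGER PRIMARY KEY AUTOINCREMENT,
    path    TEXT    NOT NULL,
    version INTEGER NOT NULL,
    content TEXT    NOT NULL,
    UNIQUE (path, version)
);
"""


def push(conn: sqlite3.Connection, path: str, content: str) -> int:
    """Store content as the next version for path; return the new version."""
    with conn:  # one transaction: commits on success, rolls back on error
        row = conn.execute(
            "SELECT COALESCE(MAX(version), 0) FROM context_blobs WHERE path = ?",
            (path,),
        ).fetchone()
        version = row[0] + 1
        conn.execute(
            "INSERT INTO context_blobs (path, version, content) VALUES (?, ?, ?)",
            (path, version, content),
        )
    return version


def latest(conn: sqlite3.Connection, path: str):
    """Return (version, content) for the newest blob at path, or None."""
    return conn.execute(
        "SELECT version, content FROM context_blobs "
        "WHERE path = ? ORDER BY version DESC LIMIT 1",
        (path,),
    ).fetchone()
```

After `conn = sqlite3.connect(":memory:")` and `conn.executescript(SCHEMA)`, two calls to `push` for the same path produce versions 1 and 2, and older versions stay in the table because rows are only ever inserted.
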
### 3. Indexing & Semantic Search (`src/hcfs/indexing/`)

The indexing layer provides semantic search capabilities over context blobs.

**Key Components:**
- **EmbeddingEngine**: Generate embeddings for context content
- **SemanticSearch**: Vector similarity search
- **HybridRanker**: Combines BM25 + embedding scores

**Responsibilities:**
- Generate and store embeddings for context blobs
- Provide semantic similarity search
- Rank results by relevance (hybrid BM25 + vector)
- Support context folding and summarization

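The document does not say how BM25 and embedding scores are combined. One common option is a weighted sum of min-max-normalized scores, sketched below. The function names, the `alpha` weight, and the assumption that BM25 scores are computed elsewhere are all illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def min_max(scores):
    """Rescale a non-empty {id: score} mapping to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 0.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}


def hybrid_rank(bm25_scores, query_vec, doc_vecs, alpha=0.5):
    """Rank ids by alpha * BM25 + (1 - alpha) * cosine, both min-max scaled.

    bm25_scores and doc_vecs must share the same ids.
    """
    lexical = min_max(bm25_scores)
    semantic = min_max({d: cosine(query_vec, v) for d, v in doc_vecs.items()})
    fused = {d: alpha * lexical[d] + (1 - alpha) * semantic[d] for d in bm25_scores}
    return sorted(fused, key=fused.get, reverse=True)
```

Normalizing before mixing matters because raw BM25 scores are unbounded while cosine similarity lies in [-1, 1]; without it, `alpha` would not behave as a meaningful weight.
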
### 4. Agent API (`src/hcfs/api/`)

The API layer exposes syscall-style functions for agent interaction.

**Key Components:**
- **ContextAPI**: Main agent-facing API
- **ContextBlob**: Context data models
- **ContextPath**: Path representation
- **ContextQuery**: Query models

**Core API Functions:**
```python
from __future__ import annotations  # lets the stubs reference ContextBlob lazily

from typing import Callable, List, Optional

# Navigation
def context_cd(path: str) -> bool: ...
def context_pwd() -> str: ...

# Retrieval
def context_get(depth: int = 1) -> List[ContextBlob]: ...
def context_list(path: Optional[str] = None) -> List[str]: ...

# Manipulation
def context_push(path: str, blob: ContextBlob) -> str: ...
def context_delete(path: str, blob_id: str) -> bool: ...

# Subscription
def context_subscribe(path: str, callback: Callable) -> str: ...
def context_unsubscribe(subscription_id: str) -> bool: ...
```

### 5. Utilities (`src/hcfs/utils/`)

Common utilities and configuration management.

**Key Components:**
- **HCFSConfig**: Configuration management
- **path_utils**: Path manipulation utilities
- **logging**: Structured logging

## Data Flow

```
Agent → ContextAPI → HCFSFilesystem → ContextStorage
            ↓                               ↓
     SemanticSearch        ←         EmbeddingEngine
```

### Example: Context Retrieval

1. Agent calls `context_cd("/project/src/")`
2. ContextAPI validates the path and sets the current scope
3. HCFSFilesystem updates the virtual directory state
4. Agent calls `context_get(depth=2)`
5. ContextAPI queries ContextStorage for context at `/project/src/` and `/project/`
6. SemanticSearch ranks and filters the results
7. The merged context is returned to the agent

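Step 5 implies that `depth` counts the current path plus its ancestors: `depth=2` at `/project/src/` covers `/project/src/` and `/project/`. A sketch of that walk follows, assuming this reading of `depth` is correct. The helper name `scope_paths` is hypothetical, and it returns normalized paths without trailing slashes.

```python
import posixpath


def scope_paths(current: str, depth: int = 1):
    """Return the current path followed by up to depth - 1 ancestors."""
    path = posixpath.normpath(current)
    paths = []
    for _ in range(max(depth, 0)):
        paths.append(path)
        if path == "/":
            break  # reached the root; there are no further ancestors
        path = posixpath.dirname(path)
    return paths
```

The list is ordered from most specific to least specific, which is a convenient order for applying an inheritance strategy afterwards.
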
### Example: Context Publishing

1. Agent calls `context_push("/project/src/module.py", blob)`
2. ContextAPI validates the blob and path
3. EmbeddingEngine generates embeddings for the blob content
4. ContextStorage persists the blob with versioning
5. Subscription notifications are sent to subscribed agents

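The notification step could be sketched as an in-memory publish/subscribe registry. The `Notifier` class is hypothetical, and the rule that a subscription on a path also receives events for its descendants is an assumption; the document does not define the matching rule.

```python
import itertools
import posixpath


class Notifier:
    """Hypothetical in-memory pub/sub keyed by path prefix."""

    def __init__(self):
        self._subs = {}  # subscription id -> (path, callback)
        self._ids = itertools.count(1)

    def subscribe(self, path, callback) -> str:
        sub_id = f"sub-{next(self._ids)}"
        self._subs[sub_id] = (posixpath.normpath(path), callback)
        return sub_id

    def unsubscribe(self, sub_id) -> bool:
        return self._subs.pop(sub_id, None) is not None

    def publish(self, path, blob) -> int:
        """Call every callback subscribed at path or an ancestor; return the count."""
        path = posixpath.normpath(path)
        notified = 0
        for sub_path, callback in list(self._subs.values()):
            prefix = sub_path.rstrip("/") + "/"
            if path == sub_path or path.startswith(prefix):
                callback(path, blob)
                notified += 1
        return notified
```

Matching on `sub_path + "/"` rather than on the bare string avoids treating `/projectx` as a descendant of `/project`.
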
## Hierarchical Inheritance

Context blobs inherit from parent paths using configurable strategies:

- **Append**: Child context is appended to parent context
- **Override**: Child context overrides parent context
- **Merge**: Intelligent merging based on content type
- **Isolate**: No inheritance; the child context stands alone

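The four strategies can be illustrated on a toy model in which a context is a dictionary mapping keys to lists of entries. This representation, the `Inheritance` enum, and the `resolve` function are assumptions for illustration. In particular, `MERGE` here is only a placeholder (append, then deduplicate), since content-type-aware merging is not specified.

```python
from enum import Enum


class Inheritance(Enum):
    APPEND = "append"
    OVERRIDE = "override"
    MERGE = "merge"
    ISOLATE = "isolate"


def resolve(parent: dict, child: dict, strategy: Inheritance) -> dict:
    """Combine parent and child context (modelled here as dicts of lists)."""
    if strategy is Inheritance.ISOLATE:
        return dict(child)  # parent is ignored entirely
    if strategy is Inheritance.OVERRIDE:
        return {**parent, **child}  # child wins on shared keys
    if strategy is Inheritance.APPEND:
        keys = list(parent) + [k for k in child if k not in parent]
        return {k: parent.get(k, []) + child.get(k, []) for k in keys}
    if strategy is Inheritance.MERGE:
        # Placeholder for content-aware merging: append, then drop duplicates.
        appended = resolve(parent, child, Inheritance.APPEND)
        return {k: list(dict.fromkeys(v)) for k, v in appended.items()}
    raise ValueError(f"unknown strategy: {strategy}")
```

For a parent `{"notes": ["a", "b"]}` and a child `{"notes": ["b", "c"]}`, `APPEND` keeps the duplicate `"b"` while this placeholder `MERGE` removes it.
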
## Concurrency & Consistency

- **Read Scalability**: Multiple agents can read simultaneously
- **Write Coordination**: Optimistic locking with conflict resolution
- **Versioning**: All context changes create new versions
- **Subscription**: Pub/sub notifications for context changes

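Optimistic locking is usually implemented as a compare-and-set on a version number: a write succeeds only if the version the writer read is still current. The sketch below shows the idea with a hypothetical in-memory `VersionedStore`. It is single-threaded for clarity, and how conflicts are resolved after detection is left open, as it is in this document.

```python
class VersionConflict(Exception):
    """Raised when a write is based on a stale version."""


class VersionedStore:
    """Hypothetical in-memory store with compare-and-set writes."""

    def __init__(self):
        self._data = {}  # path -> (version, content)

    def read(self, path):
        # Unknown paths read as version 0 with no content.
        return self._data.get(path, (0, None))

    def write(self, path, content, expected_version):
        current_version, _ = self.read(path)
        if current_version != expected_version:
            # Another writer committed first; the caller must re-read and retry.
            raise VersionConflict(
                f"{path}: expected v{expected_version}, found v{current_version}"
            )
        self._data[path] = (current_version + 1, content)
        return current_version + 1
```

In a real backend the check and the update must happen atomically, for example as a single SQL `UPDATE ... WHERE version = ?` statement inside a transaction.
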
## Performance Considerations

- **Caching**: LRU cache for frequently accessed contexts
- **Lazy Loading**: Context blobs loaded on-demand
- **Batch Operations**: Bulk context operations for efficiency
- **Index Optimization**: Separate indices for path, content, and metadata queries

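Caching and lazy loading can be combined with the standard library's `functools.lru_cache`: a context is loaded the first time it is requested and served from memory afterwards. The loader below is a stand-in; `_load_from_backend` and `get_context` are hypothetical names, and the cache size of 128 is arbitrary.

```python
from functools import lru_cache


def _load_from_backend(path: str) -> str:
    # Stand-in for a real storage read (hypothetical).
    return f"context for {path}"


@lru_cache(maxsize=128)
def get_context(path: str) -> str:
    """Load a context on first access; serve repeats from the LRU cache."""
    return _load_from_backend(path)
```

Because every change creates a new version, a cache like this needs invalidation on writes, for example by calling `get_context.cache_clear()` after a push or by including the version in the cache key.
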
## Security Model

- **Path Permissions**: ACL-based access control per path
- **Agent Authentication**: Token-based agent identification
- **Content Validation**: Schema validation for context blobs
- **Audit Logging**: All context operations logged for accountability
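
A per-path ACL check might look like the following sketch. The ACL layout (`path -> agent -> set of actions`), the rule that a path without its own entry falls back to its nearest ancestor, and the deny-by-default behaviour are all assumptions; the document only states that access control is ACL-based and per path.

```python
import posixpath


def is_allowed(acl: dict, agent: str, path: str, action: str) -> bool:
    """Check the nearest ancestor of path that has an ACL entry (deny by default)."""
    path = posixpath.normpath(path)
    while True:
        entry = acl.get(path)
        if entry is not None:
            # The closest entry decides; agents missing from it get no access.
            return action in entry.get(agent, set())
        if path == "/":
            return False  # no entry anywhere up the tree
        path = posixpath.dirname(path)
```

Letting the closest entry decide means a restrictive ACL on a subdirectory can narrow the permissions granted higher up the tree.
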