HCFS/docs/ARCHITECTURE.md
Claude Code 8f19eaab25 Initial HCFS project scaffold
🚀 Generated with Claude Code

- Project plan and architecture documentation
- Python package structure with core modules
- API design and basic usage examples
- Development environment configuration
- Literature review and research foundation

Ready for Phase 1 implementation.

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-29 12:13:16 +10:00

HCFS Architecture

Overview

The Hierarchical Context File System (HCFS) uses a layered architecture that connects filesystem-style navigation to semantic context storage and retrieval.

System Components

1. Virtual Filesystem Layer (src/hcfs/filesystem/)

The virtual filesystem presents a standard POSIX-like directory structure backed by context blobs rather than traditional files.

Key Components:

  • HCFSFilesystem: Main filesystem interface
  • HCFSFuseOperations: FUSE-based filesystem operations (readdir, getattr, etc.)

Responsibilities:

  • Present hierarchical path structure to agents
  • Map filesystem operations to context queries
  • Handle path-based navigation (cd, ls, etc.)
  • Maintain current context scope per session
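
The per-session scope and path-based navigation described above could look roughly like the following. This is a minimal sketch, not the HCFSFilesystem implementation: the SessionScope class and its cd, pwd, and ls methods are hypothetical names, and the set of known paths is assumed to be supplied by the storage layer.

```python
import posixpath
from typing import Iterable, List


class SessionScope:
    """Hypothetical per-session cursor over the virtual path tree."""

    def __init__(self, known_paths: Iterable[str]) -> None:
        # Normalise every known directory so lookups are consistent.
        self._known = {posixpath.normpath(p) for p in known_paths} | {"/"}
        self._cwd = "/"

    def cd(self, path: str) -> bool:
        # Resolve relative paths against the current scope, collapsing "." and "..".
        target = posixpath.normpath(posixpath.join(self._cwd, path))
        if target not in self._known:
            return False  # unknown path: leave the scope unchanged
        self._cwd = target
        return True

    def pwd(self) -> str:
        return self._cwd

    def ls(self) -> List[str]:
        # Direct children only: known paths whose parent is the current scope.
        return sorted(
            posixpath.basename(p)
            for p in self._known
            if p != "/" and posixpath.dirname(p) == self._cwd
        )
```

Keeping one such object per agent session is one way to maintain independent scopes; a FUSE mount would instead rely on the calling process's working directory.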

2. Storage Backend (src/hcfs/storage/)

The storage layer manages persistent context blob storage with versioning and metadata.

Key Components:

  • ContextStorage: Abstract storage interface
  • SQLiteBackend: SQLite-based implementation
  • StoredContextBlob: Storage data models
  • ContextMetadata: Metadata and versioning

Responsibilities:

  • Persist context blobs with versioning
  • Store path-to-context mappings
  • Manage hierarchical inheritance relationships
  • Provide ACID guarantees for context operations
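
As an illustration of versioned persistence on SQLite, the sketch below appends a new row per write instead of updating in place. The table name, column names, and helper functions (open_store, push_version, latest) are assumptions made for this example and are not taken from SQLiteBackend.

```python
import sqlite3

# Hypothetical schema: one row per (path, version) pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS context_blobs (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    path     TEXT    NOT NULL,
    version  INTEGER NOT NULL,
    content  TEXT    NOT NULL,
    UNIQUE (path, version)
)
"""


def open_store(db_path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    return conn


def push_version(conn: sqlite3.Connection, path: str, content: str) -> int:
    # "with conn" commits the INSERT on success and rolls it back on error.
    with conn:
        row = conn.execute(
            "SELECT COALESCE(MAX(version), 0) FROM context_blobs WHERE path = ?",
            (path,),
        ).fetchone()
        version = row[0] + 1
        conn.execute(
            "INSERT INTO context_blobs (path, version, content) VALUES (?, ?, ?)",
            (path, version, content),
        )
    return version


def latest(conn: sqlite3.Connection, path: str):
    # Returns (version, content), or None when the path has no context yet.
    return conn.execute(
        "SELECT version, content FROM context_blobs "
        "WHERE path = ? ORDER BY version DESC LIMIT 1",
        (path,),
    ).fetchone()
```

The UNIQUE (path, version) constraint means that two writers racing to create the same version cannot both succeed; one insert fails and can be retried.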

3. Indexing & Semantic Search (src/hcfs/indexing/)

The indexing layer provides semantic search capabilities over context blobs.

Key Components:

  • EmbeddingEngine: Generate embeddings for context content
  • SemanticSearch: Vector similarity search
  • HybridRanker: Combines BM25 + embedding scores

Responsibilities:

  • Generate and store embeddings for context blobs
  • Provide semantic similarity search
  • Rank results by relevance (hybrid BM25 + vector)
  • Support context folding and summarization
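
One common way to combine BM25 and embedding scores is a weighted sum over min-max normalised values. The sketch below shows that fusion step only; it assumes the BM25 and vector scores have already been computed per blob ID, and the hybrid_rank function and its alpha weight are illustrative choices rather than the HybridRanker design.

```python
from typing import Dict, List, Tuple


def _min_max(scores: Dict[str, float]) -> Dict[str, float]:
    # Rescale scores to [0, 1] so the two signals are comparable.
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def hybrid_rank(
    bm25: Dict[str, float],
    vector: Dict[str, float],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Fuse lexical and vector scores; alpha is the weight given to BM25."""
    b, v = _min_max(bm25), _min_max(vector)
    fused = {
        blob_id: alpha * b.get(blob_id, 0.0) + (1 - alpha) * v.get(blob_id, 0.0)
        for blob_id in set(b) | set(v)
    }
    # Highest score first; ties broken by blob ID for a stable order.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))
```

A blob missing from one of the two score maps contributes zero for that signal, which is a simplification; reciprocal rank fusion is an alternative that avoids score normalisation altogether.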

4. Agent API (src/hcfs/api/)

The API layer exposes syscall-style functions for agent interaction.

Key Components:

  • ContextAPI: Main agent-facing API
  • ContextBlob: Context data models
  • ContextPath: Path representation
  • ContextQuery: Query models

Core API Functions:

# Navigation
context_cd(path: str) -> bool
context_pwd() -> str

# Retrieval  
context_get(depth: int = 1) -> List[ContextBlob]
context_list(path: Optional[str] = None) -> List[str]

# Manipulation
context_push(path: str, blob: ContextBlob) -> str
context_delete(path: str, blob_id: str) -> bool

# Subscription
context_subscribe(path: str, callback: Callable) -> str
context_unsubscribe(subscription_id: str) -> bool

5. Utilities (src/hcfs/utils/)

Common utilities and configuration management.

Key Components:

  • HCFSConfig: Configuration management
  • path_utils: Path manipulation utilities
  • logging: Structured logging

Data Flow

Agent → ContextAPI → HCFSFilesystem → ContextStorage
                  ↓                 ↓
              SemanticSearch ← EmbeddingEngine

Example: Context Retrieval

  1. Agent calls context_cd("/project/src/")
  2. ContextAPI validates path and sets current scope
  3. HCFSFilesystem updates virtual directory state
  4. Agent calls context_get(depth=2)
  5. ContextAPI queries ContextStorage for context at /project/src/ and /project/
  6. SemanticSearch ranks and filters results
  7. Merged context returned to agent
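
The depth-based lookup in steps 4 and 5 can be sketched as a walk from the current scope up through its ancestors. This assumes depth=1 means the current path only and each extra level adds one parent, which matches the depth=2 example above. The helper names scope_paths and get_context and the dictionary-backed store are hypothetical, and the semantic ranking in step 6 is omitted.

```python
import posixpath
from typing import Dict, List


def scope_paths(cwd: str, depth: int) -> List[str]:
    """Return cwd plus up to depth-1 ancestors, nearest first."""
    current = posixpath.normpath(cwd)
    paths: List[str] = []
    for _ in range(max(depth, 0)):
        paths.append(current)
        if current == "/":
            break  # the root has no parent
        current = posixpath.dirname(current)
    return paths


def get_context(store: Dict[str, List[str]], cwd: str, depth: int = 1) -> List[str]:
    # Collect blobs from the current scope first, then from each ancestor.
    merged: List[str] = []
    for path in scope_paths(cwd, depth):
        merged.extend(store.get(path, []))
    return merged
```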

Example: Context Publishing

  1. Agent calls context_push("/project/src/module.py", blob)
  2. ContextAPI validates blob and path
  3. EmbeddingEngine generates embeddings for blob content
  4. ContextStorage persists blob with versioning
  5. Subscription notifications sent to interested agents
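
The notification step could be handled by a small in-process hub such as the one below. This is a sketch under stated assumptions: the SubscriptionHub class is hypothetical, callbacks are assumed to take a path and a blob ID, and a subscription on a path is assumed to cover everything beneath it.

```python
import posixpath
import uuid
from typing import Callable, Dict, Tuple

Callback = Callable[[str, str], None]  # (path, blob_id)


class SubscriptionHub:
    """Hypothetical in-process pub/sub keyed by path prefix."""

    def __init__(self) -> None:
        self._subs: Dict[str, Tuple[str, Callback]] = {}

    def subscribe(self, path: str, callback: Callback) -> str:
        sub_id = uuid.uuid4().hex
        self._subs[sub_id] = (posixpath.normpath(path), callback)
        return sub_id

    def unsubscribe(self, sub_id: str) -> bool:
        return self._subs.pop(sub_id, None) is not None

    def publish(self, path: str, blob_id: str) -> int:
        """Notify every subscriber at or above path; return how many were called."""
        target = posixpath.normpath(path)
        notified = 0
        for prefix, callback in list(self._subs.values()):
            # Match the path itself or any descendant, but not a sibling
            # that merely shares a name prefix (e.g. /projectx).
            if target == prefix or target.startswith(prefix.rstrip("/") + "/"):
                callback(target, blob_id)
                notified += 1
        return notified
```

Callbacks run synchronously here, so a slow subscriber would delay the publisher; a queue per subscriber is one way around that.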

Hierarchical Inheritance

Context blobs inherit from parent paths using configurable strategies:

  • Append: Child context appends to parent context
  • Override: Child context overrides parent context
  • Merge: Intelligent merging based on content type
  • Isolate: No inheritance, child context standalone
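
To make the four strategies concrete, the sketch below resolves a parent and child context that are each modelled as a plain list of strings. That model, the Inheritance enum, and the resolve function are assumptions for illustration. In particular, Override is read here as "use the child if it has any context, otherwise fall back to the parent", and Merge is reduced to an order-preserving union as a stand-in for content-aware merging.

```python
from enum import Enum
from typing import List


class Inheritance(Enum):
    APPEND = "append"
    OVERRIDE = "override"
    MERGE = "merge"
    ISOLATE = "isolate"


def resolve(parent: List[str], child: List[str], strategy: Inheritance) -> List[str]:
    if strategy is Inheritance.APPEND:
        # Parent context first, child context after it.
        return parent + child
    if strategy is Inheritance.OVERRIDE:
        # Assumed reading: the child wins when present, else inherit the parent.
        return list(child) if child else list(parent)
    if strategy is Inheritance.MERGE:
        # Placeholder for content-aware merging: union without duplicates,
        # keeping first-seen order.
        return list(dict.fromkeys(parent + child))
    if strategy is Inheritance.ISOLATE:
        # The parent is ignored even when the child is empty.
        return list(child)
    raise ValueError(f"unknown strategy: {strategy!r}")
```

Under this reading, Override and Isolate differ only when the child has no context of its own: Override falls back to the parent, while Isolate returns nothing.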

Concurrency & Consistency

  • Read Scalability: Multiple agents can read simultaneously
  • Write Coordination: Optimistic locking with conflict resolution
  • Versioning: All context changes create new versions
  • Subscription: Pub/sub notifications for context changes
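
Optimistic locking is usually implemented as a version check at write time: a writer states which version it read, and the write is rejected if the stored version has moved on. The in-memory sketch below illustrates that check; VersionedStore and VersionConflict are hypothetical names, and the conflict resolution policy (here, the caller re-reads and retries) is an assumption.

```python
import threading
from typing import Dict, Optional, Tuple


class VersionConflict(Exception):
    """Raised when a writer's expected version is stale."""


class VersionedStore:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: Dict[str, Tuple[int, str]] = {}  # path -> (version, content)

    def read(self, path: str) -> Tuple[int, Optional[str]]:
        # Version 0 means "no context stored yet".
        with self._lock:
            return self._data.get(path, (0, None))

    def write(self, path: str, content: str, expected_version: int) -> int:
        # The lock only makes check-and-set atomic; readers are never blocked
        # for the duration of a writer's work, which is the optimistic part.
        with self._lock:
            current, _ = self._data.get(path, (0, None))
            if current != expected_version:
                raise VersionConflict(
                    f"{path}: expected v{expected_version}, found v{current}"
                )
            self._data[path] = (current + 1, content)
            return current + 1
```

With a SQL backend, the same check is often expressed as an UPDATE or INSERT guarded by the expected version, so the database enforces atomicity.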

Performance Considerations

  • Caching: LRU cache for frequently accessed contexts
  • Lazy Loading: Context blobs loaded on-demand
  • Batch Operations: Bulk context operations for efficiency
  • Index Optimization: Separate indices for path, content, and metadata queries
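
Caching and lazy loading can be combined by wrapping the loader in an LRU cache, so a blob is fetched only on first access and evicted when it falls out of use. The sketch below uses the standard library's functools.lru_cache; the load_context function, the in-memory _BACKING dictionary, and the load counter are stand-ins for a real storage read.

```python
from functools import lru_cache

# Stand-in for the storage backend; the counter shows when a real load happens.
_BACKING = {"/a": "context A", "/b": "context B", "/c": "context C"}
LOAD_COUNT = {"n": 0}


@lru_cache(maxsize=2)
def load_context(path: str) -> str:
    # Runs only on a cache miss, so blobs are loaded on demand.
    LOAD_COUNT["n"] += 1
    return _BACKING[path]
```

Because every write creates a new version, cached entries need to be invalidated on write, for example with load_context.cache_clear() or by including the version in the cache key.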

Security Model

  • Path Permissions: ACL-based access control per path
  • Agent Authentication: Token-based agent identification
  • Content Validation: Schema validation for context blobs
  • Audit Logging: All context operations logged for accountability
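
Per-path ACLs fit the hierarchy naturally if the nearest ancestor with an entry for the agent decides the outcome. The sketch below assumes that rule together with a default of deny; the ACL layout, the permission names, and the is_allowed function are hypothetical.

```python
import posixpath
from typing import Dict, Set

# Hypothetical layout: path -> agent ID -> granted permissions.
ACL = Dict[str, Dict[str, Set[str]]]


def is_allowed(acl: ACL, agent: str, path: str, permission: str) -> bool:
    """Check the nearest ancestor that has an entry for this agent."""
    current = posixpath.normpath(path)
    while True:
        entry = acl.get(current)
        if entry is not None and agent in entry:
            # The most specific entry wins, whether it grants or withholds.
            return permission in entry[agent]
        if current == "/":
            return False  # no entry anywhere on the path: deny by default
        current = posixpath.dirname(current)
```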