Initial HCFS project scaffold

🚀 Generated with Claude Code

- Project plan and architecture documentation
- Python package structure with core modules
- API design and basic usage examples
- Development environment configuration
- Literature review and research foundation

Ready for Phase 1 implementation.

Co-Authored-By: Claude <noreply@anthropic.com>
314
docs/API_REFERENCE.md
Normal file
@@ -0,0 +1,314 @@
# HCFS API Reference

## Overview

The HCFS API provides syscall-style functions for agents to navigate, query, and manipulate hierarchical context. All operations are designed to be familiar to agents accustomed to filesystem operations.

## Core Navigation API

### `context_cd(path: str) -> bool`

Change the current context directory. Similar to the shell `cd` command.

**Parameters:**
- `path`: Target path (absolute or relative)

**Returns:**
- `True` if path exists and is accessible
- `False` if path does not exist or is inaccessible

**Example:**
```python
# Navigate to project root
success = context_cd("/project")

# Navigate to subdirectory
success = context_cd("src/models")

# Navigate up one level
success = context_cd("..")
```

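How relative paths and `..` are resolved is not spelled out above. The following is a minimal sketch, assuming POSIX-style normalization against the current context directory. `resolve_context_path` is a hypothetical helper for illustration and not part of the HCFS API.

```python
import posixpath


def resolve_context_path(current: str, target: str) -> str:
    """Resolve `target` against `current` the way a shell would (hypothetical helper)."""
    # posixpath.join discards `current` when `target` is already absolute.
    joined = posixpath.join(current, target)
    # normpath collapses "." and ".." segments and strips trailing slashes.
    return posixpath.normpath(joined)


print(resolve_context_path("/project", "src/models"))      # /project/src/models
print(resolve_context_path("/project/src/models", ".."))   # /project/src
print(resolve_context_path("/project/src", "/docs/"))      # /docs
```

Whether `context_cd` would then check that the resolved path exists before switching is left to the implementation.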
### `context_pwd() -> str`

Get the current context working directory.

**Returns:**
- Current absolute path as string

**Example:**
```python
current_path = context_pwd()
# Returns: "/project/src/models"
```

### `context_ls(path: str = None) -> List[str]`

List available context paths at the specified directory.

**Parameters:**
- `path`: Directory path (default: current directory)

**Returns:**
- List of child path names

**Example:**
```python
# List current directory
paths = context_ls()
# Returns: ["models/", "utils/", "tests/", "README.md"]

# List specific directory
paths = context_ls("/project/docs")
# Returns: ["api/", "architecture/", "examples/"]
```

## Context Retrieval API

### `context_get(depth: int = 1, filters: dict = None) -> List[ContextBlob]`

Retrieve context blobs from current path and optionally parent paths.

**Parameters:**
- `depth`: How many levels up the hierarchy to include (1 = current only)
- `filters`: Optional filters (content_type, author, date_range, etc.)

**Returns:**
- List of `ContextBlob` objects ordered by relevance

**Example:**
```python
# Get context from current path only
context = context_get(depth=1)

# Get context from current path and 2 parent levels
context = context_get(depth=3)

# Get context with filters
context = context_get(
    depth=2,
    filters={
        "content_type": "documentation",
        "author": "claude",
        "since": "2025-01-01"
    }
)
```

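To make the `depth` semantics concrete, here is a small sketch of which paths a given depth would cover. It assumes that the root `/` counts as a level and that traversal stops there. `paths_for_depth` is a hypothetical helper, not an HCFS function.

```python
import posixpath
from typing import List


def paths_for_depth(current: str, depth: int) -> List[str]:
    """Return `current` plus up to `depth - 1` ancestors, nearest first."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    paths = [posixpath.normpath(current)]
    # Walk upward until enough levels are collected or the root is reached.
    while len(paths) < depth and paths[-1] != "/":
        paths.append(posixpath.dirname(paths[-1]))
    return paths


print(paths_for_depth("/project/src/models", 3))
# ['/project/src/models', '/project/src', '/project']
```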
### `context_search(query: str, scope: str = None) -> List[ContextBlob]`

Perform semantic search across context blobs.

**Parameters:**
- `query`: Search query string
- `scope`: Path scope to limit search (default: current path and children)

**Returns:**
- List of `ContextBlob` objects ranked by relevance

**Example:**
```python
# Search within current scope
results = context_search("error handling patterns")

# Search within specific scope
results = context_search(
    "database connection",
    scope="/project/src/models"
)
```

## Context Manipulation API

### `context_push(path: str, blob: ContextBlob) -> str`

Add or update context at the specified path.

**Parameters:**
- `path`: Target path for the context
- `blob`: ContextBlob object containing content and metadata

**Returns:**
- Blob ID of the created/updated context

**Example:**
```python
from hcfs.api import ContextBlob

# Create new context blob
blob = ContextBlob(
    content="This module handles user authentication",
    content_type="documentation",
    tags=["auth", "security", "users"],
    metadata={"priority": "high"}
)

# Push to specific path
blob_id = context_push("/project/src/auth.py", blob)
```

### `context_delete(path: str, blob_id: str = None) -> bool`

Delete context blob(s) at the specified path.

**Parameters:**
- `path`: Target path
- `blob_id`: Specific blob ID (if None, deletes all blobs at path)

**Returns:**
- `True` if deletion successful
- `False` if path/blob not found or permission denied

**Example:**
```python
# Delete specific blob
success = context_delete("/project/src/auth.py", blob_id)

# Delete all context at path
success = context_delete("/project/old_module/")
```

### `context_update(blob_id: str, updates: dict) -> bool`

Update an existing context blob.

**Parameters:**
- `blob_id`: ID of blob to update
- `updates`: Dictionary of fields to update

**Returns:**
- `True` if update successful
- `False` if blob not found or permission denied

**Example:**
```python
# Update blob content and tags
success = context_update(blob_id, {
    "content": "Updated documentation with new examples",
    "tags": ["auth", "security", "users", "examples"]
})
```

## Subscription API

### `context_subscribe(path: str, callback: Callable, filters: dict = None) -> str`

Subscribe to context changes at the specified path.

**Parameters:**
- `path`: Path to monitor
- `callback`: Function to call when changes occur
- `filters`: Optional filters for subscription

**Returns:**
- Subscription ID string

**Example:**
```python
def on_context_change(event):
    print(f"Context changed at {event.path}: {event.change_type}")

# Subscribe to create and update events under /project/src/
sub_id = context_subscribe(
    "/project/src/",
    callback=on_context_change,
    filters={"change_type": ["create", "update"]}
)
```

### `context_unsubscribe(subscription_id: str) -> bool`

Cancel a context subscription.

**Parameters:**
- `subscription_id`: ID returned from `context_subscribe`

**Returns:**
- `True` if unsubscribe successful
- `False` if subscription not found

**Example:**
```python
success = context_unsubscribe(sub_id)
```

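The subscribe/unsubscribe pair can be pictured as a small registry keyed by subscription ID. The sketch below is illustrative only: `SubscriptionRegistry`, `ChangeEvent`, and the prefix-matching rule are assumptions, and the document does not define the real event shape or delivery mechanism.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ChangeEvent:  # hypothetical event shape
    path: str
    change_type: str  # e.g. "create", "update", "delete"


class SubscriptionRegistry:
    """In-memory pub/sub keyed by subscription ID (illustrative only)."""

    def __init__(self) -> None:
        self._subs: Dict[str, tuple] = {}

    def subscribe(self, path: str, callback: Callable[[ChangeEvent], None],
                  filters: Optional[dict] = None) -> str:
        sub_id = uuid.uuid4().hex
        # Store the path with a trailing slash so prefix checks respect boundaries.
        self._subs[sub_id] = (path.rstrip("/") + "/", callback, filters or {})
        return sub_id

    def unsubscribe(self, sub_id: str) -> bool:
        return self._subs.pop(sub_id, None) is not None

    def publish(self, event: ChangeEvent) -> int:
        """Deliver `event` to matching subscribers; return how many were notified."""
        delivered = 0
        for prefix, callback, filters in list(self._subs.values()):
            allowed = filters.get("change_type")
            in_scope = (event.path.rstrip("/") + "/").startswith(prefix)
            if in_scope and (allowed is None or event.change_type in allowed):
                callback(event)
                delivered += 1
        return delivered


registry = SubscriptionRegistry()
seen = []
sub_id = registry.subscribe("/project/src/", seen.append,
                            {"change_type": ["create", "update"]})
registry.publish(ChangeEvent("/project/src/auth.py", "update"))  # delivered
registry.publish(ChangeEvent("/project/src/auth.py", "delete"))  # filtered out
registry.unsubscribe(sub_id)
```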
## Data Models

### ContextBlob

```python
class ContextBlob:
    id: str                        # Unique blob identifier
    content: str                   # Main content text
    content_type: str              # Type: "code", "documentation", "config", etc.
    tags: List[str]                # Searchable tags
    metadata: Dict[str, Any]       # Additional metadata
    author: str                    # Creator identifier
    created_at: datetime           # Creation timestamp
    updated_at: datetime           # Last update timestamp
    version: int                   # Version number
    parent_version: Optional[str]  # Parent blob ID if forked
```

### ContextPath

```python
class ContextPath:
    path: str              # Full path string
    components: List[str]  # Path components
    depth: int             # Depth from root
    is_absolute: bool      # True if absolute path
    exists: bool           # True if path has context
```

### ContextQuery

```python
class ContextQuery:
    query: str               # Search query
    filters: Dict[str, Any]  # Search filters
    scope: str               # Search scope path
    limit: int               # Max results
    offset: int              # Results offset
    sort_by: str             # Sort field
    sort_order: str          # "asc" or "desc"
```

## Error Handling

All API functions raise specific exceptions for different error conditions:

- `PathNotFoundError`: Path does not exist
- `PermissionDeniedError`: Insufficient permissions
- `InvalidPathError`: Malformed path syntax
- `ContextNotFoundError`: Context blob not found
- `ValidationError`: Invalid data provided
- `StorageError`: Backend storage error

**Example:**
```python
from hcfs.api import PathNotFoundError, PermissionDeniedError

try:
    context = context_get(depth=2)
except PathNotFoundError:
    print("Current path has no context")
except PermissionDeniedError:
    print("Access denied to context")
```

## Configuration

API behavior can be configured via `HCFSConfig`:

```python
from hcfs.utils import HCFSConfig

config = HCFSConfig(
    max_depth=10,                 # Maximum traversal depth
    cache_size=1000,              # LRU cache size
    default_content_type="text",  # Default blob content type
    enable_versioning=True,       # Enable blob versioning
    subscription_timeout=300      # Subscription timeout (seconds)
)
```
146
docs/ARCHITECTURE.md
Normal file
@@ -0,0 +1,146 @@
# HCFS Architecture

## Overview

The Hierarchical Context File System (HCFS) is designed as a layered architecture that bridges filesystem navigation with semantic context storage and retrieval.

## System Components

### 1. Virtual Filesystem Layer (`src/hcfs/filesystem/`)

The virtual filesystem presents a standard POSIX-like directory structure backed by context blobs rather than traditional files.

**Key Components:**
- **HCFSFilesystem**: Main filesystem interface
- **HCFSFuseOperations**: FUSE-based filesystem operations (readdir, getattr, etc.)

**Responsibilities:**
- Present hierarchical path structure to agents
- Map filesystem operations to context queries
- Handle path-based navigation (`cd`, `ls`, etc.)
- Maintain current context scope per session

### 2. Storage Backend (`src/hcfs/storage/`)

The storage layer manages persistent context blob storage with versioning and metadata.

**Key Components:**
- **ContextStorage**: Abstract storage interface
- **SQLiteBackend**: SQLite-based implementation
- **StoredContextBlob**: Storage data models
- **ContextMetadata**: Metadata and versioning

**Responsibilities:**
- Persist context blobs with versioning
- Store path-to-context mappings
- Manage hierarchical inheritance relationships
- Provide ACID guarantees for context operations

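As a rough illustration of versioned persistence on SQLite, the sketch below appends a new row per version inside a transaction. The table layout, column names, and the `open_store` / `save_version` helpers are assumptions made for this example, not the actual `SQLiteBackend` schema.

```python
import sqlite3

# Hypothetical schema: one row per (blob id, version), indexed by path.
SCHEMA = """
CREATE TABLE IF NOT EXISTS context_blobs (
    id           TEXT NOT NULL,
    version      INTEGER NOT NULL,
    path         TEXT NOT NULL,
    content      TEXT NOT NULL,
    content_type TEXT NOT NULL,
    PRIMARY KEY (id, version)
);
CREATE INDEX IF NOT EXISTS idx_blobs_path ON context_blobs (path);
"""


def open_store(db_path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    return conn


def save_version(conn, blob_id, path, content, content_type="text") -> int:
    """Append a new version of `blob_id` inside a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT COALESCE(MAX(version), 0) FROM context_blobs WHERE id = ?",
            (blob_id,),
        ).fetchone()
        version = row[0] + 1
        conn.execute(
            "INSERT INTO context_blobs (id, version, path, content, content_type) "
            "VALUES (?, ?, ?, ?, ?)",
            (blob_id, version, path, content, content_type),
        )
    return version


conn = open_store()
save_version(conn, "blob-1", "/project/src/auth.py", "Handles authentication")
save_version(conn, "blob-1", "/project/src/auth.py", "Handles authentication and sessions")
```

Keeping old rows rather than updating in place is one simple way to satisfy the versioning requirement; the composite primary key rejects duplicate version numbers.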
### 3. Indexing & Semantic Search (`src/hcfs/indexing/`)

The indexing layer provides semantic search capabilities over context blobs.

**Key Components:**
- **EmbeddingEngine**: Generate embeddings for context content
- **SemanticSearch**: Vector similarity search
- **HybridRanker**: Combines BM25 + embedding scores

**Responsibilities:**
- Generate and store embeddings for context blobs
- Provide semantic similarity search
- Rank results by relevance (hybrid BM25 + vector)
- Support context folding and summarization

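One common way to combine keyword and vector scores is a weighted sum of min-max normalized BM25 and cosine similarity. The sketch below shows that approach; the weighting scheme, the `alpha` parameter, and the function names are assumptions, since the document does not say how `HybridRanker` merges the two signals. BM25 scores are taken as precomputed inputs.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_rank(bm25: Dict[str, float], query_vec: Sequence[float],
                doc_vecs: Dict[str, Sequence[float]], alpha: float = 0.5) -> List[str]:
    """Rank blob IDs by alpha * normalized BM25 + (1 - alpha) * cosine similarity."""
    if not bm25:
        return []
    lo, hi = min(bm25.values()), max(bm25.values())
    span = (hi - lo) or 1.0  # avoid division by zero when all scores are equal
    scores = {
        blob_id: alpha * (bm25[blob_id] - lo) / span
        + (1 - alpha) * cosine(query_vec, doc_vecs[blob_id])
        for blob_id in bm25
    }
    return sorted(scores, key=scores.get, reverse=True)


bm25_scores = {"a": 2.0, "b": 8.0, "c": 5.0}  # made-up keyword scores
vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}  # made-up embeddings
ranking = hybrid_rank(bm25_scores, [1.0, 0.2], vectors)
print(ranking)  # ['c', 'b', 'a']
```

Blob `c` wins here because it scores moderately on both signals, which is the behavior a hybrid ranker is usually meant to reward.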
### 4. Agent API (`src/hcfs/api/`)

The API layer exposes syscall-style functions for agent interaction.

**Key Components:**
- **ContextAPI**: Main agent-facing API
- **ContextBlob**: Context data models
- **ContextPath**: Path representation
- **ContextQuery**: Query models

**Core API Functions:**
```python
# Navigation
context_cd(path: str) -> bool
context_pwd() -> str

# Retrieval
context_get(depth: int = 1, filters: dict = None) -> List[ContextBlob]
context_ls(path: str = None) -> List[str]

# Manipulation
context_push(path: str, blob: ContextBlob) -> str
context_delete(path: str, blob_id: str = None) -> bool

# Subscription
context_subscribe(path: str, callback: Callable, filters: dict = None) -> str
context_unsubscribe(subscription_id: str) -> bool
```

### 5. Utilities (`src/hcfs/utils/`)

Common utilities and configuration management.

**Key Components:**
- **HCFSConfig**: Configuration management
- **path_utils**: Path manipulation utilities
- **logging**: Structured logging

## Data Flow

```
Agent → ContextAPI → HCFSFilesystem → ContextStorage
            ↓                               ↓
     SemanticSearch        ←         EmbeddingEngine
```

### Example: Context Retrieval

1. Agent calls `context_cd("/project/src/")`
2. ContextAPI validates path and sets current scope
3. HCFSFilesystem updates virtual directory state
4. Agent calls `context_get(depth=2)`
5. ContextAPI queries ContextStorage for context at `/project/src/` and `/project/`
6. SemanticSearch ranks and filters results
7. Merged context returned to agent

### Example: Context Publishing

1. Agent calls `context_push("/project/src/module.py", blob)`
2. ContextAPI validates blob and path
3. EmbeddingEngine generates embeddings for blob content
4. ContextStorage persists blob with versioning
5. Subscription notifications sent to interested agents

## Hierarchical Inheritance

Context blobs inherit from parent paths using configurable strategies:

- **Append**: Child context appends to parent context
- **Override**: Child context overrides parent context
- **Merge**: Intelligent merging based on content type
- **Isolate**: No inheritance, child context standalone

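The strategies above can be sketched as a function over a root-to-leaf chain of context lists. This is one possible reading: the `InheritanceStrategy` enum and `resolve` function are hypothetical, Override is interpreted as "the nearest level that has any context wins", and Merge is omitted because its content-type rules are not defined here.

```python
from enum import Enum
from typing import List


class InheritanceStrategy(Enum):  # hypothetical enum mirroring the list above
    APPEND = "append"
    OVERRIDE = "override"
    ISOLATE = "isolate"


def resolve(chain: List[List[str]], strategy: InheritanceStrategy) -> List[str]:
    """Combine per-path context lists ordered root first, current path last."""
    if not chain:
        return []
    if strategy is InheritanceStrategy.ISOLATE:
        return list(chain[-1])  # ignore every ancestor
    if strategy is InheritanceStrategy.OVERRIDE:
        # Use the nearest level that has any context at all.
        return next((list(level) for level in reversed(chain) if level), [])
    # APPEND: parent context first, then each child level in order.
    return [item for level in chain for item in level]


chain = [["root rules"], ["src conventions"], []]  # "/", "/src", "/src/models"
print(resolve(chain, InheritanceStrategy.APPEND))    # ['root rules', 'src conventions']
print(resolve(chain, InheritanceStrategy.OVERRIDE))  # ['src conventions']
print(resolve(chain, InheritanceStrategy.ISOLATE))   # []
```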
## Concurrency & Consistency

- **Read Scalability**: Multiple agents can read simultaneously
- **Write Coordination**: Optimistic locking with conflict resolution
- **Versioning**: All context changes create new versions
- **Subscription**: Pub/sub notifications for context changes

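Optimistic locking usually means a writer states which version it read, and the write is rejected if the stored version has moved on. The sketch below shows that check in a single-threaded, in-memory form. `OptimisticStore`, `Versioned`, and `VersionConflictError` are hypothetical names, and a real backend would need the compare-and-write step to be atomic (for example inside a database transaction).

```python
from dataclasses import dataclass
from typing import Dict


class VersionConflictError(Exception):
    """Raised when a write is based on a stale version (hypothetical name)."""


@dataclass(frozen=True)
class Versioned:
    content: str
    version: int


class OptimisticStore:
    def __init__(self) -> None:
        self._blobs: Dict[str, Versioned] = {}

    def read(self, blob_id: str) -> Versioned:
        # Unknown blobs are reported as version 0 with empty content.
        return self._blobs.get(blob_id, Versioned("", 0))

    def write(self, blob_id: str, content: str, expected_version: int) -> Versioned:
        current = self.read(blob_id)
        if current.version != expected_version:
            raise VersionConflictError(
                f"expected v{expected_version}, found v{current.version}"
            )
        updated = Versioned(content, current.version + 1)
        self._blobs[blob_id] = updated
        return updated


store = OptimisticStore()
first = store.write("blob-1", "draft", expected_version=0)
store.write("blob-1", "agent A edit", expected_version=first.version)
try:
    # Agent B still holds the old version, so this write is rejected.
    store.write("blob-1", "agent B edit", expected_version=first.version)
except VersionConflictError as exc:
    print(f"conflict: {exc}")
```

On conflict, the losing agent would re-read the blob and retry or merge; which resolution policy HCFS applies is not specified here.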
## Performance Considerations

- **Caching**: LRU cache for frequently accessed contexts
- **Lazy Loading**: Context blobs loaded on-demand
- **Batch Operations**: Bulk context operations for efficiency
- **Index Optimization**: Separate indices for path, content, and metadata queries

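For the caching point, a minimal LRU cache can be built on `collections.OrderedDict`. The class below is a generic sketch rather than the HCFS cache, and keying it by context path is an assumption.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict least recently used


cache = LRUCache(capacity=2)
cache.put("/project", "root context")
cache.put("/project/src", "src context")
cache.get("/project")                        # touch: "/project" is now most recent
cache.put("/project/docs", "docs context")   # evicts "/project/src"
```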
## Security Model

- **Path Permissions**: ACL-based access control per path
- **Agent Authentication**: Token-based agent identification
- **Content Validation**: Schema validation for context blobs
- **Audit Logging**: All context operations logged for accountability
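Per-path ACLs could be evaluated by walking from the requested path up to the root and using the nearest entry for the agent. The sketch below assumes that lookup rule and a deny-by-default policy. The `ACL` table, the agent names, and `is_allowed` are all hypothetical.

```python
import posixpath
from typing import Dict, Set

# Hypothetical ACL table: path -> agent -> permitted operations.
ACL: Dict[str, Dict[str, Set[str]]] = {
    "/": {"reader-agent": {"read"}},
    "/project/src": {"writer-agent": {"read", "write"}},
}


def is_allowed(agent: str, path: str, operation: str) -> bool:
    """Check the nearest ancestor (or the path itself) that has an entry for `agent`."""
    node = posixpath.normpath(path)
    while True:
        perms = ACL.get(node, {}).get(agent)
        if perms is not None:
            return operation in perms
        if node == "/":
            return False  # no entry anywhere: deny by default
        node = posixpath.dirname(node)


print(is_allowed("writer-agent", "/project/src/auth.py", "write"))  # True
print(is_allowed("reader-agent", "/project/src/auth.py", "write"))  # False
```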