# P2P Package

## Overview

The `p2p` package provides the foundational peer-to-peer networking infrastructure for CHORUS. It wraps libp2p to create and manage P2P nodes with transport security, peer discovery, DHT integration, and connection management. This package forms the network layer upon which PubSub, DHT, and all other distributed CHORUS components operate.

**Package Path:** `/home/tony/chorus/project-queues/active/CHORUS/p2p/`

**Key Features:**

- libp2p Host wrapper with security (Noise protocol)
- TCP transport with configurable listen addresses
- Optional Kademlia DHT for distributed peer discovery
- Connection manager with watermarks for scaling
- Rate limiting (dial rate, concurrent dials, DHT queries)
- Bootstrap peer support
- Relay support for NAT traversal
- Background connection status monitoring

## Architecture

### Core Components

```
Node
├── host       - libp2p Host (network identity and connections)
├── ctx/cancel - Context for lifecycle management
├── config     - Configuration (listen addresses, DHT, limits)
└── dht        - Optional LibP2PDHT for distributed discovery

Config
├── Network Settings
│   ├── ListenAddresses    - Multiaddrs to listen on
│   └── NetworkID          - Network identifier
├── Discovery Settings
│   ├── EnableMDNS         - Local peer discovery
│   └── MDNSServiceTag     - mDNS service name
├── DHT Settings
│   ├── EnableDHT          - Distributed discovery
│   ├── DHTMode            - client/server/auto
│   ├── DHTBootstrapPeers  - Bootstrap peer addresses
│   └── DHTProtocolPrefix  - DHT protocol namespace
├── Connection Limits
│   ├── MaxConnections     - Total connection limit
│   ├── MaxPeersPerIP      - Anti-spam limit
│   ├── ConnectionTimeout  - Connection timeout
│   ├── LowWatermark       - Minimum connections to maintain
│   └── HighWatermark      - Trim connections above this
├── Rate Limiting
│   ├── DialsPerSecond     - Outbound dial rate limit
│   ├── MaxConcurrentDials - Concurrent outbound dials
│   ├── MaxConcurrentDHT   - Concurrent DHT queries
│   └── JoinStaggerMS      - Topic join delay (anti-thundering herd)
└── Security
    └── EnableSecurity     - Noise protocol encryption
```

## Multiaddr Listen Addresses

### Default Configuration

```
/ip4/0.0.0.0/tcp/3333 - Listen on all IPv4 interfaces, port 3333
/ip6/::/tcp/3333      - Listen on all IPv6 interfaces, port 3333
```

### Multiaddr Format

libp2p uses multiaddrs for network addresses:

```
/ip4/<ip>/tcp/<port>               - IPv4 TCP
/ip6/<ip>/tcp/<port>               - IPv6 TCP
/ip4/<ip>/tcp/<port>/p2p/<peer-id> - Full peer address
/dns4/<hostname>/tcp/<port>        - DNS-based address (IPv4)
/dns6/<hostname>/tcp/<port>        - DNS-based address (IPv6)
```

### Examples

```go
// Listen on all interfaces, port 3333
"/ip4/0.0.0.0/tcp/3333"

// Listen on localhost only
"/ip4/127.0.0.1/tcp/3333"

// Listen on specific IP
"/ip4/192.168.1.100/tcp/3333"

// Multiple addresses
[]string{
    "/ip4/0.0.0.0/tcp/3333",
    "/ip6/::/tcp/3333",
}
```
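Multiaddr strings are easy to get subtly wrong (a missing `/p2p/<peer-id>` component, a transposed port). The sketch below is illustrative and not part of the `p2p` package: it validates a full peer address with the upstream `go-multiaddr` and libp2p `peer` packages before the address is handed to bootstrap or connect logic.

```go
package main

import (
    "fmt"
    "log"
    "os"

    "github.com/libp2p/go-libp2p/core/peer"
    ma "github.com/multiformats/go-multiaddr"
)

func main() {
    if len(os.Args) < 2 {
        log.Fatal("usage: addrcheck /ip4/<ip>/tcp/<port>/p2p/<peer-id>")
    }

    // Parse and validate the multiaddr string.
    maddr, err := ma.NewMultiaddr(os.Args[1])
    if err != nil {
        log.Fatalf("invalid multiaddr: %v", err)
    }

    // Split a full peer address into peer ID + transport addresses,
    // the form expected by bootstrap and Connect-style APIs.
    info, err := peer.AddrInfoFromP2pAddr(maddr)
    if err != nil {
        log.Fatalf("missing /p2p/<peer-id> component: %v", err)
    }

    fmt.Printf("peer ID: %s\n", info.ID)
    fmt.Printf("addrs:   %v\n", info.Addrs)
}
```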
## Configuration

### Default Configuration

```go
func DefaultConfig() *Config {
    return &Config{
        // Network settings
        ListenAddresses: []string{
            "/ip4/0.0.0.0/tcp/3333",
            "/ip6/::/tcp/3333",
        },
        NetworkID: "CHORUS-network",

        // Discovery settings - mDNS disabled for Swarm by default
        EnableMDNS:     false,
        MDNSServiceTag: "CHORUS-peer-discovery",

        // DHT settings (disabled by default for local development)
        EnableDHT:         false,
        DHTBootstrapPeers: []string{},
        DHTMode:           "auto",
        DHTProtocolPrefix: "/CHORUS",

        // Connection limits and rate limiting for scaling
        MaxConnections:     50,
        MaxPeersPerIP:      3,
        ConnectionTimeout:  30 * time.Second,
        LowWatermark:       32,  // Keep at least 32 connections
        HighWatermark:      128, // Trim above 128 connections
        DialsPerSecond:     5,   // Limit outbound dials to prevent storms
        MaxConcurrentDials: 10,  // Maximum concurrent outbound dials
        MaxConcurrentDHT:   16,  // Maximum concurrent DHT queries
        JoinStaggerMS:      0,   // No stagger by default

        // Security enabled by default
        EnableSecurity: true,

        // Pubsub for coordination and meta-discussion
        EnablePubsub:          true,
        BzzzTopic:             "CHORUS/coordination/v1",
        HmmmTopic:             "hmmm/meta-discussion/v1",
        MessageValidationTime: 10 * time.Second,
    }
}
```

### Configuration Options

#### WithListenAddresses

```go
func WithListenAddresses(addrs ...string) Option
```

Sets the multiaddrs to listen on.

**Example:**

```go
cfg := p2p.DefaultConfig()
opt := p2p.WithListenAddresses("/ip4/0.0.0.0/tcp/4444", "/ip6/::/tcp/4444")
```

#### WithNetworkID

```go
func WithNetworkID(networkID string) Option
```

Sets the network identifier (informational).

#### WithMDNS

```go
func WithMDNS(enabled bool) Option
```

Enables or disables mDNS local peer discovery.

**Note:** Disabled by default in container environments (Docker Swarm).

#### WithMDNSServiceTag

```go
func WithMDNSServiceTag(tag string) Option
```

Sets the mDNS service tag for discovery.

#### WithDHT

```go
func WithDHT(enabled bool) Option
```

Enables or disables Kademlia DHT for distributed peer discovery.

#### WithDHTBootstrapPeers

```go
func WithDHTBootstrapPeers(peers []string) Option
```

Sets bootstrap peer multiaddrs for DHT initialization.

**Example:**

```go
opt := p2p.WithDHTBootstrapPeers([]string{
    "/ip4/192.168.1.100/tcp/3333/p2p/12D3KooWABC...",
    "/ip4/192.168.1.101/tcp/3333/p2p/12D3KooWXYZ...",
})
```

#### WithDHTMode

```go
func WithDHTMode(mode string) Option
```

Sets DHT mode: "client", "server", or "auto".

- **client:** Only queries DHT, doesn't serve records
- **server:** Queries and serves DHT records
- **auto:** Adapts based on network position (NAT detection)

#### WithDHTProtocolPrefix

```go
func WithDHTProtocolPrefix(prefix string) Option
```

Sets DHT protocol namespace (default: "/CHORUS").

#### WithMaxConnections

```go
func WithMaxConnections(max int) Option
```

Sets maximum total connections.

#### WithConnectionTimeout

```go
func WithConnectionTimeout(timeout time.Duration) Option
```

Sets connection establishment timeout.

#### WithSecurity

```go
func WithSecurity(enabled bool) Option
```

Enables or disables transport security (Noise protocol).

**Warning:** Should always be enabled in production.

#### WithPubsub

```go
func WithPubsub(enabled bool) Option
```

Enables or disables pubsub (informational, not enforced by p2p package).

#### WithTopics

```go
func WithTopics(chorusTopic, hmmmTopic string) Option
```

Sets Bzzz and HMMM topic names (informational).

#### WithConnectionManager

```go
func WithConnectionManager(low, high int) Option
```

Sets connection manager watermarks.

- **low:** Minimum connections to maintain
- **high:** Trim connections when exceeded

**Example:**

```go
opt := p2p.WithConnectionManager(32, 128)
```

#### WithDialRateLimit

```go
func WithDialRateLimit(dialsPerSecond, maxConcurrent int) Option
```

Sets dial rate limiting to prevent connection storms.

**Example:**

```go
opt := p2p.WithDialRateLimit(5, 10) // 5 dials/sec, max 10 concurrent
```

#### WithDHTRateLimit

```go
func WithDHTRateLimit(maxConcurrentDHT int) Option
```

Sets maximum concurrent DHT queries.

#### WithJoinStagger

```go
func WithJoinStagger(delayMS int) Option
```

Sets join stagger delay in milliseconds to prevent thundering herd on topic joins.

**Example:**

```go
opt := p2p.WithJoinStagger(100) // 100ms delay
```
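In multi-replica deployments, the stagger value is often derived from a replica index so agents join topics at slightly different times. The sketch below assumes a Docker Swarm style `TASK_SLOT` environment variable and the import path `chorus/p2p`; both are illustrative assumptions, not taken from the CHORUS sources.

```go
package main

import (
    "context"
    "log"
    "os"
    "strconv"

    "chorus/p2p" // assumed import path for the CHORUS p2p package
)

func main() {
    ctx := context.Background()

    // Derive a per-replica stagger from the Swarm task slot (assumed to be
    // injected via the service template, e.g. {{.Task.Slot}}), so replicas
    // join pubsub topics at slightly different times.
    slot, err := strconv.Atoi(os.Getenv("TASK_SLOT"))
    if err != nil {
        slot = 0 // no stagger when the variable is absent
    }

    node, err := p2p.NewNode(ctx,
        p2p.WithJoinStagger(slot*250), // 0ms, 250ms, 500ms, ...
    )
    if err != nil {
        log.Fatal(err)
    }
    defer node.Close()
}
```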
## API Reference

### Node Creation

#### NewNode

```go
func NewNode(ctx context.Context, opts ...Option) (*Node, error)
```

Creates a new P2P node with the given configuration.

**Parameters:**
- `ctx` - Context for lifecycle management
- `opts` - Configuration options (variadic)

**Returns:** Node instance or error

**Security:**
- Noise protocol for transport encryption
- Message signing for all pubsub messages
- Strict signature verification

**Transports:**
- TCP (default and primary)
- Relay support for NAT traversal

**Example:**

```go
node, err := p2p.NewNode(ctx,
    p2p.WithListenAddresses("/ip4/0.0.0.0/tcp/3333"),
    p2p.WithDHT(true),
    p2p.WithDHTBootstrapPeers(bootstrapPeers),
    p2p.WithConnectionManager(32, 128),
)
if err != nil {
    log.Fatal(err)
}
defer node.Close()
```

### Node Information

#### Host

```go
func (n *Node) Host() host.Host
```

Returns the underlying libp2p Host interface.

**Returns:** libp2p Host (used for PubSub, DHT, protocols)

**Example:**

```go
h := node.Host()
peerID := h.ID()
addrs := h.Addrs()
```

#### ID

```go
func (n *Node) ID() peer.ID
```

Returns the peer ID of this node.

**Returns:** libp2p peer.ID

**Example:**

```go
id := node.ID()
fmt.Printf("Node ID: %s\n", id.String())
fmt.Printf("Short ID: %s\n", id.ShortString())
```

#### Addresses

```go
func (n *Node) Addresses() []multiaddr.Multiaddr
```

Returns the multiaddresses this node is listening on.

**Returns:** Slice of multiaddrs

**Example:**

```go
addrs := node.Addresses()
for _, addr := range addrs {
    fmt.Printf("Listening on: %s\n", addr.String())
}
```

### Peer Connection

#### Connect

```go
func (n *Node) Connect(ctx context.Context, addr string) error
```

Connects to a peer at the given multiaddress.

**Parameters:**
- `ctx` - Context with optional timeout
- `addr` - Full multiaddr including peer ID

**Returns:** error if connection fails

**Example:**

```go
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()

err := node.Connect(ctx, "/ip4/192.168.1.100/tcp/3333/p2p/12D3KooWABC...")
if err != nil {
    log.Printf("Connection failed: %v", err)
}
```

#### Peers

```go
func (n *Node) Peers() []peer.ID
```

Returns the list of connected peer IDs.

**Returns:** Slice of peer IDs

**Example:**

```go
peers := node.Peers()
fmt.Printf("Connected to %d peers\n", len(peers))
for _, p := range peers {
    fmt.Printf("  - %s\n", p.ShortString())
}
```

#### ConnectedPeers

```go
func (n *Node) ConnectedPeers() int
```

Returns the number of connected peers.

**Returns:** Integer count

**Example:**

```go
count := node.ConnectedPeers()
fmt.Printf("Connected peers: %d\n", count)
```

### DHT Support

#### DHT

```go
func (n *Node) DHT() *dht.LibP2PDHT
```

Returns the DHT instance (if enabled).

**Returns:** LibP2PDHT instance or nil

**Example:**

```go
if node.IsDHTEnabled() {
    dht := node.DHT()
    // Use DHT for distributed operations
}
```

#### IsDHTEnabled

```go
func (n *Node) IsDHTEnabled() bool
```

Returns whether DHT is enabled and active.

**Returns:** Boolean

#### Bootstrap

```go
func (n *Node) Bootstrap() error
```

Bootstraps the DHT by connecting to configured bootstrap peers.

**Returns:** error if DHT not enabled or bootstrap fails

**Example:**

```go
if node.IsDHTEnabled() {
    if err := node.Bootstrap(); err != nil {
        log.Printf("Bootstrap failed: %v", err)
    }
}
```

### Lifecycle

#### Close

```go
func (n *Node) Close() error
```

Shuts down the node, closes DHT, and terminates all connections.

**Returns:** error if shutdown fails

**Example:**

```go
defer node.Close()
```
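The lifecycle methods above combine into a typical run loop: create the node, report its identity and addresses, then block until a signal arrives and let the deferred `Close` tear everything down. A minimal sketch, assuming the import path `chorus/p2p` (not confirmed by the sources):

```go
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "os/signal"
    "syscall"

    "chorus/p2p" // assumed import path for the CHORUS p2p package
)

func main() {
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()

    node, err := p2p.NewNode(ctx,
        p2p.WithListenAddresses("/ip4/0.0.0.0/tcp/3333"),
    )
    if err != nil {
        log.Fatal(err)
    }
    defer node.Close()

    fmt.Printf("node %s listening on %v\n", node.ID().ShortString(), node.Addresses())

    // Block until interrupted; the deferred Close shuts down connections
    // and the optional DHT.
    sig := make(chan os.Signal, 1)
    signal.Notify(sig, syscall.SIGINT, syscall.SIGTERM)
    <-sig
    fmt.Printf("shutting down with %d peers connected\n", node.ConnectedPeers())
}
```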
## Background Tasks

### Connection Status Monitoring

The node automatically runs background monitoring every 30 seconds:

```
🐝 Bzzz Node Status - ID: 12D3Koo...abc, Connected Peers: 5
   Connected to: 12D3Koo...def, 12D3Koo...ghi, ...
```

Logs:
- Node peer ID (short form)
- Number of connected peers
- List of connected peer IDs

### Monitoring Implementation

```go
func (n *Node) startBackgroundTasks() {
    ticker := time.NewTicker(30 * time.Second)
    defer ticker.Stop()

    for {
        select {
        case <-n.ctx.Done():
            return
        case <-ticker.C:
            n.logConnectionStatus()
        }
    }
}
```

## Security

### Transport Security

All connections are encrypted using the **Noise Protocol Framework**:

```go
libp2p.Security(noise.ID, noise.New)
```

**Features:**
- Forward secrecy
- Mutual authentication
- Encrypted payloads
- Prevents eavesdropping and tampering

### Connection Limits

Anti-spam and DoS protection:

```go
MaxConnections: 50 // Total connection limit
MaxPeersPerIP:  3  // Limit connections per IP
```

### Rate Limiting

Prevents connection storms:

```go
DialsPerSecond:     5  // Limit outbound dial rate
MaxConcurrentDials: 10 // Limit concurrent dials
MaxConcurrentDHT:   16 // Limit DHT query load
```

### Identity

Each node has a cryptographic identity:

- **Peer ID:** Derived from public key (e.g., `12D3KooW...`)
- **Key Pair:** ED25519 or RSA (managed by libp2p)
- **Authentication:** All connections authenticated

## DHT Integration

### Kademlia DHT

CHORUS uses a Kademlia DHT for distributed peer discovery and content routing.

### DHT Modes

**Client Mode:**
- Queries DHT for peer discovery
- Does not serve DHT records
- Lower resource usage
- Suitable for ephemeral agents

**Server Mode:**
- Queries and serves DHT records
- Contributes to network health
- Higher resource usage
- Suitable for long-running nodes

**Auto Mode:**
- Adapts based on network position
- Detects NAT and chooses client/server
- Recommended for most deployments

### DHT Protocol Prefix

Isolates the CHORUS DHT from other libp2p networks:

```go
DHTProtocolPrefix: "/CHORUS"
```

Results in protocol IDs like:

```
/CHORUS/kad/1.0.0
```

### Bootstrap Process

1. Node connects to bootstrap peers
2. Performs DHT queries to find nearby peers
3. Populates routing table
4. Becomes part of DHT mesh

**Example:**

```go
node, err := p2p.NewNode(ctx,
    p2p.WithDHT(true),
    p2p.WithDHTMode("server"),
    p2p.WithDHTBootstrapPeers([]string{
        "/ip4/192.168.1.100/tcp/3333/p2p/12D3KooWABC...",
    }),
)

if err := node.Bootstrap(); err != nil {
    log.Printf("Bootstrap failed: %v", err)
}
```

## Connection Management

### Watermarks

The connection manager maintains a healthy connection count:

```go
LowWatermark:  32  // Maintain at least 32 connections
HighWatermark: 128 // Trim connections above 128
```

**Behavior:**
- Below low watermark: Actively seek new connections
- Between watermarks: Maintain existing connections
- Above high watermark: Trim least valuable connections

### Connection Trimming

When above the high watermark:

1. Rank connections by value (recent messages, protocols used)
2. Trim lowest-value connections
3. Bring the count back down to the low watermark
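Watermark-based trimming of this kind is normally delegated to libp2p's basic connection manager. The sketch below shows how the documented LowWatermark/HighWatermark values map onto the upstream `connmgr` API when building a raw host; it illustrates the underlying libp2p mechanism, not the exact wiring inside `node.go`.

```go
package main

import (
    "log"
    "time"

    "github.com/libp2p/go-libp2p"
    "github.com/libp2p/go-libp2p/p2p/net/connmgr"
    "github.com/libp2p/go-libp2p/p2p/security/noise"
)

func main() {
    // Keep at least 32 connections, start trimming above 128, and give new
    // connections a grace period before they become eligible for trimming.
    cm, err := connmgr.NewConnManager(
        32,  // low watermark
        128, // high watermark
        connmgr.WithGracePeriod(time.Minute),
    )
    if err != nil {
        log.Fatal(err)
    }

    h, err := libp2p.New(
        libp2p.ListenAddrStrings("/ip4/0.0.0.0/tcp/3333"),
        libp2p.Security(noise.ID, noise.New),
        libp2p.ConnectionManager(cm),
    )
    if err != nil {
        log.Fatal(err)
    }
    defer h.Close()

    log.Printf("host %s up with connection manager attached", h.ID())
}
```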
### Rate Limiting

**Dial Rate Limiting:**

```go
DialsPerSecond:     5  // Max 5 outbound dials per second
MaxConcurrentDials: 10 // Max 10 concurrent outbound dials
```

Prevents:
- Connection storms
- Network congestion
- Resource exhaustion

**DHT Rate Limiting:**

```go
MaxConcurrentDHT: 16 // Max 16 concurrent DHT queries
```

Prevents:
- DHT query storms
- CPU exhaustion
- Network bandwidth saturation

### Join Stagger

Prevents a thundering herd on pubsub topic joins:

```go
JoinStaggerMS: 100 // 100ms delay between topic joins
```

Useful for:
- Large-scale deployments
- Role-based topic joins
- Coordinated restarts

## Usage Examples

### Basic Node

```go
ctx := context.Background()

// Create node with default config
node, err := p2p.NewNode(ctx)
if err != nil {
    log.Fatal(err)
}
defer node.Close()

fmt.Printf("Node ID: %s\n", node.ID().String())
for _, addr := range node.Addresses() {
    fmt.Printf("Listening on: %s\n", addr.String())
}
```

### Custom Configuration

```go
node, err := p2p.NewNode(ctx,
    p2p.WithListenAddresses("/ip4/0.0.0.0/tcp/4444"),
    p2p.WithNetworkID("CHORUS-prod"),
    p2p.WithConnectionManager(50, 200),
    p2p.WithDialRateLimit(10, 20),
)
```

### DHT-Enabled Node

```go
bootstrapPeers := []string{
    "/ip4/192.168.1.100/tcp/3333/p2p/12D3KooWABC...",
    "/ip4/192.168.1.101/tcp/3333/p2p/12D3KooWXYZ...",
}

node, err := p2p.NewNode(ctx,
    p2p.WithDHT(true),
    p2p.WithDHTMode("server"),
    p2p.WithDHTBootstrapPeers(bootstrapPeers),
    p2p.WithDHTProtocolPrefix("/CHORUS"),
)
if err != nil {
    log.Fatal(err)
}

// Bootstrap DHT
if err := node.Bootstrap(); err != nil {
    log.Printf("Bootstrap warning: %v", err)
}
```

### Connecting to Peers

```go
// Connect to specific peer
peerAddr := "/ip4/192.168.1.100/tcp/3333/p2p/12D3KooWABC..."

ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()

if err := node.Connect(ctx, peerAddr); err != nil {
    log.Printf("Failed to connect: %v", err)
} else {
    fmt.Printf("Connected to %d peers\n", node.ConnectedPeers())
}
```

### Integration with PubSub

```go
// Create P2P node
node, err := p2p.NewNode(ctx,
    p2p.WithListenAddresses("/ip4/0.0.0.0/tcp/3333"),
    p2p.WithDHT(true),
)
if err != nil {
    log.Fatal(err)
}
defer node.Close()

// Create PubSub using node's host
ps, err := pubsub.NewPubSub(ctx, node.Host(),
    "CHORUS/coordination/v1",
    "hmmm/meta-discussion/v1")
if err != nil {
    log.Fatal(err)
}
defer ps.Close()

// Now use PubSub for messaging
ps.PublishBzzzMessage(pubsub.TaskAnnouncement, map[string]interface{}{
    "task_id": "task-123",
})
```

### High-Scale Configuration

```go
node, err := p2p.NewNode(ctx,
    // Network
    p2p.WithListenAddresses("/ip4/0.0.0.0/tcp/3333"),
    p2p.WithNetworkID("CHORUS-prod"),

    // Discovery
    p2p.WithDHT(true),
    p2p.WithDHTMode("server"),
    p2p.WithDHTBootstrapPeers(bootstrapPeers),

    // Connection limits
    p2p.WithMaxConnections(500),
    p2p.WithConnectionManager(100, 300),

    // Rate limiting
    p2p.WithDialRateLimit(10, 30),
    p2p.WithDHTRateLimit(32),

    // Anti-thundering herd
    p2p.WithJoinStagger(100),
)
```

## Deployment Patterns

### Docker Swarm Deployment

In Docker Swarm, configure nodes to listen on all interfaces:

```go
p2p.WithListenAddresses("/ip4/0.0.0.0/tcp/3333", "/ip6/::/tcp/3333")
```

**Docker Compose:**

```yaml
services:
  chorus-agent:
    image: anthonyrawlins/chorus:latest
    ports:
      - "3333:3333"
    environment:
      - CHORUS_P2P_PORT=3333
      - CHORUS_DHT_ENABLED=true
```

### Kubernetes Deployment

Use service discovery for bootstrap peers (one possible `getBootstrapPeersFromService` helper is sketched below):

```go
bootstrapPeers := getBootstrapPeersFromService("chorus-agent-headless.default.svc.cluster.local")

node, err := p2p.NewNode(ctx,
    p2p.WithDHT(true),
    p2p.WithDHTBootstrapPeers(bootstrapPeers),
)
```
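`getBootstrapPeersFromService` is not part of the `p2p` package; how it obtains addresses is deployment-specific. One possible implementation, shown as a hedged sketch, resolves the headless service to pod IPs and formats transport multiaddrs. The port (3333) is an assumption, and the `/p2p/<peer-id>` suffix still has to be appended from known identities, since DNS cannot supply peer IDs.

```go
package discovery

import (
    "fmt"
    "net"
)

// getBootstrapPeersFromService resolves a Kubernetes headless service to pod
// IPs and formats them as transport multiaddrs. Sketch only: the port is
// assumed, and peer IDs must be appended separately (e.g., from config or a
// peer exchange) because DNS does not expose them.
func getBootstrapPeersFromService(service string) []string {
    ips, err := net.LookupIP(service)
    if err != nil {
        return nil // nothing resolvable yet; the caller may retry
    }

    peers := make([]string, 0, len(ips))
    for _, ip := range ips {
        if ip4 := ip.To4(); ip4 != nil {
            peers = append(peers, fmt.Sprintf("/ip4/%s/tcp/3333", ip4))
        }
    }
    return peers
}
```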
### Local Development

Disable DHT for faster startup:

```go
node, err := p2p.NewNode(ctx,
    p2p.WithListenAddresses("/ip4/127.0.0.1/tcp/3333"),
    p2p.WithDHT(false),
)
```

### Behind NAT

Use relay and DHT client mode:

```go
node, err := p2p.NewNode(ctx,
    p2p.WithDHT(true),
    p2p.WithDHTMode("client"),
    p2p.WithDHTBootstrapPeers(publicBootstrapPeers),
)
```

## Best Practices

### Network Configuration

1. **Production:** Use server DHT mode on stable nodes
2. **Ephemeral Agents:** Use client DHT mode for short-lived agents
3. **NAT Traversal:** Enable relay and use public bootstrap peers
4. **Local Testing:** Disable DHT for faster development

### Connection Management

1. **Set Appropriate Watermarks:**
   - Small deployments: 10-50 connections
   - Medium deployments: 50-200 connections
   - Large deployments: 200-500 connections

2. **Rate Limiting:**
   - Prevent connection storms during restarts
   - Set MaxPeersPerIP=3 to prevent single-peer spam
   - Use join stagger for coordinated deployments

3. **Bootstrap Peers:**
   - Use 3-5 reliable bootstrap peers
   - Distribute bootstrap peers across network
   - Use stable, long-running nodes as bootstrap

### Security

1. **Always Enable Security:**
   - Use Noise protocol in production
   - Never disable security except for local testing

2. **Connection Limits:**
   - Set MaxConnections based on resources
   - Set MaxPeersPerIP=3 to prevent IP-based attacks
   - Monitor connection counts

3. **Peer Validation:**
   - Validate peer behavior
   - Implement reputation systems
   - Disconnect misbehaving peers

### Monitoring

1. **Log Connection Status:**
   - Monitor ConnectedPeers() periodically
   - Alert on low peer counts
   - Track peer churn rate

2. **DHT Health:**
   - Monitor DHT routing table size
   - Track DHT query success rates
   - Alert on bootstrap failures

3. **Resource Usage:**
   - Monitor bandwidth consumption
   - Track CPU usage (DHT queries)
   - Monitor memory (connection state)
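A minimal peer-count watcher along the lines of the monitoring recommendations above, using only the documented `ConnectedPeers()` accessor. The threshold, interval, and import path are illustrative assumptions.

```go
package monitoring

import (
    "context"
    "log"
    "time"

    "chorus/p2p" // assumed import path for the CHORUS p2p package
)

// watchPeerCount logs a warning whenever the peer count drops below minPeers.
// Sketch only: the alert mechanism (a log line) is a placeholder for whatever
// alerting the deployment uses.
func watchPeerCount(ctx context.Context, node *p2p.Node, minPeers int, interval time.Duration) {
    ticker := time.NewTicker(interval)
    defer ticker.Stop()

    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            if count := node.ConnectedPeers(); count < minPeers {
                log.Printf("WARNING: only %d peers connected (want >= %d)", count, minPeers)
            }
        }
    }
}
```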
## Troubleshooting

### Connection Issues

**Problem:** No peers connecting

**Solutions:**
- Check firewall rules (port 3333)
- Verify listen addresses are correct
- Check bootstrap peer addresses
- Enable DHT for discovery
- Verify network connectivity

**Problem:** Connection storms

**Solutions:**
- Enable dial rate limiting
- Use join stagger
- Check MaxConcurrentDials
- Reduce DialsPerSecond

### DHT Issues

**Problem:** DHT bootstrap fails

**Solutions:**
- Verify bootstrap peer addresses
- Check network connectivity
- Use DHT client mode if behind NAT
- Increase bootstrap peer count

**Problem:** DHT queries slow

**Solutions:**
- Check MaxConcurrentDHT limit
- Monitor network latency
- Use closer bootstrap peers
- Consider server DHT mode

### Performance Issues

**Problem:** High CPU usage

**Solutions:**
- Reduce MaxConnections
- Lower MaxConcurrentDHT
- Check for message storms
- Use client DHT mode

**Problem:** High bandwidth usage

**Solutions:**
- Reduce connection watermarks
- Lower message validation rate
- Check for message spam
- Monitor pubsub traffic

## Related Documentation

- **PubSub Package:** `/home/tony/chorus/project-queues/active/CHORUS/docs/comprehensive/packages/pubsub.md` - Messaging layer
- **DHT Package:** `/home/tony/chorus/project-queues/active/CHORUS/docs/comprehensive/packages/dht.md` - Distributed storage
- **CHORUS Agent:** `/home/tony/chorus/project-queues/active/CHORUS/docs/comprehensive/commands/chorus-agent.md` - Agent runtime

## Implementation Details

### libp2p Stack

```
Application Layer (PubSub, DHT, Protocols)
        |
Host Interface (peer.ID, multiaddr, connections)
        |
Transport Security (Noise Protocol)
        |
Stream Multiplexing (yamux/mplex)
        |
Transport Layer (TCP)
        |
Operating System Network Stack
```

### Peer ID Format

```
12D3KooWABCDEF1234567890...  - Base58-encoded multihash
        |
        └── Derived from public key (ED25519 or RSA)
```

### Connection Lifecycle

1. **Dial:** Initiate connection to peer multiaddr
2. **Security Handshake:** Noise protocol handshake
3. **Multiplexer Negotiation:** Choose yamux or mplex
4. **Protocol Negotiation:** Exchange supported protocols
5. **Connected:** Connection established, protocols available
6. **Disconnected:** Connection closed, cleanup state

### Error Handling

- Network errors logged but not fatal
- Connection failures retry with backoff
- DHT errors logged and continue
- Invalid multiaddrs fail immediately

## Source Files

- `/home/tony/chorus/project-queues/active/CHORUS/p2p/node.go` - Main implementation (202 lines)
- `/home/tony/chorus/project-queues/active/CHORUS/p2p/config.go` - Configuration (209 lines)

## Performance Characteristics

### Connection Overhead

- Memory per connection: ~50KB
- CPU: ~1% per 100 connections
- Bandwidth per connection: ~1-10 KB/s idle

### Scaling

- **Small:** 10-50 connections (single node testing)
- **Medium:** 50-200 connections (cluster deployments)
- **Large:** 200-500 connections (production clusters)
- **Enterprise:** 500-1000 connections (dedicated infrastructure)

### DHT Performance

- **Bootstrap:** 1-5 seconds (depends on network)
- **Query Latency:** 100-500ms (depends on proximity)
- **Routing Table:** 20-200 entries (typical)
- **DHT Memory:** ~1MB per 100 routing table entries
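The connection lifecycle described under Implementation Details can be observed at runtime by registering a notifiee on the underlying host returned by `node.Host()`. This sketch uses the upstream libp2p `network` package; the `p2p` package itself does not expose such a hook.

```go
package p2putil

import (
    "log"

    "github.com/libp2p/go-libp2p/core/host"
    "github.com/libp2p/go-libp2p/core/network"
)

// registerConnectionLogger logs connect/disconnect events on any libp2p host,
// e.g. the one returned by node.Host(). Sketch only.
func registerConnectionLogger(h host.Host) {
    h.Network().Notify(&network.NotifyBundle{
        ConnectedF: func(_ network.Network, c network.Conn) {
            log.Printf("connected:    %s via %s", c.RemotePeer(), c.RemoteMultiaddr())
        },
        DisconnectedF: func(_ network.Network, c network.Conn) {
            log.Printf("disconnected: %s", c.RemotePeer())
        },
    })
}
```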