**MESH PROTOCOL**

*Specification Suite --- Addendum*

SPEC-004: Tier 2 DAG-BFT | SPEC-005: Tier 3 aBFT | SPEC-009: Network Transport

Version 0.1.0-draft | March 2026
**SPEC-004: Tier 2 --- DAG-BFT Ordered Settlement**

**4.1 Purpose**
This specification defines the consensus protocol for settlements that require global transaction ordering. Tier 2 handles the minority of transactions where multiple parties contend over shared state: multi-party atomic swaps, conditional settlements, cross-currency batch auctions, and escrow operations with multiple possible outcomes.

Tier 2 is a clean-room DAG-based Byzantine Fault Tolerant consensus protocol synthesised from the published academic literature on uncertified DAG BFT. It achieves 3-message-delay commit latency in the common case and provides censorship resistance through its leaderless, all-to-all communication pattern.
**4.2 Design Rationale**

**Why DAG-based?** Traditional leader-based BFT protocols (PBFT, HotStuff) bottleneck on a single leader for each round. DAG-based protocols allow every validator to propose blocks simultaneously, achieving higher throughput. The DAG structure also provides natural censorship resistance: a Byzantine leader in a traditional protocol can selectively exclude transactions, but in a DAG every validator independently includes transactions.
**Why uncertified?** Certified DAG protocols (DAGRider, Narwhal-Tusk, Bullshark) spend two additional message delays per round to certify each block (broadcast → acknowledge → aggregate certificate). Uncertified DAGs skip certification entirely: a block is simply broadcast and referenced by later blocks. Equivocation is handled by the commit rule rather than by pre-certification. This reduces round latency from 3 message delays to 1.
**Why not use existing implementations?** Mysticeti (Mysten Labs), Shoal++ (Aptos/Diem heritage), and Mahi-Mahi are all implemented by well-funded companies that may hold process patents on specific optimisations. MESH's Tier 2 protocol is designed from the published academic principles, avoiding any specific patented technique while achieving comparable performance characteristics.
**4.3 DAG Structure**

**4.3.1 DAG Block**

```rust
struct DAGBlock {
    author: PublicKey,            // Proposing validator
    round: u64,                   // Logical round number
    references: Vec<Hash>,        // Hashes of blocks from round-1 (>= 2f+1 required)
    weak_refs: Vec<Hash>,         // Hashes of blocks from earlier rounds (optional)
    settlements: Vec<Transition>, // Ordered list of Tier 2 settlements
    timestamp: u64,               // Validator's local time (advisory only, not trusted)
    signature: Signature,         // Author's signature over all above fields
}
```
**4.3.2 Round Progression**

The DAG progresses in logical rounds. In each round, every correct validator proposes exactly one block. A validator MAY advance to round r+1 only after receiving blocks from at least 2f+1 distinct validators in round r. This ensures that each round's blocks collectively reference a quorum of the previous round, preventing Byzantine validators from advancing the DAG unilaterally.

Formally, let B(v, r) denote the block proposed by validator v in round r. The reference set of B(v, r) MUST satisfy:

- |references| ≥ 2f+1, where each reference is a hash of a block from round r-1.

- No two references point to blocks by the same author (prevents referencing equivocating blocks).

- The validator MUST have received the full content of each referenced block (not just the hash).
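A minimal sketch of this reference check, assuming simplified stand-in types (integer author ids, a closure as the local block store); it is illustrative, not normative:

```rust
use std::collections::HashSet;

#[derive(Clone)]
struct DagBlock {
    author: u32,               // validator id (stands in for PublicKey)
    round: u64,
    references: Vec<[u8; 32]>, // hashes of round r-1 blocks
}

/// §4.3.2 check: >= 2f+1 references, all from round r-1, no two sharing an
/// author, and full content locally available via `resolve`.
fn references_valid(
    block: &DagBlock,
    f: usize,
    resolve: impl Fn(&[u8; 32]) -> Option<DagBlock>,
) -> bool {
    if block.references.len() < 2 * f + 1 {
        return false;
    }
    let mut authors = HashSet::new();
    for h in &block.references {
        match resolve(h) {
            Some(parent) if parent.round + 1 == block.round => {
                if !authors.insert(parent.author) {
                    return false; // two references share an author
                }
            }
            _ => return false, // content missing or wrong round
        }
    }
    true
}

fn main() {
    // Local store with three round-1 parents; f = 1, so 2f+1 = 3 references.
    let parents: Vec<DagBlock> = (0..3u32)
        .map(|i| DagBlock { author: i, round: 1, references: vec![] })
        .collect();
    let resolve = |h: &[u8; 32]| parents.get(h[0] as usize).cloned();

    let refs: Vec<[u8; 32]> = (0..3u8).map(|i| [i; 32]).collect();
    let block = DagBlock { author: 9, round: 2, references: refs.clone() };
    assert!(references_valid(&block, 1, &resolve));

    // Too few references fails the quorum requirement.
    let thin = DagBlock { author: 9, round: 2, references: refs[..2].to_vec() };
    assert!(!references_valid(&thin, 1, &resolve));
}
```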
**4.3.3 Equivocation in the DAG**

A Byzantine validator may attempt to propose two different blocks for the same round (equivocation). In an uncertified DAG, this is not prevented at the block level. Instead, the commit rule handles equivocation:

1. If a correct validator v receives two blocks B₁ and B₂ from the same author a for the same round r, v records both as an equivocation and MUST NOT reference either in its own blocks.

2. The equivocating validator's blocks are excluded from the commit rule for round r. This means the equivocator cannot benefit from equivocation.

3. Equivocation evidence (both blocks) is included in v's next block as a protocol message, alerting all validators.
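The bookkeeping in step 1 can be sketched as follows, under simplified assumptions (integer validator ids; only block hashes are tracked rather than full blocks):

```rust
use std::collections::HashMap;

struct EquivocationTracker {
    seen: HashMap<(u32, u64), [u8; 32]>, // (author, round) -> first block hash seen
}

impl EquivocationTracker {
    fn new() -> Self {
        Self { seen: HashMap::new() }
    }

    /// Returns the evidence pair (first_hash, second_hash) when `author` is
    /// caught proposing two distinct blocks for the same round.
    fn observe(
        &mut self,
        author: u32,
        round: u64,
        hash: [u8; 32],
    ) -> Option<([u8; 32], [u8; 32])> {
        match self.seen.get(&(author, round)) {
            None => {
                self.seen.insert((author, round), hash);
                None
            }
            Some(&first) if first == hash => None, // same block re-delivered
            Some(&first) => Some((first, hash)),   // equivocation evidence
        }
    }
}

fn main() {
    let mut t = EquivocationTracker::new();
    assert!(t.observe(7, 4, [0xaa; 32]).is_none()); // first block: fine
    assert!(t.observe(7, 4, [0xaa; 32]).is_none()); // re-delivery: fine
    let evidence = t.observe(7, 4, [0xbb; 32]);     // second distinct block
    assert_eq!(evidence, Some(([0xaa; 32], [0xbb; 32])));
}
```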
**4.4 Commit Rule**

The commit rule determines when a block in the DAG is considered final. This is the most critical component of the consensus protocol. MESH uses a multi-leader commit rule that can commit multiple blocks per round, maximising throughput.
**4.4.1 Leader Selection**

Each round has a predetermined set of leader validators. The leader for round r is determined by a rotating schedule:

leader(r) = validators[r mod n]

where validators is the ordered list of validator public keys in the current epoch. This is deterministic and known to all validators before the round begins. A round's leader block is the block proposed by the leader validator for that round.
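The schedule above can be sketched directly, with validators reduced to indices into the epoch's ordered list:

```rust
/// Round-robin leader schedule from §4.4.1: leader(r) = validators[r mod n].
fn leader_index(round: u64, n: usize) -> usize {
    (round % n as u64) as usize
}

fn main() {
    let n = 21;
    // Over n consecutive rounds, every validator leads exactly once.
    let mut led = vec![false; n];
    for r in 0..n as u64 {
        led[leader_index(r, n)] = true;
    }
    assert!(led.iter().all(|&x| x));
}
```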
**4.4.2 Direct Commit (Fast Path)**

A leader block B(leader, r) is directly committed if the following conditions hold:

1. **Support condition:** At least 2f+1 blocks in round r+1 reference B(leader, r) in their references set.

2. **No equivocation:** The leader validator did not equivocate in round r (no second block exists).

3. **Availability:** The full content of B(leader, r) is available to the committing validator.

When a validator observes these conditions (which it can determine locally after receiving 2f+1 blocks from round r+1), it commits B(leader, r) and all transactions within it. This achieves commit latency of 3 message delays from when the leader's block is first broadcast: 1 delay for the block to propagate, 1 delay for round r+1 blocks to be proposed, and 1 delay for the round r+1 blocks to propagate back.
**4.4.3 Indirect Commit (Anchor Chain)**

Blocks that are not leader blocks are committed indirectly through the causal history of committed leader blocks. When a leader block B(leader, r) is committed, all blocks in its causal history that have not yet been committed are also committed, in causal order.

Formally, when B(leader, r) is committed, the following set is committed:

commit_set(B) = { B' : B' is in the causal history of B and B' is not yet committed }

The causal history of B is the transitive closure of B's references. This means a single leader commit can commit hundreds of blocks from earlier rounds, amortising the commit latency across many transactions.
**4.4.4 Skip Rule (Liveness)**

If a leader's block for round r does not achieve the direct commit condition (e.g., the leader is Byzantine and equivocated, or the leader's block was not received by enough validators), the protocol proceeds to round r+1 with a new leader. The skipped leader's block (if it exists and is non-equivocating) will be committed indirectly when a subsequent leader commits.

To ensure liveness under Byzantine leaders, each correct validator sets a timeout T_round for receiving the current leader's block. If the timeout expires, the validator advances to the next round without referencing the missing leader block. The timeout is calibrated as:

T_round = max(4 × median_RTT, 500ms)

where median_RTT is the median observed round-trip time to other validators. This ensures that under normal network conditions the timeout does not fire (avoiding unnecessary leader skips), but under sustained leader failure, progress continues within at most T_round.
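This calibration can be sketched directly; how the RTT samples are gathered is left open here:

```rust
use std::time::Duration;

/// Timeout calibration from §4.4.4: T_round = max(4 × median_RTT, 500 ms).
fn round_timeout(mut rtt_samples_ms: Vec<u64>) -> Duration {
    assert!(!rtt_samples_ms.is_empty());
    rtt_samples_ms.sort_unstable();
    let median = rtt_samples_ms[rtt_samples_ms.len() / 2]; // upper median
    Duration::from_millis((4 * median).max(500))
}

fn main() {
    // Low-latency cluster: the 500 ms floor dominates (4 × 50 = 200 ms).
    assert_eq!(round_timeout(vec![40, 50, 60]), Duration::from_millis(500));
    // Intercontinental links: 4 × median dominates (4 × 200 = 800 ms).
    assert_eq!(round_timeout(vec![150, 200, 250]), Duration::from_millis(800));
}
```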
**4.5 Transaction Ordering**

Once a set of blocks is committed, the transactions within them must be deterministically ordered. All validators MUST produce the identical ordering for the same set of committed blocks.
**4.5.1 Ordering Algorithm**

1. Start with the set of blocks to be ordered (the commit_set from §4.4.3).

2. Topologically sort the blocks by their causal relationships (if B₁ is referenced by B₂, B₁ comes first).

3. Break ties (blocks at the same causal level with no ordering between them) by sorting on block hash (SHA3-256 of the serialised block), lexicographically ascending.

4. Concatenate the transaction lists from each block in the determined order.

5. Remove duplicate transactions (same transition hash appearing in multiple blocks). The first occurrence is kept; later occurrences are discarded.

This produces a deterministic, total order over all transactions in the committed set. The tie-breaking by hash ensures fairness: no validator can predict which block will come first (because the hash depends on the full block content), preventing ordering manipulation.
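A simplified sketch of steps 2-5, assuming the round number stands in for the causal level (a full implementation would topologically sort on the actual reference graph):

```rust
use std::collections::HashSet;

struct Committed {
    hash: [u8; 32],     // SHA3-256 of the serialised block
    round: u64,         // stands in for the block's causal level
    txs: Vec<[u8; 32]>, // transition hashes, in block order
}

/// Deterministic ordering: sort by (causal level, hash), concatenate, then
/// keep only the first occurrence of each transition hash.
fn order_transactions(mut blocks: Vec<Committed>) -> Vec<[u8; 32]> {
    blocks.sort_by(|a, b| (a.round, a.hash).cmp(&(b.round, b.hash)));
    let mut seen = HashSet::new();
    let mut ordered = Vec::new();
    for block in blocks {
        for tx in block.txs {
            if seen.insert(tx) {
                ordered.push(tx); // later duplicates are discarded
            }
        }
    }
    ordered
}

fn main() {
    let blocks = vec![
        Committed { hash: [2; 32], round: 1, txs: vec![[10; 32], [11; 32]] },
        Committed { hash: [1; 32], round: 1, txs: vec![[11; 32], [12; 32]] },
        Committed { hash: [9; 32], round: 0, txs: vec![[13; 32]] },
    ];
    let ordered = order_transactions(blocks);
    // Round 0 first; within round 1 the lower hash wins the tie-break;
    // the duplicate transition [11; 32] is kept only at its first occurrence.
    assert_eq!(ordered, vec![[13; 32], [11; 32], [12; 32], [10; 32]]);
}
```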
**4.6 Integration with Tier 1**

Tier 1 and Tier 2 operate concurrently on the same validator set. The interaction rules are:

- **Tier 1 certificates are included in DAG blocks:** Validators include recently received Tier 1 settlement certificates in their DAG blocks. This does not re-order the Tier 1 settlements (they are already final), but it checkpoints them into the ordered history for auditability and epoch management.

- **Tier 2 settlements may depend on Tier 1 certificates:** A Tier 2 transition's causal_deps field may reference Tier 1 settlement certificate hashes. This enables composable operations: e.g., a multi-party escrow that depends on a prior simple payment.

- **Conflict between tiers:** If a Tier 2 settlement conflicts with a Tier 1 settlement on the same account (both spending the same balance), the Tier 1 settlement takes priority because it was certified first. The conflicting Tier 2 settlement is rejected with ERR_BALANCE_SPENT.
**4.7 Timing Analysis**

| **Step** | **Duration** | **Cumulative** |
|----------|--------------|----------------|
| Leader broadcasts block (round r) | RTT/2 (≈50ms) | 50ms |
| Validators receive block, include in round r+1 refs | ≈10ms processing | 60ms |
| Validators broadcast round r+1 blocks | RTT/2 (≈50ms) | 110ms |
| Committing validator receives 2f+1 round r+1 blocks | RTT/2 (≈50ms) | 160ms |
| Commit rule evaluation | <1ms | 161ms |
| Transaction execution | ≈10ms per batch | 171ms |

Total commit latency: approximately 170ms in the common case (non-faulty leader, synchronous network). This is 3 message delays, matching the theoretical lower bound for BFT consensus. Under a Byzantine leader or network delay, fallback adds one round per skipped leader, plus the T_round timeout.

Throughput: each round commits all transactions from all validators' blocks, not just the leader's. With n=21 validators each proposing blocks containing up to 10,000 transactions, each round commits up to 210,000 transactions. At one round per ~170ms, steady-state throughput exceeds 1,000,000 TPS (theoretical). Practical throughput is bounded by bandwidth and execution, expected to be 100,000-300,000 TPS.
**4.8 Formal Safety and Liveness Properties**

> **SAFETY (Tier 2):** If two correct validators commit transaction sequences S₁ and S₂, then one is a prefix of the other. That is, correct validators never disagree on the ordering of committed transactions. This holds as long as f < n/3 validators are Byzantine, under the partial synchrony model.
>
> **LIVENESS (Tier 2):** After GST, every valid Tier 2 settlement submitted to a correct validator is committed within O(f × T_round) time. In the common case (non-faulty leader), commit occurs within 3Δ. Under sustained Byzantine leaders, at most f consecutive leaders can be skipped before a correct leader is reached.
>
> **CENSORSHIP RESISTANCE:** A Byzantine leader cannot prevent a transaction from being committed. Even if a leader excludes a transaction from its block, any of the other n-1 validators can include it in their block, and it will be committed when the next leader's block is committed (via indirect commit).
**SPEC-005: Tier 3 --- Asynchronous BFT Fallback**

**5.1 Purpose**

This specification defines the fallback consensus protocol that activates when network conditions degrade below what Tier 2's partial synchrony assumption requires. Tier 3 provides liveness guarantees under full asynchrony: it makes no assumptions about message delivery times, and guarantees that transactions will eventually be committed even if an adversary controls the network scheduler.

Tier 3 is the safety net. It is slower than Tier 2 (expected 1-3 seconds latency) but guarantees the system never halts. In a well-functioning network, Tier 3 is never activated. It exists to handle scenarios such as submarine cable cuts, major cloud provider outages, targeted network partitions, or sustained DDoS attacks against specific validators.
**5.2 Activation Conditions**

Tier 3 activates automatically when Tier 2 fails to make progress. Specifically:

1. **Timeout detection:** If a validator has not observed a new Tier 2 leader commit for 3 × T_round consecutive rounds (approximately 1.5 seconds at default settings), it enters Tier 3 mode.

2. **Quorum detection:** If a validator receives Tier 3 activation messages from f+1 distinct validators, it enters Tier 3 mode regardless of its own timeout state. This prevents a split-brain where some validators are in Tier 2 and others in Tier 3.

3. **Manual activation:** A governance action (SPEC-010) can force Tier 3 activation for the entire network. This is for emergency scenarios.

Tier 3 deactivates when the Tier 3 protocol commits a special RESUME block, after which validators return to Tier 2. The RESUME block is proposed when a validator has observed 2f+1 successful message exchanges within Δ time (evidence that partial synchrony has been restored).
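The first two triggers can be sketched as follows, with validator identities reduced to integers and the governance path omitted:

```rust
use std::collections::HashSet;

struct Tier3Monitor {
    activations_seen: HashSet<u32>, // distinct validators that sent Tier3Activation
    f: usize,
}

impl Tier3Monitor {
    fn on_activation_msg(&mut self, from: u32) {
        self.activations_seen.insert(from);
    }

    /// Enter Tier 3 on a local 3 × T_round timeout, or once f+1 distinct
    /// validators report activation (preventing Tier 2 / Tier 3 split-brain).
    fn should_enter_tier3(&self, local_timeout_fired: bool) -> bool {
        local_timeout_fired || self.activations_seen.len() >= self.f + 1
    }
}

fn main() {
    let mut m = Tier3Monitor { activations_seen: HashSet::new(), f: 7 };
    assert!(!m.should_enter_tier3(false));
    for v in 0..8 {
        m.on_activation_msg(v); // f+1 = 8 distinct activation messages
    }
    assert!(m.should_enter_tier3(false));
    assert!(m.should_enter_tier3(true)); // local timeout alone also suffices
}
```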
**5.3 Protocol Design**

MESH's Tier 3 is an asynchronous DAG-based BFT protocol. It reuses the same DAG structure as Tier 2 (same block format, same reference rules) but replaces the deterministic commit rule with a randomised commit rule that works under full asynchrony.
**5.3.1 Random Coin**

The fundamental difference between partially synchronous and asynchronous BFT is the need for a random coin. Fischer, Lynch, and Paterson (1985) proved that deterministic asynchronous consensus is impossible with even one faulty process. The random coin breaks this impossibility.

MESH uses a threshold random coin constructed from threshold BLS signatures. In each round:

1. Each validator computes a partial coin share: shareᵢ = BLS_Sign(skᵢ, round_number).

2. Once 2f+1 shares are collected, the coin is reconstructed: coin(r) = least_significant_bit(BLS_Aggregate(shares)).

3. The coin is unpredictable: no coalition of fewer than f+1 validators can predict the coin's value before it is revealed.

The BLS threshold signature scheme is used solely for the random coin in Tier 3. It is not used in Tier 1 or Tier 2. This isolates the (slightly more complex) BLS dependency to the fallback path.

> **IMPLEMENTATION NOTE:** The BLS signature scheme uses the BLS12-381 curve. Implementations MUST use a pairing-friendly curve library that has been audited for correctness. The threshold DKG for BLS keys is performed during epoch setup alongside the Ed25519 DKG (SPEC-010).
**5.3.2 Commit Rule (Asynchronous)**

The asynchronous commit rule operates in waves. Each wave spans two rounds of the DAG (an even round and the following odd round).

1. **Vote round (even):** In round 2k, each validator proposes a block as normal. The leader for wave k is determined by leader(k) = validators[coin(2k) mod n], using the random coin revealed only after round 2k's blocks have been proposed. Because the leader is determined after the round's blocks are already in flight, a Byzantine adversary cannot selectively withhold the leader's block.

2. **Decide round (odd):** In round 2k+1, validators check whether the wave-k leader's block from round 2k is referenced by 2f+1 blocks in round 2k+1. If yes, the leader block is committed (along with its causal history). If no, the wave is skipped and the protocol proceeds to wave k+1.

The key insight: because the leader is chosen randomly after blocks are proposed, a Byzantine adversary cannot know which block to suppress until it is too late. With probability at least 2/3 (when the coin selects a correct validator), the leader's block will have been received and referenced by a quorum, and the wave commits.
**5.3.3 Expected Latency**

Under full asynchrony, the expected number of waves until a commit is:

E[waves] = 1 / P(correct_leader_and_referenced) = 1 / (2/3 × 2/3) ≈ 2.25 waves

Each wave is 2 rounds. Each round is bounded by the network's actual message delivery time (not a pre-set timeout). In practice, even under adversarial scheduling, expected commit latency is 4-6 round-trips, approximately 1-3 seconds on continental networks.

This is significantly slower than Tier 2's 170ms but is acceptable because Tier 3 only activates during exceptional network conditions. The important property is that it always terminates --- there is no scenario under the threat model where Tier 3 fails to make progress.
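The arithmetic behind this estimate is the mean of a geometric distribution: if each wave independently commits with probability p, the expected number of waves until the first commit is 1/p.

```rust
/// Expected waves until a Tier 3 commit (§5.3.3): a wave commits when the
/// coin picks a correct leader AND that leader's block was quorum-referenced.
fn expected_waves(p_correct_leader: f64, p_referenced: f64) -> f64 {
    1.0 / (p_correct_leader * p_referenced)
}

fn main() {
    // Coin picks a correct validator w.p. >= 2/3; its block is referenced w.p. ~2/3.
    let e = expected_waves(2.0 / 3.0, 2.0 / 3.0);
    assert!((e - 2.25).abs() < 1e-9);
}
```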
**5.4 Data Structures**

**5.4.1 Tier 3 Activation Message**

```rust
struct Tier3Activation {
    validator: PublicKey,
    last_tier2_commit: Hash,  // Hash of last committed Tier 2 leader block
    round: u64,               // Current DAG round at activation
    reason: ActivationReason, // enum: Timeout, QuorumDetected, Manual
    signature: Signature,
}
```

**5.4.2 Coin Share**

```rust
struct CoinShare {
    validator: PublicKey,
    round: u64,
    share: BLSSignature, // 96 bytes (BLS12-381 G2 point)
}
```
**5.5 Integration with Tier 2**

Tier 3 operates on the same DAG as Tier 2. When Tier 3 activates:

1. The DAG continues to grow --- blocks are still proposed and referenced.

2. The Tier 2 commit rule is suspended (no deterministic leader commits).

3. The Tier 3 commit rule is applied instead (random leader via coin).

4. Transactions committed by Tier 3 are added to the same ordered history as Tier 2 commits.

5. When Tier 3 deactivates (RESUME block), the Tier 2 commit rule resumes from the current round.

The transition is seamless: clients do not need to know which tier committed their transaction. The settlement certificate format is identical. The only visible difference is higher latency during the Tier 3 period.
**5.6 Formal Properties**

> **SAFETY (Tier 3):** If two correct validators commit transaction sequences S₁ and S₂ during Tier 3 operation, one is a prefix of the other. This holds under full asynchrony with f < n/3 Byzantine validators. No timing assumptions are required for safety.
>
> **LIVENESS (Tier 3):** Every valid settlement submitted to a correct validator during Tier 3 operation is committed with probability 1 (almost surely). The expected number of waves is at most (n/(n−f))². With n=21 and f=7, this is (3/2)² = 2.25 waves, matching §5.3.3.
>
> **AGREEMENT (Coin):** The threshold random coin satisfies agreement (all correct validators observe the same coin value), unpredictability (no coalition of ≤ f validators can predict the coin before 2f+1 shares are revealed), and termination (the coin is always produced if 2f+1 validators are correct).
**SPEC-009: Network Transport and Peer Discovery**

**9.1 Purpose**

This specification defines the wire protocol, message framing, connection management, peer discovery, and network-level security mechanisms for MESH. All communication between validators, and between clients and validators, uses the transport layer defined here.
**9.2 Transport Protocol**

MESH uses QUIC (RFC 9000) as its transport protocol. QUIC is chosen over TCP for the following reasons:

- **Multiplexed streams:** QUIC supports multiple independent streams within a single connection, eliminating head-of-line blocking. This allows Tier 1 vote requests, Tier 2 DAG blocks, and Tier 3 coin shares to flow independently without interfering with each other.

- **Built-in encryption:** QUIC mandates TLS 1.3 for all connections. There is no unencrypted mode. This simplifies the security model.

- **Connection migration:** QUIC connections survive IP address changes, which is important for mobile clients and cloud-based validators that may be rescheduled.

- **0-RTT resumption:** Returning clients can send data in the first packet, reducing latency for repeat connections.

Implementations MUST use QUIC with TLS 1.3 and MUST NOT fall back to TCP+TLS. The TLS handshake MUST use the cipher suites specified in SPEC-006 for the key exchange component.
**9.3 Connection Establishment**

**9.3.1 Handshake**

When a node connects to another node, the following handshake occurs within the TLS 1.3 handshake:

1. The initiator sends a ClientHello with ALPN (Application-Layer Protocol Negotiation) identifier "mesh/0".

2. The responder validates the ALPN and responds with ServerHello.

3. TLS 1.3 key exchange completes (using X25519 or hybrid X25519+ML-KEM-768 per SPEC-006).

4. The initiator sends a MESH HandshakeMessage containing: protocol version, supported cipher suites (ordered by preference), node type (validator/client/connector), node public key, and a signature proving ownership of the public key.

5. The responder validates the HandshakeMessage and responds with its own HandshakeMessage.

6. Both sides verify: (a) the remote node's public key is valid, (b) if the remote claims to be a validator, its public key is in the current epoch's validator set, (c) the signature is valid.

If any check fails, the connection MUST be closed with an error code. No protocol messages are exchanged before the handshake completes.
**9.3.2 HandshakeMessage Structure**

```rust
struct HandshakeMessage {
    version: u8,            // Protocol major version
    cipher_suites: Vec<u8>, // Supported cipher suite IDs, preference order
    node_type: NodeType,    // enum: Validator, Client, Connector
    public_key: PublicKey,  // Node's identity key
    epoch: u64,             // Node's current epoch number
    features: u64,          // Bitmask of supported optional features
    timestamp: u64,         // Current time (used to reject stale handshakes)
    signature: Signature,   // Sign(identity_key, all_above_fields)
}
```
**9.4 Message Framing**

All MESH protocol messages are encoded as length-prefixed frames on a QUIC stream:

[length: u32] [message_type: u8] [payload: bytes]

The length field includes the message_type byte and the payload. Maximum frame size is 4 MB (4,194,304 bytes). Messages larger than this MUST be fragmented at the application layer.
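A sketch of this framing in code. Note the spec text does not fix the byte order of the length field; big-endian is an assumption here:

```rust
/// Length-prefixed framing from §9.4: [length: u32][message_type: u8][payload].
/// The length field covers the type byte plus the payload.
const MAX_FRAME: usize = 4 * 1024 * 1024; // 4,194,304 bytes

fn encode_frame(msg_type: u8, payload: &[u8]) -> Option<Vec<u8>> {
    let len = 1 + payload.len(); // type byte + payload
    if len > MAX_FRAME {
        return None; // larger messages must be fragmented at the app layer
    }
    let mut out = Vec::with_capacity(4 + len);
    out.extend_from_slice(&(len as u32).to_be_bytes());
    out.push(msg_type);
    out.extend_from_slice(payload);
    Some(out)
}

fn decode_frame(buf: &[u8]) -> Option<(u8, &[u8])> {
    if buf.len() < 5 {
        return None; // need the length prefix plus the type byte
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    if len == 0 || len > MAX_FRAME || buf.len() < 4 + len {
        return None;
    }
    Some((buf[4], &buf[5..4 + len]))
}

fn main() {
    // A PING (type 0x41, empty payload) frames to exactly 5 bytes, as §9.9 states.
    let ping = encode_frame(0x41, &[]).unwrap();
    assert_eq!(ping.len(), 5);
    let (msg_type, payload) = decode_frame(&ping).unwrap();
    assert_eq!(msg_type, 0x41);
    assert!(payload.is_empty());
}
```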
**9.4.1 Message Types**

| **Type ID** | **Name** | **Description** |
|-------------|----------|-----------------|
| 0x01 | VOTE_REQUEST | Tier 1: Client → Validator. Contains a Transition. |
| 0x02 | VOTE_RESPONSE | Tier 1: Validator → Client. Contains a ValidatorVote or rejection. |
| 0x03 | SETTLEMENT_CERTIFICATE | Tier 1: Client → Validator. Contains a SettlementCertificate. |
| 0x04 | EQUIVOCATION_PROOF | Any → Any. Contains an EquivocationProof. |
| 0x10 | DAG_BLOCK | Tier 2: Validator → Validator. Contains a DAGBlock. |
| 0x11 | BLOCK_REQUEST | Tier 2: Validator → Validator. Request a missing block by hash. |
| 0x12 | BLOCK_RESPONSE | Tier 2: Validator → Validator. Response with requested block. |
| 0x20 | TIER3_ACTIVATION | Tier 3: Validator → Validator. Tier3Activation message. |
| 0x21 | COIN_SHARE | Tier 3: Validator → Validator. CoinShare for random coin. |
| 0x30 | PACKET_PREPARE | Routing: Connector → Connector. MeshPacket (Prepare type). |
| 0x31 | PACKET_FULFILL | Routing: Connector → Connector. MeshPacket (Fulfill type). |
| 0x32 | PACKET_REJECT | Routing: Connector → Connector. MeshPacket (Reject type). |
| 0x40 | HANDSHAKE | Connection setup. HandshakeMessage. |
| 0x41 | PING | Keepalive. Empty payload. |
| 0x42 | PONG | Keepalive response. Empty payload. |
| 0xF0 | QUERY_ACCOUNT | Client → Validator. Request current account state. |
| 0xF1 | QUERY_RESPONSE | Validator → Client. Response with account state. |
| 0xFF | ERROR | Any → Any. Protocol error with error code and description. |
**9.5 Stream Multiplexing**

MESH uses dedicated QUIC streams for different message categories to prevent interference:

| **Stream ID Range** | **Purpose** | **Priority** |
|---------------------|-------------|--------------|
| 0-3 | Handshake and keepalive (PING/PONG) | Highest. Must not be blocked by data streams. |
| 4-7 | Tier 1 messages (VOTE_REQUEST, VOTE_RESPONSE, CERTIFICATE) | High. Retail payment latency-critical. |
| 8-11 | Tier 2 DAG messages (DAG_BLOCK, BLOCK_REQUEST, BLOCK_RESPONSE) | Medium. Consensus throughput. |
| 12-15 | Tier 3 messages (ACTIVATION, COIN_SHARE) | Medium. Only active during fallback. |
| 16-19 | Routing messages (PACKET_PREPARE, FULFILL, REJECT) | Normal. Value routing. |
| 20-23 | Query messages (QUERY_ACCOUNT, QUERY_RESPONSE) | Low. Client state queries. |

QUIC stream priorities SHOULD be configured to match the priority column. Under congestion, Tier 1 messages SHOULD be prioritised over Tier 2, which SHOULD be prioritised over routing, which SHOULD be prioritised over queries.
**9.6 Peer Discovery**

**9.6.1 Validator Discovery**

Validators discover each other through a static bootstrap list plus a gossip protocol:

1. **Bootstrap:** Each validator is configured with a list of at least 3 bootstrap nodes (addresses and public keys). These are hardcoded for the genesis epoch and updated via governance for subsequent epochs.

2. **Gossip:** Upon connecting to a bootstrap node, the validator requests the current validator set (public keys and network addresses). Validators gossip peer information with a pull-based protocol: every 30 seconds, a validator requests the peer list from a randomly chosen connected peer.

3. **Verification:** All validator addresses are signed by the validator's identity key. A validator MUST verify the signature before connecting to a new peer. This prevents man-in-the-middle injection of fake validator addresses.
**9.6.2 Client Discovery**

Clients discover validators through:

- **Well-known URI:** Each MESH network publishes a validator list at a well-known HTTPS URL (e.g., https://mesh-mainnet.org/.well-known/mesh-validators.json). The JSON document contains the epoch number, validator public keys, and network addresses, and is signed by the governance multisig key.

- **DNS:** Validator addresses are published as DNS SRV records under _mesh._quic.<network>.mesh-protocol.org. This provides a fallback if the HTTPS endpoint is unavailable.

- **QR code:** For initial onboarding, a validator's connection details can be encoded in a QR code for scanning by a mobile wallet.
**9.7 Eclipse Attack Resistance**

An eclipse attack occurs when an adversary surrounds a target node with malicious peers, controlling all of the target's connections. MESH mitigates eclipse attacks through:

1. **Minimum connection diversity:** Each validator MUST maintain connections to at least 2f+1 distinct validators. If the connection count drops below this, the validator MUST attempt to reconnect using the bootstrap list.

2. **Connection rotation:** Every epoch, each validator drops its lowest-quality connection (highest latency or most dropped messages) and replaces it with a randomly selected validator from the current set that it is not yet connected to.

3. **Multi-source verification:** Before accepting any protocol state (epoch changes, validator set updates), a validator MUST receive consistent information from at least f+1 independent sources.

4. **Inbound connection limits:** Each validator MUST limit inbound connections to max(3n, 256) total, with per-IP and per-subnet limits of 8 and 32 respectively. This prevents a single adversary from monopolising connection slots.
**9.8 Rate Limiting**

To prevent resource exhaustion attacks:

- **Per-account rate limit:** A validator MUST process at most R_account Tier 1 vote requests per account per second (recommended: R_account = 10). Excess requests are queued up to Q_max (recommended: 100) and then rejected with ERR_RATE_LIMITED.

- **Per-connection rate limit:** A validator MUST process at most R_connection messages per connection per second across all types (recommended: R_connection = 1000). This prevents a single malicious connection from overwhelming the validator.

- **Global rate limit:** A validator MUST limit total incoming message bandwidth to B_max bytes per second (recommended: B_max = the validator's network capacity × 0.8, leaving 20% headroom for outbound traffic).

- **New account cost:** Creating a new account (first transition) requires a proof-of-work puzzle with difficulty D_new (recommended: D_new = 2²⁰, approximately 1 second of single-core computation). This prevents mass account creation attacks without requiring an economic cost in any specific currency.
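The per-account limit can be sketched as a sliding one-second window. The window shape and the omission of queue draining are simplifications; `R_ACCOUNT` and `Q_MAX` use the recommended values above:

```rust
use std::collections::VecDeque;

const R_ACCOUNT: usize = 10; // recommended requests/account/second
const Q_MAX: usize = 100;    // recommended queue depth

#[derive(Debug, PartialEq)]
enum Admit {
    Process,
    Queue,
    Reject, // surfaced to the client as ERR_RATE_LIMITED
}

struct AccountLimiter {
    recent: VecDeque<u64>, // timestamps (ms) of requests in the last second
    queued: usize,
}

impl AccountLimiter {
    fn new() -> Self {
        Self { recent: VecDeque::new(), queued: 0 }
    }

    fn admit(&mut self, now_ms: u64) -> Admit {
        // Drop requests that have aged out of the one-second window.
        while self.recent.front().map_or(false, |&t| now_ms.saturating_sub(t) >= 1000) {
            self.recent.pop_front();
        }
        if self.recent.len() < R_ACCOUNT {
            self.recent.push_back(now_ms);
            Admit::Process
        } else if self.queued < Q_MAX {
            self.queued += 1;
            Admit::Queue
        } else {
            Admit::Reject
        }
    }
}

fn main() {
    let mut lim = AccountLimiter::new();
    for _ in 0..R_ACCOUNT {
        assert_eq!(lim.admit(0), Admit::Process); // first 10 in the second pass
    }
    for _ in 0..Q_MAX {
        assert_eq!(lim.admit(1), Admit::Queue); // next 100 are queued
    }
    assert_eq!(lim.admit(2), Admit::Reject); // then ERR_RATE_LIMITED
}
```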
**9.9 Keepalive**

Each connection MUST send a PING message at least every 30 seconds if no other messages have been sent. The remote node MUST respond with PONG within 5 seconds. If no PONG is received after 3 consecutive PINGs, the connection MUST be closed and the peer marked as potentially unreachable.

PING/PONG messages carry no payload and are 5 bytes each (4-byte length + 1-byte type). They serve dual purposes: keeping QUIC connections alive through NAT/firewalls and detecting peer failures.
**Document Control**

**Revision History**

| **Version** | **Date** | **Author** | **Description** |
|-------------|----------|------------|-----------------|
| 0.1.0 | 2026-03-09 | MESH Foundation | Initial draft. Completes specification suite with SPEC-004, 005, 009. |
**Additional References**

- Babel, K. et al. (2023). Mysticeti: Reaching the Limits of Latency with Uncertified DAGs. arXiv:2310.14821.

- Jovanovic, P. et al. (2024). Mahi-Mahi: Low-Latency Asynchronous BFT DAG-Based Consensus. arXiv:2410.08670.

- Arun, B. et al. (2024). Shoal++: High Throughput DAG BFT Can Be Fast and Robust. USENIX NSDI 2025.

- Keidar, I. et al. (2021). All You Need is DAG. ACM PODC 2021 (DAGRider).

- Danezis, G. et al. (2022). Narwhal and Tusk: A DAG-based Mempool and Efficient BFT Consensus. EuroSys 2022.

- Spiegelman, A. et al. (2022). Bullshark: DAG BFT Protocols Made Practical. ACM CCS 2022.

- Fischer, M., Lynch, N., Paterson, M. (1985). Impossibility of Distributed Consensus with One Faulty Process. JACM.

- RFC 9000: QUIC: A UDP-Based Multiplexed and Secure Transport. IETF, 2021.

- RFC 8446: The Transport Layer Security (TLS) Protocol Version 1.3. IETF, 2018.