Sync Protocol¶

The sync protocol ensures nodes converge on the same ledger state by exchanging missing blocks. It uses a frontier-based approach: the requesting node tells its peer what it already has, and the peer sends back the blocks the requester is missing.

Protocol ID: /xe/sync/1.0.0

Workflow¶

  Client                                  Server
  ──────                                  ──────
     │                                       │
     │  SyncRequest (frontiers, page_size)   │
     │ ─────────────────────────────────────▶│
     │                                       │
     │                          Compare frontiers
     │                          Walk chains for
     │                          missing blocks
     │                                       │
     │  SyncResponse (blocks[], has_more)    │
     │◀───────────────────────────────────── │
     │                                       │
     │  SyncResponse (blocks[], has_more)    │
     │◀───────────────────────────────────── │
     │                                       │
     │  SyncResponse (blocks[], has_more=f)  │
     │◀───────────────────────────────────── │
     │                                       │
     │  Validate and add blocks to ledger    │
     │                                       │

1. Client sends frontiers -- A map of account -> latest block hash representing its current view of the ledger, plus a requested page size.
2. Server compares frontiers -- For each account, the server determines which blocks the client is missing.
3. Server streams pages -- Missing blocks are sent in paginated responses. Each page contains up to pageSize blocks and a HasMore flag.
4. Client receives and validates -- The client collects all blocks, then adds them to the ledger via AddSyncedBlock() with retry logic for cross-account dependencies. AddSyncedBlock skips the timestamp window check (which would reject historical blocks) while keeping all other validation (signatures, PoW, balances, chain integrity).
5. Stream closes -- After the final page (HasMore=false), the stream ends.

Wire Types¶

SyncRequest¶

type SyncRequest struct {
    Frontiers map[string]string `json:"frontiers"` // account → frontier block hash
    PageSize  int               `json:"page_size"`
}

SyncResponse¶

type SyncResponse struct {
    Blocks  []*core.Block `json:"blocks"`
    HasMore bool          `json:"has_more"`
    Cursor  string        `json:"cursor"` // hash of last block sent
}
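
To make the wire format concrete, here is a minimal sketch that encodes a SyncRequest and decodes a SyncResponse with encoding/json. The Block type is a stand-in for core.Block (whose fields are not shown in this document), and the account name, hashes, and helper names are illustrative assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Block is a stand-in for core.Block; the real type carries more fields.
type Block struct {
	Hash string `json:"hash"`
}

type SyncRequest struct {
	Frontiers map[string]string `json:"frontiers"` // account → frontier block hash
	PageSize  int               `json:"page_size"`
}

type SyncResponse struct {
	Blocks  []*Block `json:"blocks"`
	HasMore bool     `json:"has_more"`
	Cursor  string   `json:"cursor"` // hash of last block sent
}

// encodeRequest returns the JSON wire form of a request (hypothetical helper).
func encodeRequest(req SyncRequest) (string, error) {
	b, err := json.Marshal(req)
	return string(b), err
}

// decodeResponse parses one JSON-encoded page (hypothetical helper).
func decodeResponse(wire string) (SyncResponse, error) {
	var resp SyncResponse
	err := json.Unmarshal([]byte(wire), &resp)
	return resp, err
}

func main() {
	wire, err := encodeRequest(SyncRequest{
		Frontiers: map[string]string{"xe_alice": "a1b2"}, // made-up account and hash
		PageSize:  64,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(wire) // {"frontiers":{"xe_alice":"a1b2"},"page_size":64}

	resp, err := decodeResponse(`{"blocks":[{"hash":"c3d4"}],"has_more":false,"cursor":"c3d4"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(resp.Blocks), resp.HasMore, resp.Cursor) // 1 false c3d4
}
```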

Constants¶

Constant	Value	Description
`defaultPageSize`	64	Blocks per page if not specified
`maxPageSize`	256	Maximum allowed page size
`maxTotalBlocks`	10,000	Client-side cap on blocks per sync session
`maxServerBlocks`	10,000	Server-side cap on blocks per sync session
`syncCooldown`	5 seconds	Minimum interval between syncs with the same peer
`periodicSyncInterval`	10 seconds	How often the background loop checks connected peers for a re-sync
`maxSyncRequestBytes`	1 MiB	Maximum size of an incoming SyncRequest
`maxSyncResponseBytes`	10 MiB	Maximum size of incoming SyncResponse pages
`maxFrontiers`	10,000	Maximum frontier entries in a single request

Triggers¶

Sync is triggered in three ways:

1. On Peer Connection¶

When a new peer connects, the node immediately initiates a sync:

h.Network().Notify(&network.NotifyBundle{
    ConnectedF: func(n network.Network, conn network.Conn) {
        go requestSync(h, conn.RemotePeer(), ledger, quarantine)
    },
})

2. Periodic Re-Sync (with frontier tracking)¶

A background goroutine checks peers every 10 seconds, but only syncs when something has changed. A SyncTracker records the frontiers last sent to each peer and a dirty flag that is set when a block is added locally (via gossip, API, or sync):

- Dirty flag set: sync all peers on the next tick, then clear the flag
- Dirty flag clear: skip all peers (frontiers haven't changed)
- Full resync: forced every 60 seconds as a safety net regardless of dirty state

This eliminates the constant stream-open/close chatter when the network is idle. With 5 peers and a clean ledger, the node produces zero sync traffic between the 60-second safety ticks.

ticker := time.NewTicker(periodicSyncInterval) // 10s
for range ticker.C {
    for _, pid := range h.Network().Peers() {
        if tracker.shouldSync(pid, currentFrontiers) {
            go requestSync(h, pid, ledger, quarantine)
        }
    }
}
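
The dirty-flag behaviour can be sketched as follows. This is a simplified illustration, not the actual SyncTracker: it makes one decision per tick instead of per peer, omits the per-peer frontier record, and the names shouldSyncTick, markDirty, fullResyncInterval, and at are assumptions.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fullResyncInterval is the 60-second safety net; the constant name is assumed.
const fullResyncInterval = 60 * time.Second

// syncTracker is a simplified stand-in for the SyncTracker described above.
type syncTracker struct {
	mu       sync.Mutex
	dirty    bool      // set when a block is added locally
	lastFull time.Time // time of the last forced full resync
}

func newSyncTracker(start time.Time) *syncTracker {
	return &syncTracker{lastFull: start}
}

// markDirty records that the local ledger changed since the last sync round.
func (t *syncTracker) markDirty() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.dirty = true
}

// shouldSyncTick reports whether peers should be synced on this tick.
// It clears the dirty flag whenever it returns true.
func (t *syncTracker) shouldSyncTick(now time.Time) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if now.Sub(t.lastFull) >= fullResyncInterval {
		t.lastFull = now
		t.dirty = false
		return true // forced resync regardless of dirty state
	}
	if t.dirty {
		t.dirty = false
		return true
	}
	return false // idle: nothing changed, skip all peers
}

// at builds a timestamp from seconds so the example is deterministic.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

func main() {
	tr := newSyncTracker(at(0))
	fmt.Println(tr.shouldSyncTick(at(10))) // false: ledger is clean
	tr.markDirty()
	fmt.Println(tr.shouldSyncTick(at(20))) // true: a block was added
	fmt.Println(tr.shouldSyncTick(at(30))) // false: flag was cleared
	fmt.Println(tr.shouldSyncTick(at(60))) // true: 60-second safety net
}
```

Passing the current time in as a parameter keeps the decision logic testable without waiting on a real ticker.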

3. Incoming Sync Requests¶

The node also serves sync requests from other peers via the stream handler registered at /xe/sync/1.0.0.

Rate Limiting¶

Both inbound and outbound sync are rate-limited per peer with separate syncRateLimiter instances:

Direction	Limiter	Cooldown
Inbound (server)	`inboundRL`	5 seconds per peer
Outbound (client)	`outboundRL`	5 seconds per peer

The rate limiter tracks the last sync timestamp per peer ID. Expired entries are evicted on each check to prevent unbounded memory growth.

func (rl *syncRateLimiter) allow(pid peer.ID) bool

Why Rate Limit?

Without rate limiting, the periodic 10-second re-sync combined with peer connection events could cause excessive sync traffic, especially in large networks. The 5-second cooldown ensures at most one sync per peer per direction every 5 seconds.
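
A per-peer cooldown limiter with eviction might look like the sketch below. It is an illustration under assumptions: peer IDs are plain strings rather than peer.ID so the example has no libp2p dependency, and the method is named allowAt and takes the current time explicitly so it can be exercised deterministically.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

const syncCooldown = 5 * time.Second

// syncRateLimiter remembers when each peer last synced.
type syncRateLimiter struct {
	mu       sync.Mutex
	cooldown time.Duration
	last     map[string]time.Time // peer ID → last allowed sync
}

func newSyncRateLimiter(cooldown time.Duration) *syncRateLimiter {
	return &syncRateLimiter{cooldown: cooldown, last: make(map[string]time.Time)}
}

// allowAt reports whether pid may sync at time now, and records the sync if so.
func (rl *syncRateLimiter) allowAt(pid string, now time.Time) bool {
	rl.mu.Lock()
	defer rl.mu.Unlock()

	// Evict entries whose cooldown has expired so the map cannot grow forever.
	for id, ts := range rl.last {
		if now.Sub(ts) >= rl.cooldown {
			delete(rl.last, id)
		}
	}
	if _, coolingDown := rl.last[pid]; coolingDown {
		return false
	}
	rl.last[pid] = now
	return true
}

// at builds a timestamp from seconds so the example is deterministic.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

func main() {
	rl := newSyncRateLimiter(syncCooldown)
	fmt.Println(rl.allowAt("peerA", at(0))) // true: first sync
	fmt.Println(rl.allowAt("peerA", at(3))) // false: inside the 5s cooldown
	fmt.Println(rl.allowAt("peerB", at(3))) // true: limits are per peer
	fmt.Println(rl.allowAt("peerA", at(5))) // true: cooldown elapsed
}
```

Using two separate instances, one for inbound and one for outbound, keeps a peer's requests to us from consuming the budget for our requests to that peer.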

Server-Side Logic¶

The server handles an incoming sync stream as follows:

1. Decode request -- Read and JSON-decode the SyncRequest from a size-limited reader (maxSyncRequestBytes)
2. Cap frontiers -- If the request contains more than maxFrontiers entries, excess entries are silently trimmed
3. Clamp page size -- Page size is clamped to [1, maxPageSize]
4. Walk chains -- For each account in the server's ledger:
   - If the client has no frontier for the account, send the entire chain
   - If the client's frontier matches the server's, skip (already in sync)
   - If the client's frontier is behind, send blocks after the frontier
   - If the client's frontier is unrecognized, skip the account (prevents amplification attacks)
5. Send pages -- Blocks are grouped into pages of pageSize and streamed as JSON-encoded SyncResponse objects
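
The clamping and paging steps above could be sketched like this. Treating a page size of 0 as "not specified" (and so falling back to defaultPageSize) is an assumption, as is filling Cursor with the hash of the last block in each page; Block stands in for core.Block.

```go
package main

import "fmt"

const (
	defaultPageSize = 64
	maxPageSize     = 256
)

// Block stands in for core.Block; only the hash matters here.
type Block struct {
	Hash string `json:"hash"`
}

type SyncResponse struct {
	Blocks  []*Block `json:"blocks"`
	HasMore bool     `json:"has_more"`
	Cursor  string   `json:"cursor"`
}

// clampPageSize maps a requested size into [1, maxPageSize].
// Assumption: 0 means the client did not specify a size.
func clampPageSize(n int) int {
	switch {
	case n == 0:
		return defaultPageSize
	case n < 1:
		return 1
	case n > maxPageSize:
		return maxPageSize
	}
	return n
}

// paginate splits blocks into pages; pageSize must already be clamped (>= 1).
func paginate(blocks []*Block, pageSize int) []SyncResponse {
	var pages []SyncResponse
	for start := 0; start < len(blocks); start += pageSize {
		end := start + pageSize
		if end > len(blocks) {
			end = len(blocks)
		}
		page := blocks[start:end]
		pages = append(pages, SyncResponse{
			Blocks:  page,
			HasMore: end < len(blocks), // false only on the final page
			Cursor:  page[len(page)-1].Hash,
		})
	}
	return pages
}

func main() {
	var blocks []*Block
	for i := 0; i < 5; i++ {
		blocks = append(blocks, &Block{Hash: fmt.Sprintf("b%d", i)})
	}
	for _, p := range paginate(blocks, clampPageSize(2)) {
		fmt.Println(len(p.Blocks), p.HasMore, p.Cursor)
	}
	// Prints:
	// 2 true b1
	// 2 true b3
	// 1 false b4
}
```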

Unrecognized Frontiers

If a peer claims a frontier hash that doesn't exist in the server's chain for that account, the account is skipped entirely. This prevents a bandwidth amplification attack where a malicious peer sends fake frontier hashes to receive full account chains. The peer can still get the full chain by omitting the account from its frontier map.
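
The four frontier cases, including the unrecognized-frontier rule, can be expressed as one small function. This is a sketch that assumes each account chain is available as a slice ordered oldest to newest; the function name missingBlocks and the Block stand-in are illustrative.

```go
package main

import "fmt"

// Block stands in for core.Block; only the hash matters here.
type Block struct {
	Hash string
}

// missingBlocks returns the blocks of one account chain that the client lacks.
// chain is ordered oldest → newest. hasFrontier is false when the client's
// frontier map has no entry for this account.
func missingBlocks(chain []*Block, clientFrontier string, hasFrontier bool) []*Block {
	if len(chain) == 0 {
		return nil
	}
	if !hasFrontier {
		return chain // client knows nothing about the account: send everything
	}
	if clientFrontier == chain[len(chain)-1].Hash {
		return nil // already in sync
	}
	for i, b := range chain {
		if b.Hash == clientFrontier {
			return chain[i+1:] // client is behind: send what follows its frontier
		}
	}
	return nil // unrecognized frontier: skip the account entirely
}

func hashes(blocks []*Block) []string {
	out := []string{}
	for _, b := range blocks {
		out = append(out, b.Hash)
	}
	return out
}

func main() {
	chain := []*Block{{Hash: "h1"}, {Hash: "h2"}, {Hash: "h3"}}
	fmt.Println(hashes(missingBlocks(chain, "", false)))    // [h1 h2 h3]
	fmt.Println(hashes(missingBlocks(chain, "h3", true)))   // []
	fmt.Println(hashes(missingBlocks(chain, "h1", true)))   // [h2 h3]
	fmt.Println(hashes(missingBlocks(chain, "bogus", true))) // []
}
```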

Client-Side Logic¶

The client side of a sync:

1. Open stream -- Create a new stream to the target peer with a 60-second timeout
2. Send frontiers -- JSON-encode SyncRequest with the ledger's current frontiers
3. Close write -- Signal to the server that the request is complete
4. Read pages -- Decode SyncResponse objects until HasMore=false or EOF, accumulating blocks up to maxTotalBlocks
5. Add blocks with retry -- Blocks are added to the ledger via AddSyncedBlock() in up to 3 passes to handle cross-account dependencies. AddSyncedBlock bypasses the timestamp window check because synced blocks are historical data that was already validated when originally published. All other validation (signatures, PoW, balances, chain integrity) still applies.
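
The page-reading step might be sketched as below, using a json.Decoder over a size-limited reader. Applying maxSyncResponseBytes to the whole stream rather than to each page is a simplifying assumption, as are the function names and the Block stand-in for core.Block.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

const (
	maxTotalBlocks       = 10000
	maxSyncResponseBytes = 10 << 20 // 10 MiB
)

// Block stands in for core.Block.
type Block struct {
	Hash string `json:"hash"`
}

type SyncResponse struct {
	Blocks  []*Block `json:"blocks"`
	HasMore bool     `json:"has_more"`
	Cursor  string   `json:"cursor"`
}

// readPages decodes pages from r until HasMore is false, EOF, or the cap is hit.
func readPages(r io.Reader, maxBlocks int) ([]*Block, error) {
	// Assumption: the byte limit covers the whole stream, not each page.
	dec := json.NewDecoder(io.LimitReader(r, maxSyncResponseBytes))
	var all []*Block
	for {
		var page SyncResponse
		if err := dec.Decode(&page); err != nil {
			if errors.Is(err, io.EOF) {
				return all, nil // server closed the stream
			}
			return all, err
		}
		for _, b := range page.Blocks {
			if len(all) >= maxBlocks {
				return all, nil // client-side cap reached
			}
			all = append(all, b)
		}
		if !page.HasMore {
			return all, nil
		}
	}
}

// readPagesFromString is a convenience wrapper for trying readPages on literals.
func readPagesFromString(wire string, maxBlocks int) ([]*Block, error) {
	return readPages(strings.NewReader(wire), maxBlocks)
}

func main() {
	wire := `{"blocks":[{"hash":"a"},{"hash":"b"}],"has_more":true}
{"blocks":[{"hash":"c"}],"has_more":false}`
	blocks, err := readPagesFromString(wire, maxTotalBlocks)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(blocks)) // 3
}
```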

Cross-Account Dependencies¶

A receive block references a send block from a different account. If the send block arrives later in the sync stream, the receive block fails to validate. The retry mechanism handles this:

Pass 1: Add all blocks → some receive blocks fail (send not yet in ledger)
Pass 2: Retry failed blocks → most succeed (sends now in ledger)
Pass 3: Final retry → remaining edge cases

Retryable errors include:

Error Pattern	Meaning
`previous block not found`	Block's `Previous` not yet in chain
`source send not pending`	Receive arrived before its send
`not found`	Generic dependency missing
`frontier mismatch`	Chain tip changed between attempts
`unresolved conflict`	Account locked during conflict resolution

If no progress is made in a retry pass (same number of failures), retrying stops.
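
A multi-pass retry loop along these lines is sketched below. The toyLedger type, the DependsOn field, and the function names are hypothetical and exist only to make the example runnable; matching errors by substring mirrors the pattern table above but is an assumption about how the real check works.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Block stands in for core.Block. DependsOn is a hypothetical field naming
// the hash that must already be in the ledger (e.g. the matching send).
type Block struct {
	Hash      string
	DependsOn string
}

// Error substrings treated as retryable, taken from the table above.
var retryablePatterns = []string{
	"previous block not found",
	"source send not pending",
	"not found",
	"frontier mismatch",
	"unresolved conflict",
}

func isRetryable(err error) bool {
	for _, p := range retryablePatterns {
		if strings.Contains(err.Error(), p) {
			return true
		}
	}
	return false
}

// toyLedger is a minimal stand-in for the real ledger.
type toyLedger struct{ blocks map[string]bool }

func newToyLedger() *toyLedger { return &toyLedger{blocks: map[string]bool{}} }

func (l *toyLedger) has(hash string) bool { return l.blocks[hash] }

func (l *toyLedger) add(b *Block) error {
	if b.DependsOn != "" && !l.blocks[b.DependsOn] {
		return errors.New("source send not pending")
	}
	l.blocks[b.Hash] = true
	return nil
}

// addWithRetry adds blocks in up to maxPasses passes and returns the blocks
// that still fail with a retryable error. It stops early when a pass makes
// no progress. Non-retryable failures are dropped from further passes.
func addWithRetry(blocks []*Block, add func(*Block) error, maxPasses int) []*Block {
	pending := blocks
	for pass := 0; pass < maxPasses && len(pending) > 0; pass++ {
		var failed []*Block
		for _, b := range pending {
			if err := add(b); err != nil && isRetryable(err) {
				failed = append(failed, b)
			}
		}
		if len(failed) == len(pending) {
			return failed // same number of failures: give up
		}
		pending = failed
	}
	return pending
}

func main() {
	ledger := newToyLedger()
	// The receive arrives before the send it depends on.
	blocks := []*Block{{Hash: "recv", DependsOn: "send"}, {Hash: "send"}}
	failed := addWithRetry(blocks, ledger.add, 3)
	fmt.Println(len(failed), ledger.has("recv")) // 0 true
}
```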

Block Quarantine¶

Blocks that fail with non-retryable errors (invalid signature, wrong hash, missing attestations) are added to an in-memory quarantine set. On subsequent sync rounds, quarantined blocks are skipped entirely, with no validation attempt and no log output. This prevents repeated "block rejected" log spam from permanently invalid blocks (e.g., blocks from a previous network epoch).

The quarantine is cleared on node restart. This is intentional: ledger state after a restart can differ, so a block that was rejected earlier may become valid and deserves a fresh validation attempt.
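
An in-memory quarantine set can be as small as the sketch below. Keying the set by block hash is an assumption, and the type and function names are illustrative rather than taken from the codebase.

```go
package main

import (
	"fmt"
	"sync"
)

// Block stands in for core.Block.
type Block struct {
	Hash string
}

// quarantine is an in-memory set of block hashes that failed permanently.
// It is never persisted, so it starts empty after every restart.
type quarantine struct {
	mu     sync.Mutex
	hashes map[string]struct{}
}

func newQuarantine() *quarantine {
	return &quarantine{hashes: make(map[string]struct{})}
}

// add marks a block hash as quarantined.
func (q *quarantine) add(hash string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.hashes[hash] = struct{}{}
}

// contains reports whether a block hash is quarantined.
func (q *quarantine) contains(hash string) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	_, ok := q.hashes[hash]
	return ok
}

// filterQuarantined drops quarantined blocks before any validation is attempted.
func filterQuarantined(blocks []*Block, q *quarantine) []*Block {
	var kept []*Block
	for _, b := range blocks {
		if !q.contains(b.Hash) {
			kept = append(kept, b)
		}
	}
	return kept
}

func main() {
	q := newQuarantine()
	q.add("bad") // e.g. failed with an invalid signature in an earlier round

	kept := filterQuarantined([]*Block{{Hash: "good"}, {Hash: "bad"}}, q)
	fmt.Println(len(kept), kept[0].Hash) // 1 good
}
```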

Security Considerations¶

Frontier Privacy

The sync protocol reveals the requester's full frontier set to the responding peer. A malicious peer can learn which accounts exist in the requester's ledger and the latest block hash of each, and from that how far each chain has progressed. Future improvement: use a bloom filter or frontier hash instead of sending the full frontier map.

Mitigations in place:

- Request size limit (1 MiB) -- Prevents OOM from huge frontier maps
- Response size limit (10 MiB) -- Prevents OOM from oversized responses
- Frontier cap (10,000) -- Limits frontier entries per request
- Block caps (10,000 each on client and server) -- Prevents CPU/memory exhaustion from chain walking
- Rate limiting (5s cooldown) -- Prevents sync flooding
- Stream deadline (60s) -- Prevents hung connections
- Amplification resistance -- Accounts with unrecognized frontiers are skipped rather than answered with full chains

Relationship to Gossip¶

Sync and gossip are complementary:

	Gossip	Sync
Delivery	Best-effort broadcast	Reliable catch-up
Latency	Real-time	Periodic (10s) + on-connect
Scope	Individual blocks	Entire ledger delta
Direction	Push	Pull

Gossip handles the fast path -- new blocks propagate in near-real-time. Sync handles the slow path -- catching up after downtime, missed messages, or network partitions.