
Sync Protocol

The sync protocol ensures nodes converge on the same ledger state by exchanging missing blocks. It uses a frontier-based approach: each node tells its peer what it already has, and the peer sends back what it is missing.

Protocol ID: /xe/sync/1.0.0

Workflow

  Client                                  Server
  ──────                                  ──────
     │                                       │
     │  SyncRequest (frontiers, page_size)   │
     │ ─────────────────────────────────────▶│
     │                                       │
     │                          Compare frontiers
     │                          Walk chains for
     │                          missing blocks
     │                                       │
     │  SyncResponse (blocks[], has_more)    │
     │◀───────────────────────────────────── │
     │                                       │
     │  SyncResponse (blocks[], has_more)    │
     │◀───────────────────────────────────── │
     │                                       │
     │  SyncResponse (blocks[], has_more=f)  │
     │◀───────────────────────────────────── │
     │                                       │
     │  Validate and add blocks to ledger    │
     │                                       │
  1. Client sends frontiers -- A map of account -> latest block hash representing its current view of the ledger, plus a requested page size.
  2. Server compares frontiers -- For each account, the server determines which blocks the client is missing.
  3. Server streams pages -- Missing blocks are sent in paginated responses. Each page contains up to pageSize blocks and a HasMore flag.
  4. Client receives and validates -- The client collects all blocks, then adds them to the ledger via AddSyncedBlock() with retry logic for cross-account dependencies. AddSyncedBlock skips the timestamp window check (which would reject historical blocks) while keeping all other validation (signatures, PoW, balances, chain integrity).
  5. Stream closes -- After the final page (HasMore=false), the stream ends.

Wire Types

SyncRequest

type SyncRequest struct {
    Frontiers map[string]string `json:"frontiers"` // account → frontier block hash
    PageSize  int               `json:"page_size"`
}

SyncResponse

type SyncResponse struct {
    Blocks  []*core.Block `json:"blocks"`
    HasMore bool          `json:"has_more"`
    Cursor  string        `json:"cursor"` // hash of last block sent
}
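
To make the wire format concrete, here is a minimal sketch of how the two types round-trip through encoding/json. The Block type is a hypothetical stand-in for core.Block (whose fields are not shown in this document), and encodeRequest and decodeResponse are illustrative helper names, not functions from the codebase.

```go
package main

import "encoding/json"

// SyncRequest mirrors the wire type documented above.
type SyncRequest struct {
	Frontiers map[string]string `json:"frontiers"` // account -> frontier block hash
	PageSize  int               `json:"page_size"`
}

// Block is a hypothetical stand-in for core.Block; the real type has more fields.
type Block struct {
	Hash string `json:"hash"`
}

// SyncResponse mirrors the wire type documented above, using the stand-in Block.
type SyncResponse struct {
	Blocks  []*Block `json:"blocks"`
	HasMore bool     `json:"has_more"`
	Cursor  string   `json:"cursor"`
}

// encodeRequest returns the JSON form of a SyncRequest.
func encodeRequest(req SyncRequest) (string, error) {
	b, err := json.Marshal(req)
	return string(b), err
}

// decodeResponse parses a single JSON-encoded SyncResponse page.
func decodeResponse(data string) (SyncResponse, error) {
	var resp SyncResponse
	err := json.Unmarshal([]byte(data), &resp)
	return resp, err
}
```

With these definitions, a request carrying one frontier and a page size of 64 encodes to {"frontiers":{"acct1":"h1"},"page_size":64}.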

Constants

Constant              Value       Description
────────              ─────       ───────────
defaultPageSize       64          Blocks per page if not specified
maxPageSize           256         Maximum allowed page size
maxTotalBlocks        10,000      Client-side cap on blocks per sync session
maxServerBlocks       10,000      Server-side cap on blocks per sync session
syncCooldown          5 seconds   Minimum interval between syncs with the same peer
periodicSyncInterval  10 seconds  How often the background loop checks whether peers need a re-sync
maxSyncRequestBytes   1 MiB       Maximum size of an incoming SyncRequest
maxSyncResponseBytes  10 MiB      Maximum size of incoming SyncResponse pages
maxFrontiers          10,000      Maximum frontier entries in a single request

Triggers

Sync is triggered in three ways:

1. On Peer Connection

When a new peer connects, the node immediately initiates a sync:

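h.Network().Notify(&network.NotifyBundle{
    // ConnectedF fires once for every newly established connection.
    ConnectedF: func(n network.Network, conn network.Conn) {
        // Sync in a goroutine so the notifier callback returns immediately.
        go requestSync(h, conn.RemotePeer(), ledger, quarantine)
    },
})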

2. Periodic Re-Sync (with frontier tracking)

A background goroutine checks peers every 10 seconds, but only syncs when something has changed. A SyncTracker records the frontiers last sent to each peer and a dirty flag that is set when a block is added locally (via gossip, API, or sync):

  • Dirty flag set: sync all peers on the next tick, then clear the flag
  • Dirty flag clear: skip all peers (frontiers haven't changed)
  • Full resync: forced every 60 seconds as a safety net regardless of dirty state

This eliminates the constant stream-open/close chatter when the network is idle. With 5 peers and a clean ledger, the node produces zero sync traffic between the 60-second safety ticks.

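ticker := time.NewTicker(periodicSyncInterval) // 10s
for range ticker.C {
    // tracker and currentFrontiers are maintained by the surrounding code (not shown).
    for _, pid := range h.Network().Peers() {
        // shouldSync applies the dirty-flag and 60-second rules described above.
        if tracker.shouldSync(pid, currentFrontiers) {
            go requestSync(h, pid, ledger, quarantine)
        }
    }
}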
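
The dirty-flag rule can be sketched as follows. This is a simplified illustration, not the actual SyncTracker: it makes one decision per tick instead of comparing the frontiers last sent to each peer, and it assumes that any sync resets the 60-second timer. The names syncTracker, markDirty, shouldSyncOnTick and fullResyncInterval are hypothetical.

```go
package main

import (
	"sync"
	"time"
)

// fullResyncInterval is the 60-second safety net from the text; the name is assumed.
const fullResyncInterval = 60 * time.Second

// syncTracker is a simplified sketch: one dirty flag plus the time of the last sync.
type syncTracker struct {
	mu       sync.Mutex
	dirty    bool
	lastSync time.Time
}

// markDirty records that a block was added locally (gossip, API, or sync).
func (t *syncTracker) markDirty() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.dirty = true
}

// shouldSyncOnTick reports whether peers should be synced on this tick.
// It returns true when the ledger is dirty or the safety interval has elapsed,
// and in that case clears the flag and restarts the interval.
func (t *syncTracker) shouldSyncOnTick(now time.Time) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.dirty || now.Sub(t.lastSync) >= fullResyncInterval {
		t.dirty = false
		t.lastSync = now
		return true
	}
	return false
}
```

Passing the current time in as a parameter keeps the decision deterministic and easy to test; a production version would more likely call time.Now() internally.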

3. Incoming Sync Requests

The node also serves sync requests from other peers via the stream handler registered at /xe/sync/1.0.0.

Rate Limiting

Both inbound and outbound sync are rate-limited per peer with separate syncRateLimiter instances:

Direction          Limiter     Cooldown
─────────          ───────     ────────
Inbound (server)   inboundRL   5 seconds per peer
Outbound (client)  outboundRL  5 seconds per peer

The rate limiter tracks the last sync timestamp per peer ID. Expired entries are evicted on each check to prevent unbounded memory growth.

func (rl *syncRateLimiter) allow(pid peer.ID) bool
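
A minimal sketch of a limiter with this behavior is shown below. It is an illustration under stated assumptions, not the project's implementation: peer IDs are plain strings instead of peer.ID so the example has no libp2p dependency, and the hypothetical allowAt method takes the current time explicitly so the cooldown can be exercised without sleeping.

```go
package main

import (
	"sync"
	"time"
)

const syncCooldown = 5 * time.Second

// syncRateLimiter remembers when each peer last synced.
type syncRateLimiter struct {
	mu   sync.Mutex
	last map[string]time.Time // peer ID (string stand-in for peer.ID) -> last sync time
}

func newSyncRateLimiter() *syncRateLimiter {
	return &syncRateLimiter{last: make(map[string]time.Time)}
}

// allowAt reports whether pid may sync at time now, and records the sync if so.
func (rl *syncRateLimiter) allowAt(pid string, now time.Time) bool {
	rl.mu.Lock()
	defer rl.mu.Unlock()
	// Evict entries whose cooldown has expired so the map cannot grow without bound.
	for id, t := range rl.last {
		if now.Sub(t) >= syncCooldown {
			delete(rl.last, id)
		}
	}
	// Any entry that survived eviction is still inside its cooldown.
	if _, limited := rl.last[pid]; limited {
		return false
	}
	rl.last[pid] = now
	return true
}

// allow matches the shape of the documented method, using the wall clock.
func (rl *syncRateLimiter) allow(pid string) bool {
	return rl.allowAt(pid, time.Now())
}
```

Evicting on every check costs time proportional to the number of tracked peers, which is acceptable for the small peer counts discussed here; a larger deployment might evict on a timer instead.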

Why Rate Limit?

Without rate limiting, the 10-second periodic re-sync combined with peer connection events could generate excessive sync traffic, especially in large networks. The cooldown caps this at one sync per peer per direction every 5 seconds.

Server-Side Logic

The server handles an incoming sync stream as follows:

  1. Decode request -- Read and JSON-decode the SyncRequest from a size-limited reader (maxSyncRequestBytes)
  2. Cap frontiers -- If the request contains more than maxFrontiers entries, excess entries are silently trimmed
  3. Clamp page size -- Page size is clamped to [1, maxPageSize]
  4. Walk chains -- For each account in the server's ledger:
    • If the client has no frontier for the account, send the entire chain
    • If the client's frontier matches the server's, skip (already in sync)
    • If the client's frontier is behind, send blocks after the frontier
    • If the client's frontier is unrecognized, skip the account (prevents amplification attacks)
  5. Send pages -- Blocks are grouped into pages of pageSize and streamed as JSON-encoded SyncResponse objects
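
Steps 3 and 5 can be sketched as below. This is a hedged illustration: block hashes stand in for full blocks, the page type is a simplified SyncResponse, and two behaviors are assumptions rather than documented facts: a page size of 0 is treated as "not specified" and replaced with defaultPageSize, and an empty result still produces one final page with HasMore=false.

```go
package main

const (
	defaultPageSize = 64
	maxPageSize     = 256
)

// page is a simplified SyncResponse: block hashes stand in for full blocks.
type page struct {
	Blocks  []string
	HasMore bool
}

// clampPageSize maps a requested page size into [1, maxPageSize].
// Assumption: 0 means "not specified" and selects defaultPageSize.
func clampPageSize(requested int) int {
	switch {
	case requested == 0:
		return defaultPageSize
	case requested < 1:
		return 1
	case requested > maxPageSize:
		return maxPageSize
	default:
		return requested
	}
}

// paginate splits hashes into pages; only the last page has HasMore=false.
func paginate(hashes []string, pageSize int) []page {
	pageSize = clampPageSize(pageSize)
	pages := []page{}
	for start := 0; start < len(hashes); start += pageSize {
		end := start + pageSize
		if end > len(hashes) {
			end = len(hashes)
		}
		pages = append(pages, page{Blocks: hashes[start:end], HasMore: end < len(hashes)})
	}
	if len(pages) == 0 {
		// Assumption: send one empty terminating page so the client sees HasMore=false.
		pages = append(pages, page{})
	}
	return pages
}
```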

Unrecognized Frontiers

If a peer claims a frontier hash that doesn't exist in the server's chain for that account, the account is skipped entirely. This prevents a bandwidth amplification attack where a malicious peer sends fake frontier hashes to receive full account chains. The peer can still get the full chain by omitting the account from its frontier map.
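
The four frontier cases from the chain walk, including the unrecognized-frontier rule, fit in one small function. The sketch below assumes a chain is available as a slice of block hashes ordered oldest to newest; blocksToSend is a hypothetical name, and the real server walks ledger data structures that are not shown in this document.

```go
package main

// blocksToSend returns the part of chain the client is missing.
// chain is ordered oldest to newest. known is false when the client's
// frontier map has no entry for the account.
func blocksToSend(chain []string, clientFrontier string, known bool) []string {
	if !known {
		// No frontier for this account: send the entire chain.
		return chain
	}
	for i, hash := range chain {
		if hash == clientFrontier {
			// Frontier found: send everything after it.
			// When the frontier is the chain tip this is an empty slice (in sync).
			return chain[i+1:]
		}
	}
	// Unrecognized frontier: skip the account instead of replying with the full chain.
	return nil
}
```

A caller would typically obtain the last two arguments with a map lookup such as frontier, known := req.Frontiers[account].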

Client-Side Logic

The client side of a sync:

  1. Open stream -- Create a new stream to the target peer with a 60-second timeout
  2. Send frontiers -- JSON-encode SyncRequest with the ledger's current frontiers
  3. Close write -- Signal to the server that the request is complete
  4. Read pages -- Decode SyncResponse objects until HasMore=false or EOF, accumulating blocks up to maxTotalBlocks
  5. Add blocks with retry -- Blocks are added to the ledger via AddSyncedBlock() in multiple passes (up to 3 retries) to handle cross-account dependencies. AddSyncedBlock bypasses the timestamp window check — synced blocks are historical data that was already validated when originally published. All other validation (signatures, PoW, balances, chain integrity) still applies.
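
Step 4 can be sketched with a json.Decoder, which reads consecutive JSON values from a stream. This is an illustration under assumptions: Block is a hypothetical stand-in for core.Block, readPages and readPagesFromBytes are invented names, and blocks beyond the cap are assumed to be dropped silently (the document does not say how the real client reacts when maxTotalBlocks is reached).

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"io"
)

// Block is a hypothetical stand-in for core.Block.
type Block struct {
	Hash string `json:"hash"`
}

type SyncResponse struct {
	Blocks  []*Block `json:"blocks"`
	HasMore bool     `json:"has_more"`
	Cursor  string   `json:"cursor"`
}

// readPages decodes SyncResponse pages from r until HasMore=false or EOF,
// collecting at most maxBlocks blocks.
func readPages(r io.Reader, maxBlocks int) ([]*Block, error) {
	dec := json.NewDecoder(r)
	var blocks []*Block
	for {
		var page SyncResponse
		if err := dec.Decode(&page); err != nil {
			if errors.Is(err, io.EOF) {
				return blocks, nil // stream closed cleanly between pages
			}
			return blocks, err // malformed or truncated page
		}
		for _, b := range page.Blocks {
			if len(blocks) >= maxBlocks {
				return blocks, nil // assumption: stop quietly at the cap
			}
			blocks = append(blocks, b)
		}
		if !page.HasMore {
			return blocks, nil
		}
	}
}

// readPagesFromBytes is a convenience for trying readPages without a network stream.
func readPagesFromBytes(data []byte, maxBlocks int) ([]*Block, error) {
	return readPages(bytes.NewReader(data), maxBlocks)
}
```

In a real client the reader would be the libp2p stream, likely wrapped in a size-limited reader to enforce maxSyncResponseBytes.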

Cross-Account Dependencies

A receive block references a send block from a different account. If the send block arrives later in the sync stream, the receive block fails to validate. The retry mechanism handles this:

Pass 1: Add all blocks → some receive blocks fail (send not yet in ledger)
Pass 2: Retry failed blocks → most succeed (sends now in ledger)
Pass 3: Final retry → remaining edge cases

Retryable errors include:

Error Pattern             Meaning
─────────────             ───────
previous block not found  Block's Previous not yet in chain
source send not pending   Receive arrived before its send
not found                 Generic dependency missing
frontier mismatch         Chain tip changed between attempts
unresolved conflict       Account locked during conflict resolution

If no progress is made in a retry pass (same number of failures), retrying stops.
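
The multi-pass loop with its no-progress check might look like the sketch below. Several things here are assumptions: errors are classified by substring match on the patterns in the table, the add callback stands in for AddSyncedBlock, the number of passes is left as a parameter, and the two error variables are hypothetical examples of what a ledger could return.

```go
package main

import (
	"errors"
	"strings"
)

// retryablePatterns lists the error patterns from the table above.
var retryablePatterns = []string{
	"previous block not found",
	"source send not pending",
	"not found",
	"frontier mismatch",
	"unresolved conflict",
}

// Hypothetical example errors, standing in for what the ledger might return.
var (
	errSendNotPending = errors.New("source send not pending")
	errBadSignature   = errors.New("invalid signature")
)

// isRetryable reports whether err matches one of the retryable patterns.
func isRetryable(err error) bool {
	if err == nil {
		return false
	}
	for _, p := range retryablePatterns {
		if strings.Contains(err.Error(), p) {
			return true
		}
	}
	return false
}

// addWithRetry feeds blocks (identified by hash) to add in up to maxPasses passes.
// It returns the hashes that still fail with retryable errors and the hashes
// rejected with non-retryable errors.
func addWithRetry(hashes []string, add func(string) error, maxPasses int) (unresolved, rejected []string) {
	pending := hashes
	for pass := 0; pass < maxPasses && len(pending) > 0; pass++ {
		var next []string
		for _, h := range pending {
			err := add(h)
			switch {
			case err == nil:
				// Block accepted; nothing to retry.
			case isRetryable(err):
				next = append(next, h)
			default:
				rejected = append(rejected, h)
			}
		}
		stalled := len(next) == len(pending) // same number of failures: no progress
		pending = next
		if stalled {
			break
		}
	}
	return pending, rejected
}
```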

Block Quarantine

Blocks that fail with non-retryable errors (invalid signature, wrong hash, missing attestations) are added to an in-memory quarantine set. On subsequent sync rounds, quarantined blocks are skipped entirely — no validation attempt, no log output. This prevents repeated "block rejected" log spam from permanently-invalid blocks (e.g., blocks from a previous network epoch).

The quarantine is cleared on node restart. This is intentional: ledger state after a restart may differ, so a block that was rejected earlier could become valid.
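
An in-memory quarantine of this kind can be as small as a mutex-guarded set of block hashes. The sketch below is illustrative only; the type and method names are hypothetical, and blocks are identified by hash string.

```go
package main

import "sync"

// quarantine is an in-memory set of block hashes known to be invalid.
// Nothing is persisted, so the set starts empty after every restart.
type quarantine struct {
	mu     sync.Mutex
	hashes map[string]struct{}
}

func newQuarantine() *quarantine {
	return &quarantine{hashes: make(map[string]struct{})}
}

// add marks a block hash as quarantined.
func (q *quarantine) add(hash string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.hashes[hash] = struct{}{}
}

// contains reports whether a block hash is quarantined.
func (q *quarantine) contains(hash string) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	_, ok := q.hashes[hash]
	return ok
}

// filter returns the hashes that are not quarantined, preserving order.
// Quarantined blocks are dropped without validation or logging.
func (q *quarantine) filter(hashes []string) []string {
	var kept []string
	for _, h := range hashes {
		if !q.contains(h) {
			kept = append(kept, h)
		}
	}
	return kept
}
```

A client would call filter on the received blocks before the retry passes and call add for each block that fails with a non-retryable error.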

Security Considerations

Frontier Privacy

The sync protocol reveals the requester's full frontier set to the responding peer. A malicious peer can learn which accounts the node knows about and the latest block it holds for each. Future improvement: use a bloom filter or frontier hash instead of sending the full frontier map.

Mitigations in place against resource exhaustion and amplification:

  • Request size limit (1 MiB) -- Prevents OOM from huge frontier maps
  • Response size limit (10 MiB) -- Prevents OOM from oversized responses
  • Frontier cap (10,000) -- Limits frontier entries per request
  • Block caps (10,000 each on client and server) -- Prevents CPU/memory exhaustion from chain walking
  • Rate limiting (5s cooldown) -- Prevents sync flooding
  • Stream deadline (60s) -- Prevents hung connections
  • Amplification resistance -- Unrecognized frontiers are skipped, not replied with full chains

Relationship to Gossip

Sync and gossip are complementary:

           Gossip                 Sync
           ──────                 ────
Delivery   Best-effort broadcast  Reliable catch-up
Latency    Real-time              Periodic (10s) + on-connect
Scope      Individual blocks      Entire ledger delta
Direction  Push                   Pull

Gossip handles the fast path -- new blocks propagate in near-real-time. Sync handles the slow path -- catching up after downtime, missed messages, or network partitions.