Improve sync durability: health status API, degraded callbacks, flush guarantee #119

@khaliqgant

Problem

Relayfile's WebSocket sync layer has several durability gaps that matter for production agent workloads:

  1. No health visibility — callers have no way to know if the sync connection is degraded (polling fallback, stale, reconnecting) versus healthy. Agents can silently read stale data.

  2. Ping interval too long — default 30s ping interval means a broken connection takes up to 30s to detect. For a real-time coordination layer this is too long.

  3. No degraded/recovered callbacks — no way for the caller to react when the connection enters or exits a degraded state (e.g., to surface a warning in UI or pause agent writes).

  4. No flush guarantee on local-mount: AutoSyncHandle has no way to drain pending debounced writes before shutdown or before a critical operation. Files still inside the 50ms debounce window can be dropped if the process exits.

  5. No watcher health signal — if the @parcel/watcher subscription fails silently, the mount appears healthy but local changes stop being detected.

Context

These gaps were surfaced during a comparison with mirage's architecture. Mirage is a pull-on-demand VFS library (no daemon, no WebSocket) — it trades real-time push for simplicity. Relayfile's push-first model is a genuine advantage for multi-agent coordination, but only if the push channel is reliable and observable.

Changes (in PR)

  • packages/sdk/typescript/src/sync.ts

    • Halve DEFAULT_PING_INTERVAL_MS from 30000 → 15000
    • Add RelayFileSyncHealthStatus interface with degraded, degradedReason, stateEnteredAt, lastFrameAt, reconnectAttempts
    • Add onDegraded and onRecovered callbacks to RelayFileSyncOptions
    • Add getHealthStatus() public method
    • Fire onDegraded when polling fallback activates; fire onRecovered on successful WebSocket reconnect
  • packages/local-mount/src/auto-sync.ts

    • Add flushPending(opts?) — drains all debounced writes and runs a full reconcile, returns count of files flushed
    • Add watchersHealthy() — returns true only when both mount and project watchers are successfully subscribed
  • packages/sdk/typescript/src/index.ts

    • Export RelayFileSyncHealthStatus type
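
A minimal sketch of how the health status and the degraded/recovered callbacks could fit together. Only the field names, the callback names, and getHealthStatus() come from the list above. The field types, the callback signatures, and the SyncHealthTracker helper are assumptions made for illustration and are not the SDK's actual implementation.

```typescript
// Field names are taken from the issue; the field types are assumptions.
interface RelayFileSyncHealthStatus {
  degraded: boolean;
  degradedReason?: string;
  stateEnteredAt: number; // epoch ms when the current state began (assumed unit)
  lastFrameAt: number | null; // epoch ms of the last frame received, if any
  reconnectAttempts: number;
}

// Hypothetical subset of RelayFileSyncOptions; callback signatures are assumed.
interface HealthCallbacks {
  onDegraded?: (status: RelayFileSyncHealthStatus) => void;
  onRecovered?: (status: RelayFileSyncHealthStatus) => void;
}

// Hypothetical helper that tracks state transitions; not part of the SDK.
class SyncHealthTracker {
  private status: RelayFileSyncHealthStatus;
  private readonly callbacks: HealthCallbacks;
  private readonly now: () => number;

  constructor(callbacks: HealthCallbacks = {}, now: () => number = () => Date.now()) {
    this.callbacks = callbacks;
    this.now = now;
    this.status = {
      degraded: false,
      stateEnteredAt: this.now(),
      lastFrameAt: null,
      reconnectAttempts: 0,
    };
  }

  recordFrame(): void {
    this.status.lastFrameAt = this.now();
  }

  recordReconnectAttempt(): void {
    this.status.reconnectAttempts += 1;
  }

  // Call when polling fallback activates. Fires onDegraded once per transition.
  markDegraded(reason: string): void {
    if (this.status.degraded) return;
    this.status = {
      ...this.status,
      degraded: true,
      degradedReason: reason,
      stateEnteredAt: this.now(),
    };
    this.callbacks.onDegraded?.(this.getHealthStatus());
  }

  // Call on a successful WebSocket reconnect. Fires onRecovered once per transition.
  markRecovered(): void {
    if (!this.status.degraded) return;
    this.status = {
      degraded: false,
      stateEnteredAt: this.now(),
      lastFrameAt: this.status.lastFrameAt,
      reconnectAttempts: 0, // assumption: the counter resets on recovery
    };
    this.callbacks.onRecovered?.(this.getHealthStatus());
  }

  // Returns a copy so callers cannot mutate internal state.
  getHealthStatus(): RelayFileSyncHealthStatus {
    return { ...this.status };
  }
}
```

In this sketch a caller would pass onDegraded to pause agent writes and onRecovered to resume them, and would check getHealthStatus().degraded before reading a critical file. Guarding each transition keeps the callbacks from firing repeatedly while the connection stays in the same state.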

Impact

Agents and orchestrators can now:

  • Query sync health before reading critical files
  • React to degraded state (pause writes, show warning, switch to safe mode)
  • Flush pending writes before shutdown or handoff
  • Verify watcher subscriptions are alive before trusting local state
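
The flush guarantee can be illustrated with a small debouncer that is able to drain itself on demand. This is a sketch under stated assumptions: DebouncedWriter, WriteFn, pendingCount, and drainBeforeExit are hypothetical names, the opts parameter is omitted, and the full reconcile that flushPending(opts?) is described as running is left out. It shows only the draining of debounced writes and the returned count.

```typescript
// Hypothetical write callback; in local-mount this would push a file to the remote.
type WriteFn = (path: string) => Promise<void>;

// Hypothetical debouncer, not the actual AutoSyncHandle implementation.
class DebouncedWriter {
  private readonly timers = new Map<string, ReturnType<typeof setTimeout>>();
  private readonly write: WriteFn;
  private readonly debounceMs: number;

  constructor(write: WriteFn, debounceMs = 50) {
    this.write = write;
    this.debounceMs = debounceMs;
  }

  // Number of paths currently waiting inside the debounce window.
  get pendingCount(): number {
    return this.timers.size;
  }

  // Schedule a write; repeated calls for the same path restart its timer.
  schedule(path: string): void {
    const existing = this.timers.get(path);
    if (existing !== undefined) clearTimeout(existing);
    const timer = setTimeout(() => {
      this.timers.delete(path);
      this.write(path).catch((err) => console.error("write failed", path, err));
    }, this.debounceMs);
    this.timers.set(path, timer);
  }

  // Cancel every pending timer, run the writes immediately, and
  // resolve with the number of files flushed.
  async flushPending(): Promise<number> {
    const paths = Array.from(this.timers.keys());
    for (const path of paths) {
      clearTimeout(this.timers.get(path));
      this.timers.delete(path);
    }
    await Promise.all(paths.map((path) => this.write(path)));
    return paths.length;
  }
}

// Usage: drain before shutdown or handoff so nothing in the window is lost.
async function drainBeforeExit(writer: DebouncedWriter): Promise<void> {
  const flushed = await writer.flushPending();
  console.log(`flushed ${flushed} pending write(s)`);
}
```

Without the flush, a process that exits inside the debounce window never runs the timer callback, so the write is silently dropped. Awaiting the flush before exit closes that gap.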
