
feat(tonic-xds): add gRFC A42 ring-hash picker + member tracking#2695

Open
madhurishgupta wants to merge 1 commit into grpc:master from madhurishgupta:madhurishgupta/a42-pr2-ring-hash-picker

@madhurishgupta (Contributor)

Summary

Adds the gRFC A42 ring-hash load-balancing picker to the loadbalance stack, giving consistent-hash request affinity: requests carrying the same hash key are routed to the same backend.

What's included

  • RingHashPicker (pickers/ring_hash.rs)
    • Builds a hash ring over the cluster's full healthy-EDS membership with uniform per-member weighting.
    • Entries keyed xxh64("{addr}_{i}", 0) (XXH64, seed 0); the ring is held lock-free behind an ArcSwap and rebuilt only on membership change.
    • pick() reads RouteDecision.request_hash (per-request random fallback when absent), finds the ring position closest to that hash (first entry with hash ≥ request, wrapping), and walks clockwise to the first ready host — returning Unavailable if no ring host is ready.
  • LoadBalancer member tracking: tracks the full healthy-EDS member set (independent of connection/ejection state), rebuilding the picker's ring once per discovery drain.
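The ring construction and position lookup described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `stand_in_hash` is FNV-1a used only to keep the example dependency-free (the PR uses XXH64 with seed 0), and `RingEntry`, `build_ring`, `ring_position`, and the `per_member` parameter are hypothetical names. The ArcSwap wrapper and the random fallback for a missing request hash are omitted.

```rust
/// Stand-in 64-bit hash (FNV-1a). Assumption for illustration only:
/// the PR keys entries with XXH64 (seed 0); this is NOT XXH64.
fn stand_in_hash(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// One ring entry: a hash position and the member address that owns it.
#[derive(Debug, Clone, PartialEq)]
struct RingEntry {
    hash: u64,
    addr: String,
}

/// Build a uniformly weighted ring: `per_member` entries for every address,
/// each keyed by the string "{addr}_{i}", sorted by hash.
fn build_ring(members: &[&str], per_member: usize) -> Vec<RingEntry> {
    let mut ring: Vec<RingEntry> = members
        .iter()
        .flat_map(|addr| {
            (0..per_member).map(move |i| RingEntry {
                hash: stand_in_hash(format!("{}_{}", addr, i).as_bytes()),
                addr: addr.to_string(),
            })
        })
        .collect();
    // Sort by hash; break ties by address so rebuilds are deterministic.
    ring.sort_by(|a, b| a.hash.cmp(&b.hash).then_with(|| a.addr.cmp(&b.addr)));
    ring
}

/// Index of the first entry whose hash is >= `request_hash`,
/// wrapping to index 0 when the request hash is past the last entry.
fn ring_position(ring: &[RingEntry], request_hash: u64) -> Option<usize> {
    if ring.is_empty() {
        return None;
    }
    let idx = ring.partition_point(|e| e.hash < request_hash);
    Some(if idx == ring.len() { 0 } else { idx })
}
```

Because the ring is sorted, the lookup is a binary search, and two requests with the same hash always land on the same ring position until membership changes.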

Behavior notes

  • Outlier detection composes with no ring-hash-specific code. The ring is built over members; picks resolve against the ready set. An ejected host stays in the ring but is absent from ready, so its keys fall through clockwise to the next ready host.
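To illustrate why ejection needs no ring-hash-specific handling, here is a small sketch of the clockwise walk against a separate ready set. The ring is modelled as a sorted slice of `(hash, addr)` pairs and `pick_ready` is a hypothetical helper; the PR's actual types and signatures may differ.

```rust
use std::collections::HashSet;

/// Walk clockwise from the first entry with hash >= `request_hash` and return
/// the first address that is in `ready`. Returns None when no ring host is
/// ready (the caller would map that to Unavailable).
/// Assumes `ring` is sorted by hash.
fn pick_ready<'a>(
    ring: &'a [(u64, &'a str)],
    request_hash: u64,
    ready: &HashSet<&str>,
) -> Option<&'a str> {
    if ring.is_empty() {
        return None;
    }
    // Starting position; `% len` wraps to 0 when the hash is past the last entry.
    let start = ring.partition_point(|(h, _)| *h < request_hash) % ring.len();
    (0..ring.len())
        .map(|step| ring[(start + step) % ring.len()].1)
        .find(|addr| ready.contains(*addr))
}
```

With a ring of `a` at 10, `b` at 20, and `c` at 30, a request hash of 15 resolves to `b` while `b` is ready. If `b` is ejected it remains in the ring but drops out of `ready`, so the same hash falls through to `c`, and it returns to `b` once `b` is ready again.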

Testing

Added unit tests.
cargo fmt, cargo clippy, and cargo test -p tonic-xds all pass cleanly.

Plan (A42 series)

  • Previous PR: Request-hash computation + plumbing (#2686, "feat(tonic-xds): compute gRFC A42 request hash from header hash policy").
  • This PR: Add gRFC A42 ring-hash picker + member tracking
  • Third PR: CDS wiring — parse lb_policy: RING_HASH and ring_hash_lb_config (validating hash_function == XX_HASH), and select the ring-hash picker from the cluster's LB policy.
  • Fourth PR: RDS wiring — parse RouteAction.hash_policy to populate the policy list (replacing the empty scaffold here).

Implements the ring-hash LB picker on the loadbalance/ stack, with ring
construction and the hash-position walk mirroring grpc-go's ringhash balancer:

- RingHashPicker: builds an A42-conformant ring (uniform per-member weighting) —
  size = smallest multiple of N >= min_ring_size, clamped to max_ring_size;
  entries keyed xxh64("{addr}_{i}", 0); ring held lock-free behind ArcSwap.
  pick() reads RouteDecision.request_hash (per-request random fallback), finds
  the ring position closest to that hash and walks clockwise to the first ready
  host (None/Unavailable if no ring host is ready).
- ChannelPicker::on_members_changed hook (default no-op; P2C inherits it),
  delegating to RingHashPicker::rebuild.
- LoadBalancer tracks `members` (full healthy-EDS set, independent of
  connection/ejection state) and rebuilds the picker's ring once per discovery
  drain. Outlier detection composes for free: ejected hosts stay in the ring
  but are not picked (not in `ready`).
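The ring-size rule stated in the first bullet can be written out as a short sketch. `ring_size` is a hypothetical helper that follows the sentence literally: round `min_ring_size` up to a multiple of the member count N, then cap at `max_ring_size`. How entries are distributed across members when the cap applies is not covered here.

```rust
/// Ring size for uniform per-member weighting: the smallest multiple of
/// `num_members` that is >= `min_ring_size`, clamped to `max_ring_size`.
/// Returns 0 for an empty member set.
fn ring_size(num_members: usize, min_ring_size: usize, max_ring_size: usize) -> usize {
    if num_members == 0 {
        return 0;
    }
    // Ceiling division: entries each member needs to reach min_ring_size.
    let per_member = (min_ring_size + num_members - 1) / num_members;
    (per_member * num_members).min(max_ring_size)
}
```

For example, 3 members with `min_ring_size = 10` give 4 entries per member and a ring of 12; with `max_ring_size = 11` the result is capped at 11.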

This change currently supports uniform weighting and an eager-connect pick that
selects the first ready host. The remaining A42 connection semantics (IDLE-start
with connect-on-pick, queuing while CONNECTING, the TRANSIENT_FAILURE-aware
walk, weight-proportional rings, and aggregated-connectivity-state rules) depend
on the load balancer's connection model and are deferred. The picker is not yet
selected by lb_policy (default-wired in a later change).

Tests: 16 picker unit tests + 1 LoadBalancer member-tracking integration test.
@madhurishgupta madhurishgupta force-pushed the madhurishgupta/a42-pr2-ring-hash-picker branch from f9b2dcc to 575ebed Compare June 19, 2026 00:29