test(consensus): reproduce fast_forward_sync num_blocks u64 underflow (audit #705)#742
Open
keanji-x wants to merge 1 commit into
Open
test(consensus): reproduce fast_forward_sync num_blocks u64 underflow (audit #705)#742keanji-x wants to merge 1 commit into
keanji-x wants to merge 1 commit into
Conversation
…#705) F3 of galxe/RESTART_RECOVERY_FINDINGS.md. `BlockStore::fast_forward_sync` (aptos-core/consensus/src/block_storage/sync_manager.rs:509-514) computes: let num_blocks = highest_quorum_cert.certified_block().round() - highest_commit_cert.ledger_info().ledger_info().round() + 1; assert!(num_blocks < std::usize::MAX as u64); // does NOT catch underflow on 64-bit `highest_commit_cert` is `effective_commit_cert = max(local_hcc, remote_hcc)` by round (sync_to_highest_quorum_cert, :358). When the local commit root is AHEAD of the incoming SyncInfo's hqc certified round, `hqc_round - commit_round + 1` underflows to ~u64::MAX. The guard only handled 32-bit overflow; on 64-bit usize::MAX == u64::MAX so the wrapped value passes the assert and an absurd count is handed to retrieve_blocks_in_range (debug: panic; release: wrap + giant fetch -> sync stall). Adds a #[cfg(test)] module (test code only, no production-code edits) that builds the exact same QuorumCert / WrappedLedgerInfo objects with commit_round (100) > hqc_round (90) and evaluates the identical num_blocks expression. Evidence (cargo test -p aptos-consensus, profile has overflow checks on): test ...::ffsync_num_blocks_underflows_when_commit_ahead_of_hqc ... thread '...' panicked at sync_manager.rs:1446:26: attempt to subtract with overflow FAILED test ...::ffsync_num_blocks_ok_when_hqc_ahead_of_commit ... ok The underflow test FAILS on current code (overflow panic at the production expression); it would PASS once a checked_sub + early-return guard (commit_round >= hqc_round) is added. The happy-path test guards against an over-broad fix. Refs Galxe/gravity-audit#705 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Reproduction (test-only PR) for gravity-audit#705 — F3 in
galxe/RESTART_RECOVERY_FINDINGS.md: au64underflow ofnum_blocksinBlockStore::fast_forward_sync.The call site at
aptos-core/consensus/src/block_storage/sync_manager.rs:509-514:highest_commit_certiseffective_commit_cert = max(local_hcc, remote_hcc)by round (the gravity-specific change at:358). When the local commit root is ahead of the incoming SyncInfo's hqc certified round (a stale / forked SyncInfo whose hqc block the node does not have),hqc_round - commit_round + 1underflows to ~u64::MAX. The guard only considered 32-bit overflow; on 64-bitusize::MAX == u64::MAX, so the wrapped value still satisfies< usize::MAXand the assert passes, then the absurd count is handed toretrieve_blocks_in_range(.., num_blocks, ..):attempt to subtract with overflowWhat this PR adds
Test code only — no production-code edits. A
#[cfg(test)] mod ffsync_numblocks_underflow_705_testsappended tosync_manager.rsthat builds the exact sameQuorumCert/WrappedLedgerInfoobjects the production path operates on (usinggen_test_certificate+into_wrapped_ledger_info), withcommit_round = 100 > hqc_round = 90, and evaluates the identicalnum_blocksexpression.ffsync_num_blocks_underflows_when_commit_ahead_of_hqc— FAILS on current code (overflow panic at the production expression). Would PASS once achecked_sub+ early-return guard (commit_round >= hqc_round) is added.ffsync_num_blocks_ok_when_hqc_ahead_of_commit— passes; guards against an over-broad fix breaking the happy path.Evidence
sync_manager.rs:1446is the test's verbatim copy of the productionnum_blocksexpression (line 509). The crate's test profile has overflow checks enabled, so the underflow surfaces as a panic.Suggested fix (not in this PR)
Use
checked_subwith an early return when the node is already ≥ the target (commit round ≥ hqc round).Refs Galxe/gravity-audit#705
🤖 Generated with Claude Code