feat: add attribution fuzzer for e2e randomized testing by svarlamov · Pull Request #1414 · git-ai-project/git-ai

svarlamov · 2026-05-21T13:36:09Z

Summary

- Adds a self-contained attribution fuzzer (tests/integration/fuzzer/) that performs randomized, pathological testing of git-ai's attribution system
- Uses a char-based oracle: each edit step allocates a unique character mapped to an attribution type (AI or KnownHuman), allowing deterministic verification at blame time without complex state tracking
- Tests all git operations: multi-edit commits, amend chains, fast-forward merges, rebases, squash merges, and multi-file interleaving
- Includes 21 fixed-seed tests across 3 profiles (standard, rewrite-heavy, checkpoint-heavy) plus a random-seed test that prints its seed on failure for reproduction
- Adds Taskfile entries (test:fuzz, test:fuzz:all, test:fuzz:heavy) for running the fuzzer
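
The fixed-seed and random-seed tests both rely on the same property: a seed fully determines the operation sequence, so printing the seed is enough to replay a failure. Below is a minimal sketch of that idea. The `Op` enum, the hand-rolled SplitMix64 generator, and `generate_ops` are illustrative assumptions, not the fuzzer's actual types or RNG.

```rust
/// Hypothetical operation kinds; the real fuzzer has many more.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Op {
    AiEdit,
    HumanEdit,
    Commit,
    Amend,
}

/// SplitMix64: a small deterministic PRNG, used here only so the
/// sketch needs no external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Generate `len` operations from `seed`. The same seed always yields
/// the same sequence, which is what makes a failure reproducible.
fn generate_ops(seed: u64, len: usize) -> Vec<Op> {
    const OPS: [Op; 4] = [Op::AiEdit, Op::HumanEdit, Op::Commit, Op::Amend];
    let mut rng = SplitMix64(seed);
    (0..len)
        .map(|_| OPS[(rng.next_u64() % OPS.len() as u64) as usize])
        .collect()
}
```

A random-seed test would pick a seed at runtime, include it in every assertion message, and could then be replayed by passing that seed to the same generator.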

Design

The fuzzer allocates a unique char (A-Z, a-z, 0-9, then Unicode from U+0100 upward) for each edit step and maps it to an attribution type. After every commit or rewrite operation, it runs git-ai blame and verifies that the author on each line matches the expected attribution for that line's character. This sidesteps the need to track complex state through rewrites: the char on disk tells you what the attribution should be.
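
The char-based oracle can be sketched roughly as follows. This is an illustration under assumptions, not the code in oracle.rs: `CharRegistry`, `allocate`, and `mismatches` are hypothetical names, and the sketch assumes each line's first character identifies the edit that wrote it.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Attribution {
    Ai,
    KnownHuman,
}

/// Maps each allocated character to the attribution of the edit that wrote it.
struct CharRegistry {
    next: usize,
    map: HashMap<char, Attribution>,
}

impl CharRegistry {
    fn new() -> Self {
        Self { next: 0, map: HashMap::new() }
    }

    /// Allocate the next unused character: A-Z, a-z, 0-9, then U+0100 onward.
    fn allocate(&mut self, attr: Attribution) -> char {
        let c = ('A'..='Z')
            .chain('a'..='z')
            .chain('0'..='9')
            .nth(self.next)
            .unwrap_or_else(|| {
                // 62 ASCII chars are exhausted; continue in Unicode.
                char::from_u32(0x0100 + (self.next - 62) as u32)
                    .expect("ran out of non-surrogate code points")
            });
        self.next += 1;
        self.map.insert(c, attr);
        c
    }

    /// Compare the attribution blame reported for each line against what
    /// the line's first character implies. Returns the 1-based numbers of
    /// lines that disagree (unregistered chars count as disagreement here).
    fn mismatches(&self, lines: &[&str], blamed: &[Attribution]) -> Vec<usize> {
        let mut bad = Vec::new();
        for (i, (line, got)) in lines.iter().zip(blamed).enumerate() {
            let expected = line.chars().next().and_then(|c| self.map.get(&c));
            if expected != Some(got) {
                bad.push(i + 1);
            }
        }
        bad
    }
}
```

Because the expected attribution is recovered from the file content itself, the check needs no history of which rewrite moved which line.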

Known Findings

The fuzzer identifies real attribution bugs in rewrite operations (amend, rebase, squash) where AI-attributed lines lose their attribution and show as the committer. These are pre-existing bugs, not regressions. Seeds 1, 5, and 8 reliably reproduce these issues.

Test plan

- `cargo check --tests` passes
- Fixed-seed tests are deterministic and reproducible
- Run `task test:fuzz` to execute the standard fuzzer suite
- Verify that failures are in rewrite operations (known bugs), not in normal commit flows

🤖 Generated with Claude Code

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 6 additional findings.

devin-ai-integration

Devin Review found 1 new potential issue.

🐛 1 issue in files not directly in the diff

🐛 Unregistered '?' character in execute_untracked_interleave causes oracle panic (`tests/integration/fuzzer/operations.rs:4096-4101`)

The execute_untracked_interleave function adds '?' characters to file_state.lines (operations.rs:4100-4101), and the comment at line 4097 claims "the oracle will skip unknown chars during blame verification." However, the oracle's verify_blame function (oracle.rs:176-189) does not skip unknown chars — it panics via unwrap_or_else when self.get(expected_char) returns None for any unregistered character.

When this operation is triggered via CombinedOp::UntrackedInterleave in engine.rs:1205-1214, verify_main_file is called immediately after, passing file_state.lines (which now contains '?' chars) to registry.verify_blame. The lookup fails and the test panics. This will cause spurious fuzzer failures whenever UntrackedInterleave is randomly selected (~1/32 probability per combined op).
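
The gap between the comment ("the oracle will skip unknown chars") and the behavior described above comes down to what the lookup does on a miss. The sketch below makes that policy explicit. It is not the actual verify_blame code: the function name, the `UnknownChars` policy enum, and the use of a `Result` in place of a panic are assumptions made so the two behaviors can be compared side by side.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Attribution {
    Ai,
    KnownHuman,
}

#[derive(Debug, PartialEq, Eq)]
enum VerifyError {
    UnregisteredChar { line: usize, ch: char },
    Mismatch { line: usize },
}

/// Policy for characters the registry has never allocated.
#[derive(Clone, Copy)]
enum UnknownChars {
    Reject,
    Skip,
}

/// Each entry in `lines` is (char on disk, attribution reported by blame).
/// Returns how many lines were actually verified.
fn verify_blame_sketch(
    registry: &HashMap<char, Attribution>,
    lines: &[(char, Attribution)],
    policy: UnknownChars,
) -> Result<usize, VerifyError> {
    let mut verified = 0;
    for (i, (ch, blamed)) in lines.iter().enumerate() {
        let expected = match (registry.get(ch), policy) {
            (Some(expected), _) => expected,
            // What the comment in operations.rs claims happens.
            (None, UnknownChars::Skip) => continue,
            // What the review says happens (as a panic in the real code).
            (None, UnknownChars::Reject) => {
                return Err(VerifyError::UnregisteredChar { line: i + 1, ch: *ch })
            }
        };
        if expected != blamed {
            return Err(VerifyError::Mismatch { line: i + 1 });
        }
        verified += 1;
    }
    Ok(verified)
}
```

Either registering '?' before use or switching the oracle to a skip policy would remove the spurious failure; which one is appropriate depends on whether untracked lines should be verified at all.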

View 18 additional findings in Devin Review.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Multiple interleaved edits (AI + Human) per commit cycle
- Rewrite ops on same file (amend chains, rebase, squash merge)
- Multi-file rapid-fire checkpoint bursts to stress daemon
- OverwriteAll and destructive strategies enabled
- Removed Untracked attribution type (known design limitation: content after AI checkpoint without subsequent checkpoint gets attributed to AI by design)
- Replace cherry-pick with ff-merge (known daemon reflog ambiguity bug with cherry-pick in repos with many commits)

Found real bugs:
- AI attribution loss during rewrite operations (seeds 1, 5, 8)
- Known human edits inserted between AI lines lose attribution
- Rapid checkpoint interleaving reveals attribution boundary issues

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Rename RewriteOp::CherryPick to FfMerge to match actual behavior
- Rename execute_cherry_pick_same_file to execute_ff_merge
- Remove unused parameters (_file_state, _allow_destructive)
- Pass actual seed to verify_blame instead of hardcoded 0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…to fuzzer

Three major enhancements to the attribution fuzzer:

1. Partial staging: tests line-level partial commits, selective file commits, and interleaved partial commits across multiple files. Forces git-ai to correctly split working log entries between committed and uncommitted attribution.
2. Session verification: after each commit, verifies that the authorship note contains the correct session types (AI sessions for AI lines, h_ entries for human lines). Catches session data loss during rewrite operations.
3. Destructive/pathological operations: hard reset, soft reset + recommit, checkout discard, stash/pop cycles, dirty branch switches, reset-and-reedit, and checkpoint-then-overwrite. Stresses the daemon with rapid HEAD changes and discarded working state.

New fuzzer profiles: partial_stage_heavy (60% partial ops), destructive_heavy (50% destructive ops).

New Taskfile entries: test:fuzz:partial, test:fuzz:destructive.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

New operation categories:
- File operations: rename (git mv), delete+recreate, move to subdirectory, concurrent multi-file creation
- Stress operations: rapid checkpoint bursts (5-15 rapid-fire), double commit rapid fire, alternating amend (3-6 AI/human flips), amend attribution flip, multi-commit rebase (3-5 commits)
- Enhanced destructive: mixed reset, stash with pathspec, orphaned checkpoints (fire then discard), empty commit interleaving
- Enhanced partial staging: squash merge with partial staging

New fuzzer profiles:
- file_ops_heavy: 45% file operations
- stress_heavy: 55% stress operations
- chaos: equal distribution across ALL operation types (max pathological)

New test suites: fuzz_file_ops_*, fuzz_stress_*, fuzz_chaos_* (including random seed chaos test). Total: 51 fuzzer test cases.

Updated Taskfile with test:fuzz:partial and test:fuzz:destructive targets.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

New pathological operations:
- Thrash: rapid cycle of edit→commit→edit→discard/amend/recommit
- Rebase-then-amend: rebase branch, then immediately amend the result
- Checkpoint on non-existent file: fire checkpoint before file exists
- Two-branch merge: create divergent branches, merge both back (true merge commit with multiple parents)
- Exponential amend: double file size each amend step (1→2→4→8→16→32)
- Session interleave: 4-10 alternating AI/human edits with mixed strategies (append/prepend/insert) before a single commit

Total operations available: 33 distinct pathological patterns across 5 categories (rewrite, destructive, partial staging, file ops, stress).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…e, multi-squash, and more

Adds 10 new pathological operations:
- cherry_pick_conflict: cherry-pick with deliberate conflicts
- rapid_branch_merge: rapid create-commit-merge branch cycles
- rebase_cherry_pick_combo: interleaved rebase and cherry-pick
- reset_edit_recommit: mixed reset then re-edit and recommit
- checkpoint_storm: 5-15 rapid-fire checkpoints before single commit
- partial_amend_flip: partial stage + amend with flipped attribution
- discard_then_reedit: checkout discard then new attribution
- create_delete_batch: batch file creation then random deletion
- multi_squash: N commits squashed into one via soft reset
- alternating_amend_storm: 4-10 rapid amends alternating AI/human

New CombinedOp category in generators with combined_heavy profile. Taskfile entry for test:fuzz:combined added.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… more combined ops

New combined operations:
- rename_chain: sequential renames A→B→C→D with edits between
- fixup_squash: main commit + N fixup commits then squash
- empty_tree_rebuild: delete all files, commit, recreate from scratch
- revert_then_redo: commit, revert, then new attribution
- selective_multi_file_commit: edit multiple files, commit in batches
- amend_with_deletion: amend a commit to also delete a file
- recommit_loop: repeated soft-reset + recommit cycles

Total combined ops now: 14 variants across all pathological patterns.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…cases

New operations targeting specific bug-prone areas:
- initial_carryover: multiple checkpoint rounds without commit
- merge_conflict_resolve: branch merge with conflict resolution
- double_checkpoint_race: rapid AI→human→AI checkpoints on same file
- hunk_partial_stage: stage only first hunk, commit, then rest
- rename_during_edit: rename one file while editing another in same commit
- noop_overwrite: checkpoint identical content then real edit
- concurrent_sessions: multiple AI/human sessions interleaved
- amend_shrink: amend that reduces file size (removes lines)

Total combined ops: 22 variants. Total operations across all categories: 57.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… merge, and more

New operations targeting daemon sequencer and history rewriting:
- deep_rebase_chain: N-deep branch rebase (3-7 commits) onto diverged base
- untracked_interleave: edits without checkpoints mixed with real attributions
- rapid_head_change: multiple commits then hard reset to middle, new branch
- three_way_merge: create two branches, merge both back (octopus-style)
- edge_case_commit_flags: empty messages, long messages, special chars

Total combined ops: 27 variants. Total operation types across all categories: 62.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…d more extreme ops

Final batch of pathological operations:
- rapid_lifecycle: checkpoint→commit→amend cycles in rapid succession
- multi_stash: create multiple stash entries, pop in sequence
- overwrite_and_rollback: OverwriteAll + soft reset + new content
- cherry_pick_chain: N sequential cherry-picks from a source branch
- interleaved_amend_new: alternating new commits and amends

Total combined ops: 32 variants. Total unique operation types across all categories: 67. Test count: 58 test cases across 9 profiles.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Marathon tests run 150-200 operations in chaos mode for maximum coverage. Marked #[ignore] so they don't run in the normal test suite but can be invoked via `task test:fuzz:marathon`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ion holes

Adds squash-specific operations that replicate real-world user reports of missing data/holes in attribution after squash merges:
- squash_mixed_attribution: 4-8 commits alternating AI/human at different positions
- squash_after_amend: branch commits amended before squash (pre-rewritten notes)
- squash_then_amend: squash then immediately amend (most common hole cause)
- squash_rebased_branch: rebase branch then squash (double rewrite)
- squash_with_overwrites: later commits overwrite earlier lines then squash
- squash_multi_file: multiple files with different attributions squashed
- squash_reset_recommit: squash, soft reset, recommit (double-squash pattern)
- squash_nonlinear_branch: branch with merge commits then squash

Also adds:
- squash_heavy FuzzerConfig profile (55% combined ratio)
- 7 fixed-seed + 1 random-seed squash test cases
- task test:fuzz:squash Taskfile entry

All seeds immediately find the known "AI lines present but no AI session" bug, confirming the fuzzer correctly catches the reported squash issues.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Implements exact session tracking in the oracle: after each successful verification, all session IDs (s_* for AI, h_* for human) are extracted from the authorship note and stored. On subsequent verifications, the oracle asserts that ALL previously-committed sessions are still present in the current HEAD's note. Sessions must never disappear through rewrite operations (amend, squash, rebase, cherry-pick).

This catches the specific bug class where sessions representing "failed paths" (overwritten contributions) are lost during rewrites. The invariant: sessions accumulate monotonically; a commit's authorship note must be a superset of all source commits' sessions.

Session tracking is reset before destructive operations (hard reset, branch switch, thrash, mixed reset) that legitimately drop commits, and before combined ops that involve resets (ResetEditRecommit, EmptyTree, RecommitLoop, RapidHeadChange, OverwriteAndRollback, SquashResetRecommit).

Immediately finds real bugs: seed 0 shows 3 AI sessions lost after an amend operation, confirming the rewrite hooks don't carry forward all source sessions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
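
The monotonic-session invariant amounts to a superset check against everything seen so far. The sketch below assumes session IDs have already been extracted from the authorship note as strings; `SessionTracker` and its methods are hypothetical names, not the oracle's actual API.

```rust
use std::collections::BTreeSet;

/// Remembers every session ID seen in a successfully verified note.
struct SessionTracker {
    seen: BTreeSet<String>,
}

impl SessionTracker {
    fn new() -> Self {
        Self { seen: BTreeSet::new() }
    }

    /// Check that every previously seen session is still present in
    /// `current`. On success, record any new sessions. On failure,
    /// return the sorted list of sessions that disappeared.
    fn verify_and_record(&mut self, current: &[&str]) -> Result<(), Vec<String>> {
        let current: BTreeSet<String> = current.iter().map(|s| s.to_string()).collect();
        let missing: Vec<String> = self.seen.difference(&current).cloned().collect();
        if !missing.is_empty() {
            return Err(missing);
        }
        self.seen.extend(current);
        Ok(())
    }

    /// Called before operations that legitimately drop commits
    /// (for example a hard reset), so later checks start fresh.
    fn reset(&mut self) {
        self.seen.clear();
    }
}
```

A failing check reports exactly which sessions vanished, which is the kind of output that would surface a case such as sessions lost after an amend.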

Remove dead code (empty if-bodies, unused snapshot_sessions/restore_sessions), remove incorrect #[allow(dead_code)], cap operation_log growth at 500 entries to bound memory in marathon mode, and add workflow operations module.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
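
One way to cap a log at a fixed number of entries is a ring buffer that evicts the oldest entry on overflow. The sketch below assumes that eviction policy and a `String` entry type; the commit message only states the 500-entry cap, so the `OperationLog` type and the choice to drop the oldest entries are assumptions.

```rust
use std::collections::VecDeque;

/// Cap taken from the commit message.
const MAX_LOG_ENTRIES: usize = 500;

/// A bounded log: memory use stays constant however long the run is.
struct OperationLog {
    entries: VecDeque<String>,
}

impl OperationLog {
    fn new() -> Self {
        Self { entries: VecDeque::with_capacity(MAX_LOG_ENTRIES) }
    }

    /// Append an entry, evicting the oldest one once the cap is reached.
    fn push(&mut self, entry: impl Into<String>) {
        if self.entries.len() == MAX_LOG_ENTRIES {
            self.entries.pop_front();
        }
        self.entries.push_back(entry.into());
    }
}
```

Keeping the most recent entries is usually what matters for a failure report, since the operations just before the failing verification are the likely culprits.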

- Fix execute_fixup_autosquash: abort failed rebase before retrying with different base (prevents "rebase already in progress" error)
- Fix execute_stash_pop_cycle: use read_file_state_from_disk helper instead of raw fs::read_to_string (handles missing file after pop)
- Fix execute_cherry_pick_conflict: use .ok() for abort (abort can fail if cherry-pick didn't leave conflicted state)
- Fix execute_amend_chain: make amend fallible with early return instead of panicking if amend fails

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…rathon

Global sh: vars (TEST_BINARY_ARGS) aren't re-evaluated when subtask vars override NO_CAPTURE/EXTRA_TEST_BINARY_ARGS. Add SUBTASK_BINARY_ARGS param to test:base that subtasks can set directly, bypassing the sh: evaluation timing issue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

When multiple valid chains exist in the HEAD reflog (common in repos with many operations), pick the most recent one instead of erroring. Since we iterate chronologically (oldest first), the last match corresponds to the most recent reflog entries and is the correct chain for the daemon.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
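
The selection rule described here (iterate oldest first, keep the last match) can be sketched as below. `ReflogChain`, its fields, and `pick_most_recent_chain` are hypothetical stand-ins; the daemon's real chain representation and validity test are not shown in this PR.

```rust
/// Hypothetical candidate chain found while scanning the HEAD reflog.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ReflogChain {
    /// Position of the chain's first entry, counted from the oldest entry.
    start_index: usize,
    /// Whether the chain satisfies whatever validity check the caller applies.
    is_valid: bool,
}

/// `chains` is ordered oldest first. Instead of erroring when more than
/// one chain is valid, keep overwriting the choice so the last valid
/// chain (the most recent one) wins.
fn pick_most_recent_chain(chains: &[ReflogChain]) -> Option<&ReflogChain> {
    let mut chosen = None;
    for chain in chains {
        if chain.is_valid {
            chosen = Some(chain);
        }
    }
    chosen
}
```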

- Fix 1: Panic on blame line count mismatch when verifiable chars exist (instead of silently skipping). Divergence only tolerated after mark_all_unverifiable.
- Fix 5: verify_multi_file_commit checks all files in a commit have their attribution tracked in the note.
- Fix 9: verify_note_line_ranges parses note attestation entries and verifies AI sessions only claim AI-attributed lines and vice versa.
- Fix 10: verify_note_schema validates note structure (separator, valid JSON, required keys, attestation format).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Root causes fixed:
- fast-import race on refs/notes/ai: concurrent note writes would silently fail when the ref tip moved between read and write. Added retry loop with backoff for both notes_add_batch and notes_add_blob_batch.
- Rebase skipping commits with existing notes: rewrite_authorship_after_rebase_v2 would skip new commits that already had a note, even when the original had AI attestation data not present in the new note. Now always reprocesses when the original has AI data.
- Empty pathspecs during rewrite-path post-commit: when working log is empty after a rewrite, fall back to final_state_override keys as pathspecs.
- Double-processing overwriting valid notes: never overwrite an existing note that has more attestation entries than the newly-generated one.
- Family sequencer blocking checkpoints behind PendingRoot: checkpoints are now extracted eagerly and sorted before commands to ensure working log is populated before commands read it.
- Synthetic human replay on already-archived commits: skip replay when old-{sha} archive already exists.
- Trace payload execution without family lock: acquire side_effect_exec_lock.
- Fuzzer: use dirty_files for all checkpoints (pre and post edit) to eliminate disk-read races, add sync_daemon_force after stash ops, use reset --hard for stash conflict recovery, add file existence checks in oracle.

Verified: 3 consecutive clean runs of 71 tests at 12 threads, 0 failures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
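
The retry-with-backoff fix for the refs/notes/ai race follows a common pattern, sketched generically below. This is not the code added to notes_add_batch: the helper name, the attempt limit, and the doubling delay are assumptions chosen for illustration.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Run `op` until it succeeds or `max_attempts` is reached, doubling the
/// delay between attempts. Returns the last error if every attempt fails.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the final error to the caller.
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(_) => {
                sleep(delay);
                delay *= 2; // exponential backoff
                attempt += 1;
            }
        }
    }
}
```

In a race like the one described, `op` would re-read the ref tip on every attempt and write against that fresh value, so a concurrent writer costs a retry rather than a silently lost note.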

svarlamov · 2026-06-01T22:06:22Z

superseded

devin-ai-integration Bot reviewed May 21, 2026


devin-ai-integration Bot reviewed May 22, 2026


svarlamov and others added 23 commits May 28, 2026 22:47

docs: add attribution fuzzer design spec

18d60de

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat: add attribution fuzzer for e2e randomized testing

f51c2de

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

chore: add task test:fuzz commands for attribution fuzzer

ce5888b

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs: add attribution fuzzer implementation plan

c674e9b

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

svarlamov force-pushed the feat/attr-fuzzer branch from 352be58 to fc8e7eb Compare May 28, 2026 22:56

svarlamov closed this Jun 1, 2026

svarlamov wants to merge 23 commits into main from feat/attr-fuzzer
