perf: optimize checkpoint flow with 9 targeted improvements by svarlamov · Pull Request #1292 · git-ai-project/git-ai

svarlamov · 2026-05-06T23:03:36Z

Summary

Add parent-dir keyed repo discovery cache to avoid redundant directory walks for sibling files
Hoist create_dir_all out of per-file async loop and skip existing blob writes (content-addressed dedup)
Wrap dirty_files in Arc to avoid deep-cloning all file contents on every checkpoint
Eliminate redundant second Myers diff by deriving line stats from attribution diff ops (with CRLF normalization)
Avoid entries.clone() by moving entries into Checkpoint and iterating checkpoint.entries for metrics
JSONL append-only instead of full read/prune/rewrite on every checkpoint (lazy compaction at 200KB)
Short-circuit hash migration when no 7-char hashes exist (common case for modern data)
Reverse iteration in build_previous_file_state_maps to skip cloning attributions for overwritten entries

Test plan

Full test suite passes (4,549 tests, 0 failures)
All CRLF-specific tests pass (7 tests validating line stats aren't inflated)
task lint and task fmt clean
Checkpoint perf tests validate p95 < 200ms SLA

🤖 Generated with Claude Code

- Add parent-dir keyed repo discovery cache to avoid redundant directory walks for sibling files in build_checkpoint_files - Hoist create_dir_all out of per-file async loop in save_current_file_states - Skip blob write when content-addressed file already exists - Wrap dirty_files in Arc to avoid deep-cloning all file contents - Eliminate redundant second Myers diff by deriving line stats from attribution diff ops (with CRLF normalization) - Avoid entries.clone() by moving entries into Checkpoint and iterating checkpoint.entries for metrics - JSONL append-only instead of full read/prune/rewrite on every checkpoint (lazy compaction at 200KB threshold) - Short-circuit hash migration when no 7-char hashes exist (common case) - Reverse iteration in build_previous_file_state_maps to avoid cloning attributions for entries that would be overwritten Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The positional comparison in Replace blocks would produce inflated stats when CRLF changes co-occurred with insertions/deletions that shifted line positions. Replace with a sub-diff on normalized slices within the Replace block so the diff algorithm correctly identifies which lines actually changed regardless of line-ending differences. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Proves that the sub-diff approach in Replace blocks correctly handles CRLF→LF conversion combined with deletions/insertions that shift line positions — the exact scenario flagged by Devin's review. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

When read_all_checkpoints() fails during compaction (e.g. corrupted JSON line from interrupted write), unwrap_or_default() would return an empty Vec and write_all_checkpoints would erase all data. Now we skip compaction entirely on read failure, preserving existing data. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

This comment was marked as resolved.

Sign in to view

svarlamov and others added 3 commits May 11, 2026 22:45

svarlamov force-pushed the ckpt-perf-may-5-opts-follow-up branch from 71370f7 to f715011 Compare May 11, 2026 22:47

This comment was marked as resolved.

Sign in to view

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: optimize checkpoint flow with 9 targeted improvements#1292

perf: optimize checkpoint flow with 9 targeted improvements#1292
svarlamov wants to merge 4 commits into
mainfrom
ckpt-perf-may-5-opts-follow-up

svarlamov commented May 6, 2026 •

edited by devin-ai-integration Bot

Loading

Uh oh!

This comment was marked as resolved.

Uh oh!

This comment was marked as resolved.

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

svarlamov commented May 6, 2026 • edited by devin-ai-integration Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

This comment was marked as resolved.

Uh oh!

This comment was marked as resolved.

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

svarlamov commented May 6, 2026 •

edited by devin-ai-integration Bot

Loading