Add notes backend option #1253
Merged
svarlamov merged 13 commits on May 13, 2026
Phase 0 of the commit-addressable authorship notes backend. Pure refactor with no behavioral change — all 15 call-site files now route through `crate::git::notes_api::*` instead of `crate::git::refs::*`, unblocking subsequent phases that need a single dispatch point for backend selection. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 1 of the commit-addressable authorship notes backend. Lays the foundation without changing behavior:

- `notes_backend` nested config object with `kind` (git_notes|http) and optional `backend_url`, including env var overrides and CLI dotted-key set/get/unset
- New dedicated SQLite database at `~/.git-ai/internal/notes-db` with a unified `notes` table (cache + sync queue via `synced` flag), modeled on `src/metrics/db.rs`
- API types and `ApiClient::upload_notes` / `read_notes` methods following the existing CAS client pattern

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
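The `notes_backend` resolution described in this commit could look roughly like the sketch below. It is an illustration only, not the code in this PR: `NotesBackend`, `NotesBackendKind::parse`, and `resolve_notes_backend` are hypothetical names, and treating an unrecognized env value as "no override" is an assumption. The env var names and the `git_notes` default come from the PR description.

```rust
use std::collections::HashMap;

/// Which store holds authorship notes; `GitNotes` is the documented default.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NotesBackendKind {
    GitNotes,
    Http,
}

impl NotesBackendKind {
    /// Parse the two documented values, `git_notes` and `http`.
    pub fn parse(s: &str) -> Option<Self> {
        match s.trim() {
            "git_notes" => Some(Self::GitNotes),
            "http" => Some(Self::Http),
            _ => None,
        }
    }
}

/// Hypothetical in-memory form of the nested `notes_backend` config object.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NotesBackend {
    pub kind: NotesBackendKind,
    pub backend_url: Option<String>,
}

/// Resolve the effective backend: start from the config file value (if any),
/// then let env vars override it. `env` is a snapshot of the environment so
/// the function stays pure and easy to test.
pub fn resolve_notes_backend(
    file: Option<&NotesBackend>,
    env: &HashMap<String, String>,
) -> NotesBackend {
    let mut resolved = file.cloned().unwrap_or(NotesBackend {
        kind: NotesBackendKind::GitNotes,
        backend_url: None,
    });
    // Assumption: an unrecognized kind in the env var is ignored.
    if let Some(kind) = env
        .get("GIT_AI_NOTES_BACKEND_KIND")
        .and_then(|v| NotesBackendKind::parse(v))
    {
        resolved.kind = kind;
    }
    // Assumption: an empty URL env var does not clear the configured URL.
    if let Some(url) = env.get("GIT_AI_NOTES_BACKEND_URL").filter(|v| !v.is_empty()) {
        resolved.backend_url = Some(url.clone());
    }
    resolved
}
```

With no config file and an empty environment, this resolves to `git_notes` with no URL, matching the "default unchanged" behavior.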
Phase 2 of the commit-addressable authorship notes backend. Makes the HTTP backend functional end-to-end for reads and writes (without the remote actually being live yet; daemon flush is gated on auth).

- `notes_api` dispatches on `notes_backend.kind`. Http writes go to `notes-db` and signal the daemon; reads check the cache and fall through to git notes.
- Daemon `flush_notes()` mirrors `flush_cas`: dequeues pending rows and posts them via `ApiClient::upload_notes`, marking synced/failed.
- `git ai log` materializes recent notes into `refs/notes/ai-display` via `git fast-import` so log output remains correct under the HTTP backend.
- Pushes of `refs/notes/ai` are skipped when `kind = http`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
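The dispatch rule in this commit (Http writes go to the local database and signal the daemon; Http reads check the cache, then fall through to git notes) can be sketched with in-memory stand-ins. Everything here is hypothetical scaffolding for illustration: `Stores`, `write_note`, and `read_note` are not the real `notes_api` signatures, and the hash maps stand in for `refs/notes/ai` and `notes-db`.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    GitNotes,
    Http,
}

/// In-memory stand-ins for the real stores (hypothetical).
#[derive(Default)]
pub struct Stores {
    /// Stand-in for `refs/notes/ai`: commit SHA -> note content.
    pub git_notes: HashMap<String, String>,
    /// Stand-in for `notes-db`: commit SHA -> (note content, synced flag).
    pub notes_db: HashMap<String, (String, bool)>,
    /// Counts how often the daemon was asked to flush.
    pub daemon_signals: u32,
}

/// Write path: one dispatch point keyed on the backend kind.
pub fn write_note(kind: Kind, stores: &mut Stores, sha: &str, note: &str) {
    match kind {
        Kind::GitNotes => {
            stores.git_notes.insert(sha.to_string(), note.to_string());
        }
        Kind::Http => {
            // Queue locally as unsynced; the daemon uploads it later.
            stores
                .notes_db
                .insert(sha.to_string(), (note.to_string(), false));
            stores.daemon_signals += 1;
        }
    }
}

/// Read path: under Http, consult the local cache first and fall through
/// to git notes on a miss; under GitNotes, read git notes only.
pub fn read_note(kind: Kind, stores: &Stores, sha: &str) -> Option<String> {
    if kind == Kind::Http {
        if let Some((note, _synced)) = stores.notes_db.get(sha) {
            return Some(note.clone());
        }
    }
    stores.git_notes.get(sha).cloned()
}
```

The fall-through means notes written before opting in stay readable under `kind = http` even when the cache has no entry for that commit.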
Phase 3 of the commit-addressable authorship notes backend. During `git pull`, walks recent HEAD history and bulk-fetches notes for any commit not already in `notes-db`, persisting them with `synced = 1`. Replaces the `refs/notes/ai` fetch entirely on the HTTP backend. Adds `mockito` as a dev-dependency for offline ApiClient tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
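A minimal sketch of the pull-time warming step, assuming the limits quoted in the PR description (walk up to 500 commits, fetch in chunks of 100). `plan_fetches` and `warm_cache` are hypothetical names, the `fetch` closure stands in for the bulk read against the HTTP store, and the cache is a plain map rather than SQLite.

```rust
use std::collections::{HashMap, HashSet};

/// Limits taken from the PR description.
pub const MAX_WALK: usize = 500;
pub const FETCH_CHUNK: usize = 100;

/// Given recent commit SHAs (newest first) and the SHAs already cached,
/// return the batches of SHAs that still need to be fetched.
pub fn plan_fetches(recent: &[String], cached: &HashSet<String>) -> Vec<Vec<String>> {
    let missing: Vec<String> = recent
        .iter()
        .take(MAX_WALK)
        .filter(|sha| !cached.contains(*sha))
        .cloned()
        .collect();
    missing.chunks(FETCH_CHUNK).map(|c| c.to_vec()).collect()
}

/// Fetch every planned batch and persist the results as already synced.
/// `cache` maps commit SHA -> (note content, synced flag).
pub fn warm_cache<F>(
    recent: &[String],
    cache: &mut HashMap<String, (String, bool)>,
    mut fetch: F,
) where
    F: FnMut(&[String]) -> Vec<(String, String)>,
{
    let cached: HashSet<String> = cache.keys().cloned().collect();
    for chunk in plan_fetches(recent, &cached) {
        for (sha, note) in fetch(chunk.as_slice()) {
            // Notes that came from the remote need no upload: synced = 1.
            cache.insert(sha, (note, true));
        }
    }
}
```

Filtering out cached SHAs before chunking keeps a repeated `git pull` cheap: when nothing is missing, no request is planned at all.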
Phase 4 of the commit-addressable authorship notes backend. Lets users migrate existing `refs/notes/ai` entries to the HTTP backend in a single command: lists notes via `git notes`, bulk-reads content via `git cat-file --batch`, uploads in chunks of 50, and warms the local notes-db cache (`synced = 1`) so reads don't need to fall back to git notes after migration. Refuses to run unless `notes_backend.kind = http`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 5 of the commit-addressable authorship notes backend. Adds a seven-group benchmark suite under benches/notes_io.rs covering write single/batch, read hot/cold, batch read, presence check, and a 50-commit rebase. Compares the git-notes baseline to the new SQLite-backed HTTP backend. Recorded results show the HTTP backend is 100×–7000× faster than the git-notes path on read and write microbenchmarks (subprocess fork overhead dominates the git path). All four acceptance thresholds (≤1.10× reads/batch/rebase, ≤1.20× writes) are met by a wide margin. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts: # src/git/mod.rs # src/git/repository.rs
bbce55c to 2d0f858
- flush_notes: only mark synced on zero failures; retry entire batch on partial failure
- materialize_notes_for_display: use `from 0000...` to reset display ref (prevents stale accumulation)
- warm_cache_for_remote: resolve remote's HEAD ref instead of ignoring the remote parameter
- dequeue_pending: replace UPDATE...RETURNING with two-step SELECT+UPDATE for SQLite <3.35 compat
- flush_notes: use Config::fresh() instead of Config::get() so config changes take effect without daemon restart
- reference_server: cap Content-Length at 50MB to prevent OOM
- notes-db: add open_at_path() for test isolation without OnceLock singleton
- notes-db: add evict_stale_cache() with throttled invocation in daemon flush (>10k rows, >90 days)
- notes-db: add get_synced_shas() for migration resume safety
- notes_migrate: skip only synced=1 entries on re-run (pending entries still get uploaded)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
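The first `flush_notes` fix above (mark synced only on zero failures, retry the whole batch otherwise) amounts to an all-or-nothing rule. The sketch below illustrates that rule only. `PendingNote`, `flush_batch`, the `attempts` counter, and the shape of the `upload` callback (returning a failure count) are all assumptions, not the daemon's actual types.

```rust
/// Hypothetical view of one queued row in the notes database.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PendingNote {
    pub sha: String,
    pub synced: bool,
    /// Hypothetical retry counter, not mentioned in the PR.
    pub attempts: u32,
}

/// Upload a batch and apply the all-or-nothing rule:
/// - zero failures: every row is marked synced, returns true;
/// - any failure: no row is marked synced, so the next flush retries the
///   entire batch, returns false.
///
/// `upload` is assumed to return the number of notes that failed.
pub fn flush_batch<F>(batch: &mut [PendingNote], upload: F) -> bool
where
    F: FnOnce(&[PendingNote]) -> usize,
{
    let failures = upload(&*batch);
    if failures == 0 {
        for note in batch.iter_mut() {
            note.synced = true;
        }
        true
    } else {
        for note in batch.iter_mut() {
            note.attempts += 1;
        }
        false
    }
}
```

Retrying the full batch presumes re-uploading a note that already reached the server is harmless, which is plausible for a commit-addressable store but is not stated in this PR.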
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
What problems was I solving
Authorship notes are stored in refs/notes/ai, which works fine for small repos but does not scale on big monorepos. Every git push triggers a fetch–merge–push cycle on the notes ref with three retries; in a repo where ~1k commits/hour land and the notes ref takes 10–30s to round-trip, contention means most users will never successfully push their notes.
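To make the contention claim concrete, here is a back-of-the-envelope calculation using the figures above. It assumes, purely for illustration, that commits land evenly spaced and that each one moves the notes ref; `updates_during_round_trip` is a hypothetical helper, not part of the codebase.

```rust
/// Expected number of competing notes-ref updates that land while one
/// fetch-merge-push cycle is in flight, assuming evenly spaced commits.
pub fn updates_during_round_trip(commits_per_hour: f64, round_trip_secs: f64) -> f64 {
    commits_per_hour / 3600.0 * round_trip_secs
}
```

At 1,000 commits/hour a commit lands about every 3.6 seconds, so `updates_during_round_trip(1000.0, 10.0)` is roughly 2.8 and `updates_during_round_trip(1000.0, 30.0)` is roughly 8.3. Under these assumptions several competing updates arrive during every attempt, so each of the three retries is likely to lose the race as well.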
This PR introduces an opt-in HTTP key-value backend (notes_backend.kind = "http") that replaces the notes ref with a commit-addressable HTTP store, fronted by a per-user SQLite cache and flushed asynchronously by the existing daemon. Push contention disappears (the notes ref is no longer pushed), reads are local-first, and the existing git-notes path remains the default — nothing changes for existing users until they opt in.
What user-facing changes did I ship
- `notes_backend.kind = "http"` (and optionally `notes_backend.backend_url`) in `~/.git-ai/config.json`, or via env vars `GIT_AI_NOTES_BACKEND_KIND` / `GIT_AI_NOTES_BACKEND_URL`. Default unchanged (`git_notes`).
- `git ai config set notes_backend.kind http`: dotted-key support for the new nested config object.
- `git ai notes migrate`: bulk-uploads existing `refs/notes/ai` content (chunks of 50, with progress + cache population).
- `git ai log`, `blame`, `diff`, `show`, `search`, rebase/amend, virtual attribution, post-commit hook, range authorship, stats all flow through a single `notes_api` module that dispatches by backend kind. No surface change when `kind = git_notes`.
- `git pull`: when `kind = "http"`, walks up to 500 commits behind new tips and bulk-fetches notes (chunks of 100) into the local SQLite cache.
- `git push` skips `refs/notes/ai`: when `kind = "http"`, both pre-push and managed-push paths early-return on the notes ref.

Description for the changelog
Add opt-in HTTP commit-addressable authorship notes backend with local SQLite cache, async daemon flush, pull-time cache warming, and a `git-ai notes migrate` bulk uploader; eliminates `refs/notes/ai` push contention on large monorepos.