
Add notes backend option #1253

Merged
svarlamov merged 13 commits into git-ai-project:main from a-churchill:andrew/notes-backend on May 13, 2026
Conversation

@a-churchill
Contributor

What problems was I solving

Authorship notes are stored in refs/notes/ai, which works fine for small repos but does not scale on big monorepos. Every git push triggers a fetch–merge–push cycle on the notes ref with three retries; in a repo where ~1k commits/hour land and the notes ref takes 10–30s to round-trip, contention means most users will never successfully push their notes.

This PR introduces an opt-in HTTP key-value backend (notes_backend.kind = "http") that replaces the notes ref with a commit-addressable HTTP store, fronted by a per-user SQLite cache and flushed asynchronously by the existing daemon. Push contention disappears (the notes ref is no longer pushed), reads are local-first, and the existing git-notes path remains the default — nothing changes for existing users until they opt in.

What user-facing changes did I ship

  • New opt-in backend — set notes_backend.kind = "http" (and optionally notes_backend.backend_url) in ~/.git-ai/config.json, or via env vars GIT_AI_NOTES_BACKEND_KIND / GIT_AI_NOTES_BACKEND_URL. Default unchanged (git_notes).
  • git ai config set notes_backend.kind http — dotted-key support for the new nested config object.
  • git ai notes migrate — bulk-uploads existing refs/notes/ai content (chunks of 50, with progress + cache population).
  • Transparent reads/writes everywhere else: git ai log, blame, diff, show, search, rebase/amend, virtual attribution, post-commit hook, range authorship, and stats all flow through a single notes_api module that dispatches by backend kind. No surface change when kind = git_notes.
  • Cache pre-warming on git pull — when kind = "http", walks up to 500 commits behind new tips and bulk-fetches notes (chunks of 100) into the local SQLite cache.
  • git push skips refs/notes/ai — when kind = "http", both pre-push and managed-push paths early-return on the notes ref.
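
The precedence described in the bullets above (env var over config file, `git_notes` as the default) can be sketched roughly as follows. This is an illustration only, not the PR's actual code: `NotesBackendKind`, `NotesBackend`, `parse_kind`, and `resolve_backend` are hypothetical names, the config file is modeled as a flat map of dotted keys, and falling back to the default on an unrecognized value is an assumption.

```rust
use std::collections::HashMap;

/// Which store holds authorship notes (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq, Eq)]
enum NotesBackendKind {
    GitNotes,
    Http,
}

/// Hypothetical resolved settings; field names mirror the config keys.
#[derive(Debug, Clone, PartialEq, Eq)]
struct NotesBackend {
    kind: NotesBackendKind,
    backend_url: Option<String>,
}

/// Parse the two documented values; anything else is rejected.
fn parse_kind(raw: &str) -> Option<NotesBackendKind> {
    match raw.trim() {
        "git_notes" => Some(NotesBackendKind::GitNotes),
        "http" => Some(NotesBackendKind::Http),
        _ => None,
    }
}

/// Resolve the backend: an env var wins over the config file, and the
/// default is git_notes. `env` and `file` are plain maps so the sketch
/// runs without touching the real environment or ~/.git-ai/config.json.
/// Assumption: an unrecognized kind falls back to the default.
fn resolve_backend(
    env: &HashMap<String, String>,
    file: &HashMap<String, String>,
) -> NotesBackend {
    let kind = env
        .get("GIT_AI_NOTES_BACKEND_KIND")
        .or_else(|| file.get("notes_backend.kind"))
        .and_then(|s| parse_kind(s))
        .unwrap_or(NotesBackendKind::GitNotes);
    let backend_url = env
        .get("GIT_AI_NOTES_BACKEND_URL")
        .or_else(|| file.get("notes_backend.backend_url"))
        .cloned();
    NotesBackend { kind, backend_url }
}
```

With both maps empty, `resolve_backend` yields `GitNotes` and no URL, which matches the "nothing changes until you opt in" behavior described above.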

Description for the changelog

Add opt-in HTTP commit-addressable authorship notes backend with local SQLite cache, async daemon flush, pull-time cache warming, and a git-ai notes migrate bulk uploader — eliminates refs/notes/ai push contention on large monorepos.

a-churchill and others added 8 commits May 4, 2026 21:22
Phase 0 of the commit-addressable authorship notes backend. Pure
refactor with no behavioral change — all 15 call-site files now route
through `crate::git::notes_api::*` instead of `crate::git::refs::*`,
unblocking subsequent phases that need a single dispatch point for
backend selection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 1 of the commit-addressable authorship notes backend. Lays the
foundation without changing behavior:

- `notes_backend` nested config object with `kind` (git_notes|http) and
  optional `backend_url`, including env var overrides and CLI dotted-key
  set/get/unset
- New dedicated SQLite database at `~/.git-ai/internal/notes-db` with a
  unified `notes` table (cache + sync queue via `synced` flag), modeled
  on `src/metrics/db.rs`
- API types and `ApiClient::upload_notes` / `read_notes` methods
  following the existing CAS client pattern

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2 of the commit-addressable authorship notes backend. Makes the
HTTP backend functional end-to-end for reads and writes (without the
remote actually being live yet — daemon flush is gated on auth).

- `notes_api` dispatches on `notes_backend.kind`. Http writes go to
  `notes-db` and signal the daemon; reads check the cache and fall
  through to git notes.
- Daemon `flush_notes()` mirrors `flush_cas`: dequeues pending rows
  and posts them via `ApiClient::upload_notes`, marking synced/failed.
- `git ai log` materializes recent notes into `refs/notes/ai-display`
  via `git fast-import` so log output remains correct under the HTTP
  backend.
- Pushes of `refs/notes/ai` are skipped when `kind = http`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
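
To make the Phase 2 dispatch concrete, here is a minimal in-memory sketch of the read and write paths the commit message describes: HTTP-backend writes land in the local store as unsynced rows and signal the daemon, while reads check the cache first and fall through to git notes. Everything here is hypothetical (`BackendKind`, `NoteRow`, `Stores`, and the two methods); the real `notes_api` works against SQLite and git, not hash maps.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BackendKind {
    GitNotes,
    Http,
}

/// One row of the unified table: cached content plus a sync flag.
#[derive(Debug, Clone, PartialEq, Eq)]
struct NoteRow {
    content: String,
    synced: bool,
}

/// In-memory stand-ins for notes-db and refs/notes/ai (assumption:
/// both are keyed by commit SHA).
#[derive(Default)]
struct Stores {
    notes_db: HashMap<String, NoteRow>,
    git_notes: HashMap<String, String>,
    daemon_signals: usize,
}

impl Stores {
    /// Write path: git_notes writes straight to the notes ref; http
    /// queues an unsynced row locally and signals the daemon to flush.
    fn write_note(&mut self, kind: BackendKind, sha: &str, content: &str) {
        match kind {
            BackendKind::GitNotes => {
                self.git_notes.insert(sha.to_string(), content.to_string());
            }
            BackendKind::Http => {
                let row = NoteRow { content: content.to_string(), synced: false };
                self.notes_db.insert(sha.to_string(), row);
                self.daemon_signals += 1;
            }
        }
    }

    /// Read path: http checks the local cache first, then falls
    /// through to git notes; git_notes reads only the notes ref.
    fn read_note(&self, kind: BackendKind, sha: &str) -> Option<String> {
        match kind {
            BackendKind::GitNotes => self.git_notes.get(sha).cloned(),
            BackendKind::Http => self
                .notes_db
                .get(sha)
                .map(|row| row.content.clone())
                .or_else(|| self.git_notes.get(sha).cloned()),
        }
    }
}
```

Keeping both paths behind one `match` is what lets every call site stay unchanged when the backend kind flips.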
Phase 3 of the commit-addressable authorship notes backend. During
`git pull`, walks recent HEAD history and bulk-fetches notes for any
commit not already in `notes-db`, persisting them with `synced = 1`.
Replaces the `refs/notes/ai` fetch entirely on the HTTP backend.

Adds `mockito` as a dev-dependency for offline ApiClient tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
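
The pull-time warming in Phase 3 amounts to "walk a bounded window of history, drop what is already cached, fetch the rest in chunks". A rough sketch of that planning step follows, using the 500-commit window and chunk size of 100 from the PR description. `plan_warm_fetch`, `MAX_WALK`, and `FETCH_CHUNK` are hypothetical names, and the sketch only plans the batches; it performs no network I/O.

```rust
use std::collections::HashSet;

/// Limits taken from the PR description; the constant names are made up.
const MAX_WALK: usize = 500;
const FETCH_CHUNK: usize = 100;

/// Plan the bulk fetches for cache warming.
/// `history` is assumed to be newest-first commit SHAs behind the new
/// tip; `cached` holds SHAs already present in the local notes cache.
/// Returns batches of at most FETCH_CHUNK SHAs that still need fetching.
fn plan_warm_fetch(history: &[String], cached: &HashSet<String>) -> Vec<Vec<String>> {
    let missing: Vec<String> = history
        .iter()
        .take(MAX_WALK) // never look further back than the window
        .filter(|sha| !cached.contains(*sha)) // skip anything already cached
        .cloned()
        .collect();
    missing.chunks(FETCH_CHUNK).map(|chunk| chunk.to_vec()).collect()
}
```

For example, a 620-commit history with the 50 newest commits already cached produces five batches: four of 100 SHAs and one of 50, all drawn from the first 500 commits.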
Phase 4 of the commit-addressable authorship notes backend. Lets users
migrate existing `refs/notes/ai` entries to the HTTP backend in a
single command: lists notes via `git notes`, bulk-reads content via
`git cat-file --batch`, uploads in chunks of 50, and warms the local
notes-db cache (`synced = 1`) so reads don't need to fall back to git
notes after migration.

Refuses to run unless `notes_backend.kind = http`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 5 of the commit-addressable authorship notes backend. Adds a
seven-group benchmark suite under benches/notes_io.rs covering write
single/batch, read hot/cold, batch read, presence check, and a 50-commit
rebase. Compares the git-notes baseline to the new SQLite-backed HTTP
backend.

Recorded results show the HTTP backend is 100×–7000× faster than the
git-notes path on read and write microbenchmarks (subprocess fork
overhead dominates the git path). All four acceptance thresholds
(≤1.10× reads/batch/rebase, ≤1.20× writes) are met by a wide margin.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@svarlamov requested review from heapwolf and svarlamov May 5, 2026 14:49
@a-churchill force-pushed the andrew/notes-backend branch from bbce55c to 2d0f858 May 8, 2026 19:53
- flush_notes: only mark synced on zero failures; retry entire batch on partial failure
- materialize_notes_for_display: use `from 0000...` to reset display ref (prevents stale accumulation)
- warm_cache_for_remote: resolve remote's HEAD ref instead of ignoring the remote parameter
- dequeue_pending: replace UPDATE...RETURNING with two-step SELECT+UPDATE for SQLite <3.35 compat
- flush_notes: use Config::fresh() instead of Config::get() so config changes take effect without daemon restart
- reference_server: cap Content-Length at 50MB to prevent OOM
- notes-db: add open_at_path() for test isolation without OnceLock singleton
- notes-db: add evict_stale_cache() with throttled invocation in daemon flush (>10k rows, >90 days)
- notes-db: add get_synced_shas() for migration resume safety
- notes_migrate: skip only synced=1 entries on re-run (pending entries still get uploaded)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
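
The first fix in the list above (mark rows synced only when the whole upload succeeds, otherwise retry the entire batch) can be illustrated with a small sketch. The types and the `flush_batch` function are hypothetical, and the `upload` closure merely stands in for `ApiClient::upload_notes`, whose real signature is not shown in this PR; here it is assumed to report a failure count.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyncState {
    Pending,
    Synced,
}

/// Hypothetical queue row; the real table tracks this via a `synced` flag.
#[derive(Debug, Clone)]
struct QueuedNote {
    sha: String,
    state: SyncState,
}

/// Flush one batch with all-or-nothing semantics.
/// `upload` is a stand-in for the real client call and returns the
/// number of notes that failed to upload (an assumption of this sketch).
/// Returns true when the batch was fully synced.
fn flush_batch<F>(batch: &mut [QueuedNote], upload: F) -> bool
where
    F: FnOnce(&[QueuedNote]) -> usize,
{
    if batch.is_empty() {
        return true;
    }
    let failures = upload(&*batch);
    if failures == 0 {
        // Zero failures: every row in the batch becomes synced.
        for note in batch.iter_mut() {
            note.state = SyncState::Synced;
        }
        true
    } else {
        // Partial failure: leave every row pending so the next flush
        // retries the whole batch.
        false
    }
}
```

The trade-off is some redundant re-uploads after a partial failure in exchange for never marking a note as synced when it did not actually reach the server.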
@svarlamov marked this pull request as ready for review May 12, 2026 18:52
Contributor

@devin-ai-integration (bot) left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.


Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@svarlamov merged commit d122698 into git-ai-project:main May 13, 2026
23 checks passed

3 participants