feat: support GitHub Copilot CLI (lore run / lore setup) #1157
Merged
Codecov Results 📊

✅ Patch coverage is 100.00%. Project has 6480 uncovered lines. Files with missing lines (1)

Coverage diff

| Metric | main | #PR | +/- |
|---|---|---|---|
| Coverage | 71.30% | 71.37% | +0.07% |
| Files | 135 | 135 | 0 |
| Lines | 22602 | 22637 | +35 |
| Branches | 15945 | 15961 | +16 |
| Hits | 16117 | 16157 | +40 |
| Misses | 6485 | 6480 | -5 |
| Partials | 1643 | 1644 | +1 |

Generated by Codecov Action
… upstream

Copilot CLI's default (GitHub-hosted) mode is redirected at the gateway via COPILOT_API_URL. It sends OpenAI-format requests with its own exchanged Copilot bearer token and a Copilot-Integration-Id header, but has no way to set X-Lore-Provider, so its model ids (gpt-*, claude-*, …) would route by model prefix to the wrong upstream (api.openai.com / api.anthropic.com).

- Add hasCopilotIntegrationHeader() and, in forwardToUpstream, force the github-copilot provider route when the header is present and no explicit provider / upstream override exists (X-Lore-Provider, X-Lore-Upstream-URL, and BYOK still win).
- Accept the bare /chat/completions ingress path (no /v1), which Copilot posts to the origin of COPILOT_API_URL. The bare /responses (Responses wire) path is intentionally deferred: its upstream builder emits /v1/responses, which would 404 against api.githubcopilot.com; it needs a host-aware responses path first.
- e2e routing + bare-ingress guards (github-copilot-url.e2e.test.ts pattern).
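The precedence rule in this commit (explicit overrides win, otherwise the Copilot header forces the route) can be illustrated with a minimal sketch. Only `hasCopilotIntegrationHeader`, the header names, and the `github-copilot` provider id come from the commit message; `HeaderMap`, `RouteInputs`, `byokActive`, and `forcedProvider` are hypothetical names, and the real `forwardToUpstream` logic may be structured differently.

```typescript
// Hypothetical sketch of the forcing rule. Types and helper names below are
// assumptions for illustration, not the gateway's actual code.
type HeaderMap = Record<string, string | undefined>;

interface RouteInputs {
  headers: HeaderMap;
  byokActive: boolean; // assumed flag: a BYOK credential was already resolved
}

// Case-insensitive header lookup over a plain object.
function getHeader(headers: HeaderMap, name: string): string | undefined {
  const wanted = name.toLowerCase();
  for (const [key, value] of Object.entries(headers)) {
    if (key.toLowerCase() === wanted) return value;
  }
  return undefined;
}

function hasCopilotIntegrationHeader(headers: HeaderMap): boolean {
  const value = getHeader(headers, "Copilot-Integration-Id");
  return value !== undefined && value.trim() !== "";
}

// Returns "github-copilot" only when no explicit override exists.
function forcedProvider(input: RouteInputs): "github-copilot" | undefined {
  const explicitProvider = getHeader(input.headers, "X-Lore-Provider");
  const explicitUpstream = getHeader(input.headers, "X-Lore-Upstream-URL");
  if (explicitProvider || explicitUpstream || input.byokActive) return undefined;
  return hasCopilotIntegrationHeader(input.headers) ? "github-copilot" : undefined;
}
```

With this shape, a request carrying only `Copilot-Integration-Id` is forced to `github-copilot`, while the same request with `X-Lore-Provider` set falls through to the normal routing path.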
Register a `copilot` agent whose envVars set COPILOT_API_URL to the gateway origin, redirecting Copilot CLI's default (GitHub-hosted) model calls through Lore. Adding it to AGENTS auto-enables `lore copilot` shorthand and the auto-detect picker. COPILOT_PROVIDER_* are left untouched so BYOK users can point COPILOT_PROVIDER_BASE_URL at the gateway themselves.
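As a rough illustration of the agent's env wiring, the sketch below derives a bare origin from a gateway base URL and exposes it as `COPILOT_API_URL`. The name `copilotApiUrlFromBaseUrl` appears in this PR's test list, but its real signature and behavior are not shown, so this implementation is an assumption; `copilotEnvVars` and the example port are hypothetical.

```typescript
// Hypothetical sketch: reduce a gateway base URL to its bare origin
// (scheme + host + optional port), dropping any path such as /v1.
function copilotApiUrlFromBaseUrl(baseUrl: string): string {
  const match = /^(https?:\/\/[^/?#]+)/i.exec(baseUrl.trim());
  const origin = match?.[1];
  if (!origin) {
    throw new Error(`not an http(s) URL: ${baseUrl}`);
  }
  return origin;
}

// Assumed shape of the agent's env contribution. COPILOT_PROVIDER_* is
// deliberately not set, so BYOK configuration stays under the user's control.
function copilotEnvVars(gatewayBaseUrl: string): Record<string, string> {
  return { COPILOT_API_URL: copilotApiUrlFromBaseUrl(gatewayBaseUrl) };
}
```

For example, `copilotEnvVars("http://127.0.0.1:8787/v1")` yields `{ COPILOT_API_URL: "http://127.0.0.1:8787" }`.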
Copilot CLI has no config-file endpoint override — interception is only via the COPILOT_API_URL env var. So `lore setup copilot` is guidance-based: it prints the required `lore run copilot` / `export COPILOT_API_URL=…` instructions (bare origin, no /v1) and a BYOK note; `undo` is an informational no-op. The inventory collector reports COPILOT_API_URL from the environment so `lore setup status`/`doctor` show Copilot routing. Update the 5→6 app fan-out assertions.
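Because setup cannot write a config file, the only observable state is the environment. A minimal sketch of how an inventory check might report Copilot routing follows; `CopilotRoutingStatus`, `collectCopilotRouting`, and the note wording are hypothetical, and the real collector's types are not shown in this PR.

```typescript
// Hypothetical sketch of an inventory check for Copilot routing. The env map
// is passed in explicitly so the function does not depend on process globals.
interface CopilotRoutingStatus {
  routed: boolean;
  apiUrl?: string;
  note: string;
}

function stripTrailingSlashes(url: string): string {
  return url.replace(/\/+$/, "");
}

function collectCopilotRouting(
  env: Record<string, string | undefined>,
  gatewayOrigin: string,
): CopilotRoutingStatus {
  const apiUrl = env["COPILOT_API_URL"]?.trim();
  if (!apiUrl) {
    return { routed: false, note: "COPILOT_API_URL is not set" };
  }
  if (stripTrailingSlashes(apiUrl) === stripTrailingSlashes(gatewayOrigin)) {
    return { routed: true, apiUrl, note: "Copilot traffic is routed through the gateway" };
  }
  return { routed: false, apiUrl, note: "COPILOT_API_URL points somewhere other than the gateway" };
}
```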
6ef7f93 to af1f077
BYK added a commit that referenced this pull request on Jul 4, 2026
…pport (#1159)

## What

Adds a **first-class native Gemini `generateContent` protocol** to the gateway and wires up the **Gemini CLI** (`@google/gemini-cli`) for `lore run` / `lore setup`. Unlike the pre-existing Gemini *OpenAI-compat* upstream, this speaks Gemini's real wire format end-to-end, so `@ai-sdk/google` clients (Gemini CLI **and** OpenCode's `google` provider) route through Lore with full memory support.

**PR 2 of 2**, stacked on #1157 (Copilot CLI). Base is `feat/copilot-cli-support`; will retarget to `main` once #1157 merges. (They both touch the shared CLI-registry files, so stacking avoids conflicts.)

## Why native (not the OpenAI-compat layer)

The Gemini CLI and `@ai-sdk/google` speak native `generateContent` (`user`/`model` roles, `systemInstruction`, `functionCall`/`functionResponse` parts, `generationConfig`, `usageMetadata`), with **no OpenAI-compat mode**. To do context injection + distillation, the gateway must understand that format.

## Phases (each a commit)

- **A (types)**: `gemini` added to `GatewayProtocol` + every inline duplicate union (gateway + core `LLMClient.prompt`).
- **B (`translate/gemini.ts`)**: parse/build request, parse/build response. Tool calls paired by function name (Gemini has no per-call id). 18 unit tests + round-trip.
- **C (`stream/gemini.ts`)**: `accumulateGeminiSSEStream` (upstream SSE → internal) + `translateAnthropicStreamToGemini` (buffered, for cross-protocol gemini clients).
- **D (pipeline)**: `forwardToUpstream` gemini branch (LTM → `systemInstruction`), `accumulateNonStreamResponse`/`nonStreamHttpResponse` cases, streaming dispatch across all sites.
- **E+G (ingress + routing)**: server accepts `POST /v1beta/models/{model}:generateContent` + `:streamGenerateContent` (version-prefix-agnostic, so `@ai-sdk/google`'s `/v1/models/...` matches too). A native gemini ingress **always stays gemini** and defaults its base to `generativelanguage.googleapis.com`.
- **F (workers)**: `resolveWorkerProtocol` maps gemini→gemini; `buildGeminiWorkerRequest`/`parseGeminiWorkerResponse`; `extractAuth` captures `x-goog-api-key`. (Worker model uses the generic cost-aware path; no `WORKER_DEFAULTS.google` pin.)
- **opencode/pi enablement**: `x-lore-provider: google` (native `@ai-sdk/google`) is no longer downgraded to the OpenAI-compat endpoint (e2e-covered). The existing `google` OpenAI-compat **worker** path is unchanged.
- **I (CLI)**: `lore run gemini` sets `GOOGLE_GEMINI_BASE_URL` (bare origin; localhost HTTP allowed); `lore setup gemini` persists it to `~/.gemini/.env` (dotenv, like Hermes) with backup/undo; inventory + doctor fan-out (6→7).

## Tests (TDD)

`gemini-translate` (18), `gemini-stream` (4), `gemini-ingress.e2e` (5, incl. opencode-style `x-lore-provider: google`), worker-protocol, agents/setup/setup-io/doctor. **Full gateway suite (2621) + core suite (2519) green; typecheck + lint clean.**

## Scope notes

- Cache warming is a safe no-op for gemini (Anthropic-cache-specific; `resolveProfile` returns null).
- **Pi** `google` provider registration is deferred (Pi doesn't currently register `google`; its native-vs-baseURL semantics need Pi-side verification), so there is no regression. OpenCode is enabled.
- Vertex-Gemini (`GOOGLE_VERTEX_BASE_URL`) and Code-Assist OAuth modes are out of scope; the new translator is the shared core a future Vertex-Gemini mode can reuse.
- Not integration-tested against the real Gemini CLI (not available in CI); translators are unit-tested against Google's documented wire format.

---

## Update: adversarial pre-merge review addressed

Two independent review passes (verdict: SHIP-WITH-FOLLOWUPS, no BLOCKERs) surfaced fidelity/coverage gaps in the lossy Anthropic-shaped internal representation. All substantive ones are now fixed (commit "address adversarial review"), each guarded by a **mutation-verified non-vacuous** test:

- **Thinking leak**: `thought:true` reasoning parts no longer merge into the visible answer; mapped to a distinct thinking block on parse (non-stream + SSE) and re-emitted as `{text,thought:true}` on egress.
- **Usage**: `thoughtsTokenCount` folded into `outputTokens` (was undercounting output for cost-aware routing + understating the client total). A shared `geminiUsageFromMetadata` helper removes the duplicate stream copy (drift).
- **Safety transparency**: abnormal `finishReason`s (SAFETY/RECITATION/BLOCKLIST/PROHIBITED_CONTENT/SPII/MALFORMED_FUNCTION_CALL/OTHER) preserved verbatim through egress instead of laundered to STOP; the finish-reason mapper is now shared between the translator and the stream accumulator.
- **Prompt block**: a prompt-level block (no candidates) surfaces `promptFeedback.blockReason` as the stop reason instead of a fake `end_turn`.
- **Worker auth**: `buildGeminiWorkerRequest` is scheme-aware (bearer→`Authorization`, else `x-goog-api-key`), so an OAuth gemini session never misauths. **New `worker-gemini-path.test.ts`** closes the previously-untested worker builder (a broken `x-goog-api-key` header now fails a test; it silently passed the whole suite before).
- **`?key=` auth**: normalized to `x-goog-api-key` at ingress (was silently dropped when the upstream URL is rebuilt); the misleading comment is corrected.
- **Multi-candidate**: `candidateCount>1 → candidates[0]` limitation documented (single-response internal model); full fan-out is a follow-up.

**Full gateway suite: 2634 passed / 40 skipped** (+13 gemini tests). typecheck + lint clean. Mutation-tested each new guard in a disposable worktree; all go RED when the fix is reverted.

Deferred (documented, non-blocking): multi-candidate fan-out; full `promptFeedback` object passthrough; Pi `google` provider registration.
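The usage and finish-reason fixes described above can be sketched as two small shared helpers. `geminiUsageFromMetadata`, `outputTokens`, `end_turn`, and the Gemini field names come from the commit message and Google's documented `usageMetadata`; the `InternalUsage` shape, `inputTokens`, `mapGeminiFinishReason`, and the `max_tokens` mapping are assumptions made for illustration, not the gateway's actual code.

```typescript
// Subset of Gemini's usageMetadata that matters for this sketch.
interface GeminiUsageMetadata {
  promptTokenCount?: number;
  candidatesTokenCount?: number;
  thoughtsTokenCount?: number;
}

// Assumed internal usage shape.
interface InternalUsage {
  inputTokens: number;
  outputTokens: number;
}

// Fold thinking tokens into output so they are not undercounted.
function geminiUsageFromMetadata(meta: GeminiUsageMetadata | undefined): InternalUsage {
  return {
    inputTokens: meta?.promptTokenCount ?? 0,
    outputTokens: (meta?.candidatesTokenCount ?? 0) + (meta?.thoughtsTokenCount ?? 0),
  };
}

// Map normal finish reasons to internal stop reasons; pass abnormal ones
// (SAFETY, RECITATION, ...) through verbatim instead of collapsing to STOP.
// With no candidate finish reason, a prompt-level blockReason is surfaced.
function mapGeminiFinishReason(
  finishReason: string | undefined,
  promptBlockReason?: string,
): string {
  if (finishReason === undefined) return promptBlockReason ?? "end_turn";
  if (finishReason === "STOP") return "end_turn";
  if (finishReason === "MAX_TOKENS") return "max_tokens";
  return finishReason;
}
```

Keeping both helpers in one shared module is what lets the non-stream translator and the SSE accumulator stay in agreement, which is the drift the review called out.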
What
Adds first-class support for GitHub Copilot CLI (`@github/copilot`, binary `copilot`) to `lore run` and `lore setup`, so its model calls flow through the Lore gateway for memory injection + distillation.

This is PR 1 of 2 from the plan (Copilot first, Gemini CLI next as a separate PR).

How Copilot is intercepted
Empirically verified against the `@github/copilot` binary: the CLI exposes an undocumented `COPILOT_API_URL` env var that overrides its default Copilot API base (normally `api.githubcopilot.com`). Copilot speaks OpenAI wire format, does its own GitHub→Copilot token exchange, and sets a `Copilot-Integration-Id` header. So Lore just needs to receive that traffic and forward it to the genuine `api.githubcopilot.com`: a pure transparent proxy, no token brokering.

This intercepts Copilot's default GitHub-hosted models (not just BYOK). BYOK users can alternatively point `COPILOT_PROVIDER_BASE_URL` at the gateway themselves; those vars are left untouched.

Changes

Gateway
- `hasCopilotIntegrationHeader()` + routing in `forwardToUpstream`: when a `Copilot-Integration-Id` header is present and there's no explicit provider/upstream override, force the `github-copilot` upstream. Otherwise Copilot's model ids (`gpt-*`, `claude-*`, …) would route by model-prefix to the wrong host. Explicit `X-Lore-Provider` / `X-Lore-Upstream-URL` / BYOK still win.
- Accept the bare `/chat/completions` ingress path (no `/v1`): Copilot posts to the origin of `COPILOT_API_URL`. (The bare `/responses` Responses-wire path is intentionally deferred: the responses upstream builder emits `/v1/responses`, which would 404 against `api.githubcopilot.com`; it needs a host-aware responses path first.)

CLI

- `lore run copilot` (+ `lore copilot` shorthand, auto-detect picker): sets `COPILOT_API_URL` to the gateway origin.
- `lore setup copilot`: Copilot has no config-file endpoint field, so setup is guidance-based. It prints the `lore run copilot` / `export COPILOT_API_URL=…` instructions + a BYOK note; `undo` is an informational no-op.
- The inventory collector reports `COPILOT_API_URL` from the environment so `lore setup status` / `doctor` show Copilot routing.

Tests (TDD)

- `copilot-routing.e2e.test.ts`: full-pipeline routing guards: `gpt-`/`claude-` models → `api.githubcopilot.com`; explicit provider/upstream/BYOK override wins; no-header baseline unchanged; bare `/chat/completions` ingress reaches upstream. Core forcing logic verified non-vacuous (red→green) + mutation-checked (dropping the `!providerRoute` guard fails the explicit-provider test).
- `agents.test.ts`, `setup.test.ts`, `setup-io.test.ts`, `doctor.test.ts`: agent envVars, `copilotApiUrlFromBaseUrl`, setup round-trip, inventory (5→6 app fan-out).

Full gateway suite green; typecheck clean.
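
The bare-ingress guard in the Gateway section comes down to a small path decision: `/chat/completions` is accepted with or without `/v1`, while bare `/responses` is still unrecognized. A minimal sketch of that decision follows, assuming a hypothetical `classifyIngressPath` helper and `IngressKind` type; the gateway's real server routing is not shown in this PR.

```typescript
// Hypothetical ingress classifier, for illustration only.
type IngressKind = "chat-completions" | "responses" | "unknown";

function classifyIngressPath(pathname: string): IngressKind {
  // Tolerate trailing slashes so "/chat/completions/" matches too.
  const path = pathname.replace(/\/+$/, "");
  if (path === "/v1/chat/completions" || path === "/chat/completions") {
    return "chat-completions";
  }
  // Only the /v1-prefixed Responses path is recognized; the bare form is
  // deferred until the upstream builder can emit a host-aware path.
  if (path === "/v1/responses") {
    return "responses";
  }
  return "unknown";
}
```

A test in the spirit of the bare-ingress guard would assert that `classifyIngressPath("/chat/completions")` and `classifyIngressPath("/v1/chat/completions")` agree, and that `classifyIngressPath("/responses")` stays `"unknown"` until the deferred work lands.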