fix(github): route Octokit by host to support GitHub Enterprise#2204
fix(github): route Octokit by host to support GitHub Enterprise#2204Aarekaz wants to merge 2 commits into
Conversation
Closes generalaction#2181. Before this change, emdash always called the public GitHub API (api.github.com) regardless of where the repo actually lived, and the URL parser hardcoded github.com — so GitHub Enterprise users had PR creation and sync silently fail even with a valid `gh auth` setup. What changed: - `GitHubRepositoryRef` now carries the parsed `host`, and the SSH/HTTPS regexes accept any hostname. The shorthand `owner/repo` still defaults to github.com. - `getOctokit(host)` caches one client per (host, token) and sets `baseUrl` to `https://{host}/api/v3` for non-github.com hosts. - `extractGhCliToken(ctx, host)` passes `--hostname` to the gh CLI so the right per-host token is read from gh's auth state. - `pr-sync-engine`, `issue-service`, and `repo-service` propagate the host they parse from the repo URL to every Octokit lookup. - `target-remote.ts` (create-PR picker) now uses an explicit `isGitHubDotCom` filter to preserve the previous github.com-only behavior — broadening it to known Enterprise hosts is a follow-up. The parser is intentionally permissive about host because Enterprise instances can live on any domain; verifying a host actually speaks the GitHub API moves to the call layer (auth failure → clearer error).
Greptile SummaryThis PR fixes GitHub Enterprise Server support by routing Octokit lookups by host, generalizing the URL parser to capture any hostname, and propagating
Confidence Score: 4/5Safe to merge — the github.com path is unchanged and well-tested; GHE routing logic is mechanical and covered by new integration tests. The host-routing propagation is thorough and consistent across all nine getOctokit call sites. The two non-blocking concerns are: (1) for GHE hosts, a gh auth token subprocess is spawned on every getOctokit call regardless of cache state, which could add noticeable latency during heavy PR sync cycles; and (2) the URL regex silently strips www. from any hostname, not just github.com, which could misdirect API calls for rare Enterprise deployments reachable only via a www. subdomain. Neither affects the github.com path or introduces incorrect data. src/main/core/github/services/octokit-provider.ts — the token fetch / cache-lookup order means GHE users pay a subprocess cost per API call even on cache hits.
|
| Filename | Overview |
|---|---|
| src/shared/github-repository.ts | Core parser refactored to be permissive about host; adds host field to GitHubRepositoryRef, new isGitHubDotCom and apiBaseUrlForHost helpers. Minor: www. prefix is silently stripped from all hostnames, not just github.com. |
| src/main/core/github/services/octokit-provider.ts | Cache migrated from a single (token, Octokit) pair to a Map keyed on `${host} |
| src/main/core/pull-requests/pr-sync-engine.ts | All nine getOctokit() call sites consistently updated to getOctokit(host), with host destructured from the parsed repository ref. Changes are mechanical and complete. |
| src/main/core/github/services/github-connection-service.ts | getToken and getUserInfo gain optional host parameter; GHE token reads delegate to gh CLI per-host auth. getCurrentUser intentionally remains github.com-only. |
| src/renderer/features/tasks/diff-view/changes-panel/components/pr-entry/target-remote.ts | Adds isGitHubDotCom guard so the create-PR target picker stays github.com-only while the parser now accepts all hosts. |
Sequence Diagram
sequenceDiagram
participant E as PrSyncEngine / IssueService
participant OP as getOctokit(host)
participant CS as GitHubConnectionService.getToken(host)
participant GH as gh CLI (subprocess)
participant Cache as Octokit Cache (host|token)
participant API as GitHub API (api.github.com or host/api/v3)
E->>OP: getOctokit("ghe.corp.com")
OP->>CS: getToken("ghe.corp.com")
CS->>GH: gh auth token --hostname ghe.corp.com
GH-->>CS: token T
CS-->>OP: token T
OP->>Cache: "lookup "ghe.corp.com|T""
alt cache miss
Cache-->>OP: null
OP->>OP: "new Octokit({ auth: T, baseUrl: "https://ghe.corp.com/api/v3" })"
OP->>Cache: "set "ghe.corp.com|T" to Octokit"
else cache hit
Cache-->>OP: existing Octokit
end
OP-->>E: Octokit
E->>API: REST / GraphQL request
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
src/main/core/github/services/octokit-provider.ts:13-25
**GHE token exec on every call, even cache hits**
`githubConnectionService.getToken(host)` is called unconditionally before the cache lookup. For GHE hosts, `getToken` always execs `gh auth token --hostname <host>` — there is no TTL cache for Enterprise tokens. This means every call to `getOctokit('ghe.example.com')` spawns a subprocess even when a valid `Octokit` instance is already in `cache`. For a PR sync cycle that calls `getOctokit` across `runFullSync`, `runIncrementalSync`, `getPullRequestFiles`, etc. in quick succession, this adds a subprocess per call. Consider a lightweight token store keyed by host (analogous to `_tokenRecordCache` used for `github.com`) or move the token fetch inside the cache-miss branch and add a separate short-lived per-host token cache.
### Issue 2 of 2
src/shared/github-repository.ts:54-57
**`www.` prefix stripped from any hostname, not just `github.com`**
The regex `(?:www\.)?([^/\s]+)` silently drops the `www.` prefix before capturing the host, so `https://www.ghe.corp.com/org/repo` is parsed with `host: 'ghe.corp.com'` instead of `www.ghe.corp.com`. `apiBaseUrlForHost('ghe.corp.com')` would then construct `https://ghe.corp.com/api/v3`, which may not resolve if the Enterprise instance is only reachable at `www.ghe.corp.com`. The `www.` normalization is intentional for `github.com`, but could silently misdirect API calls for Enterprise instances whose DNS name actually begins with `www.`.
Reviews (1): Last reviewed commit: "fix(github): route Octokit by host to su..." | Re-trigger Greptile
| export async function getOctokit(host: string = GITHUB_DOT_COM_HOST): Promise<Octokit> { | ||
| const token = await githubConnectionService.getToken(host); | ||
| if (!token) throw new Error('Not authenticated'); | ||
| if (token !== cachedToken) { | ||
| cachedOctokit = new Octokit({ auth: token }); | ||
| cachedToken = token; | ||
| } | ||
| return cachedOctokit!; | ||
| const key = cacheKey(host, token); | ||
| const cached = cache.get(key); | ||
| if (cached) return cached; | ||
| const octokit = new Octokit({ | ||
| auth: token, | ||
| baseUrl: apiBaseUrlForHost(host), | ||
| }); | ||
| cache.set(key, octokit); | ||
| return octokit; | ||
| } |
There was a problem hiding this comment.
GHE token exec on every call, even cache hits
githubConnectionService.getToken(host) is called unconditionally before the cache lookup. For GHE hosts, getToken always execs gh auth token --hostname <host> — there is no TTL cache for Enterprise tokens. This means every call to getOctokit('ghe.example.com') spawns a subprocess even when a valid Octokit instance is already in cache. For a PR sync cycle that calls getOctokit across runFullSync, runIncrementalSync, getPullRequestFiles, etc. in quick succession, this adds a subprocess per call. Consider a lightweight token store keyed by host (analogous to _tokenRecordCache used for github.com) or move the token fetch inside the cache-miss branch and add a separate short-lived per-host token cache.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/main/core/github/services/octokit-provider.ts
Line: 13-25
Comment:
**GHE token exec on every call, even cache hits**
`githubConnectionService.getToken(host)` is called unconditionally before the cache lookup. For GHE hosts, `getToken` always execs `gh auth token --hostname <host>` — there is no TTL cache for Enterprise tokens. This means every call to `getOctokit('ghe.example.com')` spawns a subprocess even when a valid `Octokit` instance is already in `cache`. For a PR sync cycle that calls `getOctokit` across `runFullSync`, `runIncrementalSync`, `getPullRequestFiles`, etc. in quick succession, this adds a subprocess per call. Consider a lightweight token store keyed by host (analogous to `_tokenRecordCache` used for `github.com`) or move the token fetch inside the cache-miss branch and add a separate short-lived per-host token cache.
How can I resolve this? If you propose a fix, please make it concise.| const urlMatch = value.match( | ||
| /^https?:\/\/(?:www\.)?([^/\s]+)\/([^/\s]+)\/([^/\s?#]+?)(?:\.git)?(?:[/?#].*)?$/i | ||
| ); | ||
| if (urlMatch) return toRepositoryRef(urlMatch[1], urlMatch[2], urlMatch[3]); |
There was a problem hiding this comment.
www. prefix stripped from any hostname, not just github.com
The regex (?:www\.)?([^/\s]+) silently drops the www. prefix before capturing the host, so https://www.ghe.corp.com/org/repo is parsed with host: 'ghe.corp.com' instead of www.ghe.corp.com. apiBaseUrlForHost('ghe.corp.com') would then construct https://ghe.corp.com/api/v3, which may not resolve if the Enterprise instance is only reachable at www.ghe.corp.com. The www. normalization is intentional for github.com, but could silently misdirect API calls for Enterprise instances whose DNS name actually begins with www..
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/shared/github-repository.ts
Line: 54-57
Comment:
**`www.` prefix stripped from any hostname, not just `github.com`**
The regex `(?:www\.)?([^/\s]+)` silently drops the `www.` prefix before capturing the host, so `https://www.ghe.corp.com/org/repo` is parsed with `host: 'ghe.corp.com'` instead of `www.ghe.corp.com`. `apiBaseUrlForHost('ghe.corp.com')` would then construct `https://ghe.corp.com/api/v3`, which may not resolve if the Enterprise instance is only reachable at `www.ghe.corp.com`. The `www.` normalization is intentional for `github.com`, but could silently misdirect API calls for Enterprise instances whose DNS name actually begins with `www.`.
How can I resolve this? If you propose a fix, please make it concise.Two findings from the automated review on the GHE PR: 1. `getOctokit(host)` was shelling out to `gh auth token --hostname` on every call for Enterprise hosts, even on Octokit cache hits. Adds a per-host TTL token cache mirroring the existing github.com cache so the subprocess only fires on a true cache miss. 2. The URL regex collapsed `www.` from any hostname, which could misdirect API calls for Enterprise instances whose DNS legitimately starts with `www.`. The regex now captures the host verbatim, and only `www.github.com` is normalized to `github.com` in the canonical form. Added a test covering `https://www.ghe.example.com/...`.
|
Thanks for your contribution, @Aarekaz ! We'll have a look 🙏 |
|
Welcome! Let me know if I am doing anything wrong and how I can improve!! |
|
Pushed
|
Closes #2181.
Summary
https://{host}/api/v3instead of always hitting publicapi.github.comgithub.com)pr-sync-engine,issue-service, andgh-cli-tokenso PR sync, PR creation, and merge all use the right APIgithub.comusers — the parser still defaults the shorthandowner/repotogithub.com, and the create-PR target picker is stillgithub.com-only (broadening it to known Enterprise hosts is a follow-up)Root cause
Two hardcodes:
Octokitwas always constructed without abaseUrl, so every API call hitapi.github.comregardless of the actual repo hostgithub.com, so GHE remotes parsed tonull— surfacing as "invalid GitHub repository"That's why the reporter sees PR creation fail and existing PRs not getting found and Project settings appear broken: they all funnel through the same parser + Octokit.
Design notes
github.com-only check (currently the create-PR target picker), a newisGitHubDotCom(ref)helper is exported. Used intarget-remote.tsto preserve previous behavior.getOctokit(host)caches per(host, token)so multiple Enterprise hosts coexist cleanly and token rotation only invalidates the affected entry.github.comhosts,getToken(host)reads directly fromgh auth token --hostname <host>because emdash's encrypted secure storage only holds one token under a single key. This means GHE users rely ongh auth login --hostname …being configured, which is the standard GHE workflow.Test plan
pnpm run formatpnpm run lintpnpm run typecheckpnpm test— 834 passingpr-sync-engine.host-routing.test.tsproves the engine routes a GHE URL togetOctokit('ghe.example.com')and a github.com URL togetOctokit('github.com')github.comPR creation still works locallyWhat I couldn't validate
I don't have access to a GitHub Enterprise instance, so the end-to-end "create a PR against GHE" path is not directly verified. The unit + integration tests cover the routing logic, and the
github.compath is fully validated, but the actualapi/v3round-trip against a real GHE 3.x server would benefit from a second pair of eyes.@bondfelix — if you have a moment, would you mind pulling this branch and confirming PR creation works on your GHE instance?
gh pr checkoutfrom this PR should pick up the branch.Out of scope (intentional follow-ups)
github.comto keep this PR minimal)