Skip to content

fix(github): route Octokit by host to support GitHub Enterprise#2204

Open
Aarekaz wants to merge 2 commits into
generalaction:mainfrom
Aarekaz:emdash/ghe-pr-support
Open

fix(github): route Octokit by host to support GitHub Enterprise#2204
Aarekaz wants to merge 2 commits into
generalaction:mainfrom
Aarekaz:emdash/ghe-pr-support

Conversation

@Aarekaz
Copy link
Copy Markdown

@Aarekaz Aarekaz commented May 23, 2026

Closes #2181.

Summary

  • Routes Octokit lookups by host so Enterprise URLs hit https://{host}/api/v3 instead of always hitting public api.github.com
  • Generalizes the URL parser to capture host (was hardcoded to github.com)
  • Propagates host through pr-sync-engine, issue-service, and gh-cli-token so PR sync, PR creation, and merge all use the right API
  • Preserves existing behavior for github.com users — the parser still defaults the shorthand owner/repo to github.com, and the create-PR target picker is still github.com-only (broadening it to known Enterprise hosts is a follow-up)

Root cause

Two hardcodes:

  1. Octokit was always constructed without a baseUrl, so every API call hit api.github.com regardless of the actual repo host
  2. The URL parser regexes only matched github.com, so GHE remotes parsed to null — surfacing as "invalid GitHub repository"

That's why the reporter sees PR creation fail and existing PRs not getting found and Project settings appear broken: they all funnel through the same parser + Octokit.

Design notes

  • The parser is intentionally permissive about host. Enterprise instances can live on any domain, so there's no reliable URL-only way to tell "this is a GitHub-compatible server." Verifying that a host actually speaks the GitHub API moves to the call layer (auth failure becomes a clearer error than silent rejection).
  • For callers that do want a strict github.com-only check (currently the create-PR target picker), a new isGitHubDotCom(ref) helper is exported. Used in target-remote.ts to preserve previous behavior.
  • getOctokit(host) caches per (host, token) so multiple Enterprise hosts coexist cleanly and token rotation only invalidates the affected entry.
  • For non-github.com hosts, getToken(host) reads directly from gh auth token --hostname <host> because emdash's encrypted secure storage only holds one token under a single key. This means GHE users rely on gh auth login --hostname … being configured, which is the standard GHE workflow.

Test plan

  • pnpm run format
  • pnpm run lint
  • pnpm run typecheck
  • pnpm test — 834 passing
  • New parser tests cover github.com SSH/HTTPS, GHE SSH/HTTPS, ports, case normalization, shorthand
  • New pr-sync-engine.host-routing.test.ts proves the engine routes a GHE URL to getOctokit('ghe.example.com') and a github.com URL to getOctokit('github.com')
  • Manual: github.com PR creation still works locally

What I couldn't validate

I don't have access to a GitHub Enterprise instance, so the end-to-end "create a PR against GHE" path is not directly verified. The unit + integration tests cover the routing logic, and the github.com path is fully validated, but the actual api/v3 round-trip against a real GHE 3.x server would benefit from a second pair of eyes.

@bondfelix — if you have a moment, would you mind pulling this branch and confirming PR creation works on your GHE instance? gh pr checkout from this PR should pick up the branch.

Out of scope (intentional follow-ups)

  • Broadening the create-PR target picker to include Enterprise remotes (currently filters to github.com to keep this PR minimal)
  • A GHE host setting in the UI (auto-detection from the repo remote should be enough for the bug fix)
  • Multi-host repo browsing in the "Add Project" picker

Closes generalaction#2181.

Before this change, emdash always called the public GitHub API
(api.github.com) regardless of where the repo actually lived, and the
URL parser hardcoded github.com — so GitHub Enterprise users had PR
creation and sync silently fail even with a valid `gh auth` setup.

What changed:

- `GitHubRepositoryRef` now carries the parsed `host`, and the SSH/HTTPS
  regexes accept any hostname. The shorthand `owner/repo` still defaults
  to github.com.
- `getOctokit(host)` caches one client per (host, token) and sets
  `baseUrl` to `https://{host}/api/v3` for non-github.com hosts.
- `extractGhCliToken(ctx, host)` passes `--hostname` to the gh CLI so
  the right per-host token is read from gh's auth state.
- `pr-sync-engine`, `issue-service`, and `repo-service` propagate the
  host they parse from the repo URL to every Octokit lookup.
- `target-remote.ts` (create-PR picker) now uses an explicit
  `isGitHubDotCom` filter to preserve the previous github.com-only
  behavior — broadening it to known Enterprise hosts is a follow-up.

The parser is intentionally permissive about host because Enterprise
instances can live on any domain; verifying a host actually speaks the
GitHub API moves to the call layer (auth failure → clearer error).
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 23, 2026

Greptile Summary

This PR fixes GitHub Enterprise Server support by routing Octokit lookups by host, generalizing the URL parser to capture any hostname, and propagating host through the PR sync engine, issue service, and CLI token helpers. Existing github.com behavior is fully preserved — the shorthand owner/repo still defaults to github.com, and the create-PR target picker is guarded with the new isGitHubDotCom helper.

  • GitHubRepositoryRef gains a host field; parseGitHubRepository becomes permissive about hostnames, with isGitHubDotCom(ref) exported for callers that need a strict github.com check.
  • getOctokit(host) now caches per (host, token) pair using apiBaseUrlForHost to construct the correct base URL; for Enterprise hosts the token is read directly from gh auth token --hostname <host> since emdash's encrypted store only holds the github.com token.
  • All nine getOctokit call sites in PrSyncEngine and the issue/repo services are consistently updated; new host-routing integration tests cover GHE SSH/HTTPS and github.com variants.

Confidence Score: 4/5

Safe to merge — the github.com path is unchanged and well-tested; GHE routing logic is mechanical and covered by new integration tests.

The host-routing propagation is thorough and consistent across all nine getOctokit call sites. The two non-blocking concerns are: (1) for GHE hosts, a gh auth token subprocess is spawned on every getOctokit call regardless of cache state, which could add noticeable latency during heavy PR sync cycles; and (2) the URL regex silently strips www. from any hostname, not just github.com, which could misdirect API calls for rare Enterprise deployments reachable only via a www. subdomain. Neither affects the github.com path or introduces incorrect data.

src/main/core/github/services/octokit-provider.ts — the token fetch / cache-lookup order means GHE users pay a subprocess cost per API call even on cache hits.

Important Files Changed

Filename Overview
src/shared/github-repository.ts Core parser refactored to be permissive about host; adds host field to GitHubRepositoryRef, new isGitHubDotCom and apiBaseUrlForHost helpers. Minor: www. prefix is silently stripped from all hostnames, not just github.com.
src/main/core/github/services/octokit-provider.ts Cache migrated from a single (token, Octokit) pair to a Map keyed on `${host}
src/main/core/pull-requests/pr-sync-engine.ts All nine getOctokit() call sites consistently updated to getOctokit(host), with host destructured from the parsed repository ref. Changes are mechanical and complete.
src/main/core/github/services/github-connection-service.ts getToken and getUserInfo gain optional host parameter; GHE token reads delegate to gh CLI per-host auth. getCurrentUser intentionally remains github.com-only.
src/renderer/features/tasks/diff-view/changes-panel/components/pr-entry/target-remote.ts Adds isGitHubDotCom guard so the create-PR target picker stays github.com-only while the parser now accepts all hosts.

Sequence Diagram

sequenceDiagram
    participant E as PrSyncEngine / IssueService
    participant OP as getOctokit(host)
    participant CS as GitHubConnectionService.getToken(host)
    participant GH as gh CLI (subprocess)
    participant Cache as Octokit Cache (host|token)
    participant API as GitHub API (api.github.com or host/api/v3)

    E->>OP: getOctokit("ghe.corp.com")
    OP->>CS: getToken("ghe.corp.com")
    CS->>GH: gh auth token --hostname ghe.corp.com
    GH-->>CS: token T
    CS-->>OP: token T
    OP->>Cache: "lookup "ghe.corp.com|T""
    alt cache miss
        Cache-->>OP: null
        OP->>OP: "new Octokit({ auth: T, baseUrl: "https://ghe.corp.com/api/v3" })"
        OP->>Cache: "set "ghe.corp.com|T" to Octokit"
    else cache hit
        Cache-->>OP: existing Octokit
    end
    OP-->>E: Octokit
    E->>API: REST / GraphQL request
Loading
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
src/main/core/github/services/octokit-provider.ts:13-25
**GHE token exec on every call, even cache hits**

`githubConnectionService.getToken(host)` is called unconditionally before the cache lookup. For GHE hosts, `getToken` always execs `gh auth token --hostname <host>` — there is no TTL cache for Enterprise tokens. This means every call to `getOctokit('ghe.example.com')` spawns a subprocess even when a valid `Octokit` instance is already in `cache`. For a PR sync cycle that calls `getOctokit` across `runFullSync`, `runIncrementalSync`, `getPullRequestFiles`, etc. in quick succession, this adds a subprocess per call. Consider a lightweight token store keyed by host (analogous to `_tokenRecordCache` used for `github.com`) or move the token fetch inside the cache-miss branch and add a separate short-lived per-host token cache.

### Issue 2 of 2
src/shared/github-repository.ts:54-57
**`www.` prefix stripped from any hostname, not just `github.com`**

The regex `(?:www\.)?([^/\s]+)` silently drops the `www.` prefix before capturing the host, so `https://www.ghe.corp.com/org/repo` is parsed with `host: 'ghe.corp.com'` instead of `www.ghe.corp.com`. `apiBaseUrlForHost('ghe.corp.com')` would then construct `https://ghe.corp.com/api/v3`, which may not resolve if the Enterprise instance is only reachable at `www.ghe.corp.com`. The `www.` normalization is intentional for `github.com`, but could silently misdirect API calls for Enterprise instances whose DNS name actually begins with `www.`.

Reviews (1): Last reviewed commit: "fix(github): route Octokit by host to su..." | Re-trigger Greptile

Comment on lines +13 to 25
export async function getOctokit(host: string = GITHUB_DOT_COM_HOST): Promise<Octokit> {
const token = await githubConnectionService.getToken(host);
if (!token) throw new Error('Not authenticated');
if (token !== cachedToken) {
cachedOctokit = new Octokit({ auth: token });
cachedToken = token;
}
return cachedOctokit!;
const key = cacheKey(host, token);
const cached = cache.get(key);
if (cached) return cached;
const octokit = new Octokit({
auth: token,
baseUrl: apiBaseUrlForHost(host),
});
cache.set(key, octokit);
return octokit;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 GHE token exec on every call, even cache hits

githubConnectionService.getToken(host) is called unconditionally before the cache lookup. For GHE hosts, getToken always execs gh auth token --hostname <host> — there is no TTL cache for Enterprise tokens. This means every call to getOctokit('ghe.example.com') spawns a subprocess even when a valid Octokit instance is already in cache. For a PR sync cycle that calls getOctokit across runFullSync, runIncrementalSync, getPullRequestFiles, etc. in quick succession, this adds a subprocess per call. Consider a lightweight token store keyed by host (analogous to _tokenRecordCache used for github.com) or move the token fetch inside the cache-miss branch and add a separate short-lived per-host token cache.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/main/core/github/services/octokit-provider.ts
Line: 13-25

Comment:
**GHE token exec on every call, even cache hits**

`githubConnectionService.getToken(host)` is called unconditionally before the cache lookup. For GHE hosts, `getToken` always execs `gh auth token --hostname <host>` — there is no TTL cache for Enterprise tokens. This means every call to `getOctokit('ghe.example.com')` spawns a subprocess even when a valid `Octokit` instance is already in `cache`. For a PR sync cycle that calls `getOctokit` across `runFullSync`, `runIncrementalSync`, `getPullRequestFiles`, etc. in quick succession, this adds a subprocess per call. Consider a lightweight token store keyed by host (analogous to `_tokenRecordCache` used for `github.com`) or move the token fetch inside the cache-miss branch and add a separate short-lived per-host token cache.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +54 to +57
const urlMatch = value.match(
/^https?:\/\/(?:www\.)?([^/\s]+)\/([^/\s]+)\/([^/\s?#]+?)(?:\.git)?(?:[/?#].*)?$/i
);
if (urlMatch) return toRepositoryRef(urlMatch[1], urlMatch[2], urlMatch[3]);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 www. prefix stripped from any hostname, not just github.com

The regex (?:www\.)?([^/\s]+) silently drops the www. prefix before capturing the host, so https://www.ghe.corp.com/org/repo is parsed with host: 'ghe.corp.com' instead of www.ghe.corp.com. apiBaseUrlForHost('ghe.corp.com') would then construct https://ghe.corp.com/api/v3, which may not resolve if the Enterprise instance is only reachable at www.ghe.corp.com. The www. normalization is intentional for github.com, but could silently misdirect API calls for Enterprise instances whose DNS name actually begins with www..

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/shared/github-repository.ts
Line: 54-57

Comment:
**`www.` prefix stripped from any hostname, not just `github.com`**

The regex `(?:www\.)?([^/\s]+)` silently drops the `www.` prefix before capturing the host, so `https://www.ghe.corp.com/org/repo` is parsed with `host: 'ghe.corp.com'` instead of `www.ghe.corp.com`. `apiBaseUrlForHost('ghe.corp.com')` would then construct `https://ghe.corp.com/api/v3`, which may not resolve if the Enterprise instance is only reachable at `www.ghe.corp.com`. The `www.` normalization is intentional for `github.com`, but could silently misdirect API calls for Enterprise instances whose DNS name actually begins with `www.`.

How can I resolve this? If you propose a fix, please make it concise.

Two findings from the automated review on the GHE PR:

1. `getOctokit(host)` was shelling out to `gh auth token --hostname` on
   every call for Enterprise hosts, even on Octokit cache hits. Adds a
   per-host TTL token cache mirroring the existing github.com cache so
   the subprocess only fires on a true cache miss.

2. The URL regex collapsed `www.` from any hostname, which could
   misdirect API calls for Enterprise instances whose DNS legitimately
   starts with `www.`. The regex now captures the host verbatim, and
   only `www.github.com` is normalized to `github.com` in the canonical
   form. Added a test covering `https://www.ghe.example.com/...`.
@rabanspiegel rabanspiegel requested a review from jschwxrz May 23, 2026 19:35
@rabanspiegel
Copy link
Copy Markdown
Contributor

Thanks for your contribution, @Aarekaz ! We'll have a look 🙏

@Aarekaz
Copy link
Copy Markdown
Author

Aarekaz commented May 23, 2026

Welcome! Let me know if I am doing anything wrong and how I can improve!!

@Aarekaz
Copy link
Copy Markdown
Author

Aarekaz commented May 24, 2026

Pushed 81a81f78 to address both findings:

  1. Token subprocess on every getOctokit call. Added a per-host TTL cache in the connection service mirroring the existing one for github.com. The subprocess now only fires on a true cache miss or after the 5-minute TTL expires.

  2. www. strip being too broad. The regex now captures the host verbatim, and only www.github.com is collapsed to github.com (since that's its actual redirect). Added a test for https://www.ghe.example.com to lock the behavior in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[bug]: PR creation fails for Github Enterprise Instance

2 participants