
feat(e2e-harness): drive and snapshot the real wizard TUI#702

Open
gewenyu99 wants to merge 40 commits into main from e2e-control-plane

Conversation

@gewenyu99 gewenyu99 commented Jun 21, 2026

Collaborator

How to test

Agent route — drive the wizard yourself. In a fresh session in this repo, run the exploring-the-wizard skill. wizard-ci is registered in .mcp.json, so the tools are already bound: open_app boots the real TUI on an app, then read_state / perform_action / render_screen (which returns the real rendered screen).

CI snapshots — real-TUI visual regression. From a wizard-workbench checkout next to this repo (PostHog creds in its .env):

cd ../wizard-workbench && pnpm wizard-ci-snapshots

Runs the full real agent flow against express-todo through the real TUI, captures each key moment, diffs the committed baseline, and writes report.html. Or comment /wizard-ci on a PR — same run, posted back as a comment. (Pairs with PostHog/wizard-workbench#2012.)

What this is

A headless e2e control plane that drives the real wizard TUI and captures what it renders. Both routes share one primitive:

  • Host (scripts/tui-host.no-jest.ts) runs the real startTUI and drives its store by state manipulation — no keystrokes. Auth uses the phx key (same bearer as an OAuth token), so the TUI advances with no browser.
  • Capture (e2e-harness/tui-capture.ts) runs the host in a PTY (node-pty) and reads the real rendered screen via @xterm/headless.

Routes:

  • CI snapshots (tui-snapshots): the fixed e2e profile self-drives the host through the real agent run → one real-TUI text snapshot per key moment (including the run screen's progression), diffed against a committed baseline.
  • Agent (wizard-ci-mcp): an MCP server proxies the host so an agent decides each screen; render_screen returns the real frame. The exploring-the-wizard skill is the how-to.

None of it ships — it lives in e2e-harness/ + scripts/, out of src/.
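
The "diffed against a committed baseline" step of the CI route can be pictured with a minimal sketch. It assumes snapshots are plain-text frames compared line by line; `diffFrames` and `FrameDiff` are hypothetical names for illustration, not the harness's actual API.

```typescript
// Hypothetical sketch: compare a captured text frame with its committed baseline.
interface FrameDiff {
  line: number; // 1-based line number in the frame
  expected: string; // baseline text ('' if the line is missing)
  actual: string; // captured text ('' if the line is missing)
}

function diffFrames(baseline: string, captured: string): FrameDiff[] {
  const want = baseline.split('\n');
  const got = captured.split('\n');
  const diffs: FrameDiff[] = [];
  for (let i = 0; i < Math.max(want.length, got.length); i++) {
    // Ignore trailing whitespace: terminal rows are often space-padded.
    const expected = (want[i] ?? '').trimEnd();
    const actual = (got[i] ?? '').trimEnd();
    if (expected !== actual) diffs.push({ line: i + 1, expected, actual });
  }
  return diffs;
}

// The padded first line matches; only the second line is reported.
const frameDiffs = diffFrames('Welcome\n> Continue', 'Welcome   \n> Cancel');
console.log(frameDiffs);
```

An empty result would mean the captured frame matches the baseline; a real report would likely render the two frames side by side rather than as a list.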

…ord/replay

A control plane over the TUI store that drives the wizard end-to-end with no
terminal and no browser, for CI/e2e and agent-driven testing. The render is a
pure function of the nanostore, so driving committed state == driving the UI.

Core files (src/lib/ci-driver/):
- wizard-ci-driver.ts — read_state / list_actions / perform_action over a live
  WizardStore. read_state is a truthful, secret-free projection of committed
  state (+ derived currentScreen); perform_action commits via the exact store
  setter the Ink screen's key handler calls.
- action-registry.ts — declarative screen -> commit-action map (exhaustive over
  ScreenId/Overlay). The actuation surface: name an action, not a keystroke.
- wizard-ci-tools.ts — in-process MCP server exposing the three tools, so an
  external harness or LLM can drive a real run.
- e2e-profile.ts — WizardE2eProfile: a program's declarative e2e test definition
  (the UI choices). decideE2eAction(state, profile) maps screen -> commit, so
  the harness is generic and the choices live on the program.
- recorder.ts — captures a frame at each key moment (route/task/status/runPhase/
  overlay change) off the store's version counter; redacts the access token.
- replay.ts — reconstructs a throwaway store per frame and renders the REAL Ink
  screen back to ANSI, so a run replays in the terminal.
- DRIVING-E2E-FROM-AN-AGENT.md — how a future agent drives these.
- __tests__/ — control-plane walk, flow snapshot (TUI-snapshot analog), recorder.

Programs declare their flow's UI choices:
- programs/program-step.ts — ProgramConfig.e2e?: WizardE2eProfile.
- programs/posthog-integration/index.ts — the integration program's e2e profile.

Harness/entry scripts:
- scripts/e2e-full-run.no-jest.ts — headless full run: real WizardStore + InkUI
  (never rendered) + concurrent driver + real runAgent; emits a structured
  result + a recording.
- scripts/replay-e2e.no-jest.ts — replay a recording in the terminal.
- scripts/ci-driver-demo.ts — offline control-plane demo (no agent).

Additive; no core wizard behavior changed. The workbench `wizard-ci --e2e`
(PostHog/wizard-workbench) orchestrates these against real test apps.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@github-actions


🧙 Wizard CI

Run the Wizard CI and test your changes against wizard-workbench example apps by replying with a GitHub comment using one of the following commands:

Test all apps:

  • /wizard-ci all

Test all apps in a directory:

  • /wizard-ci basic-integration
  • /wizard-ci error-tracking-upload-source-maps
  • /wizard-ci misc
  • /wizard-ci revenue

Test an individual app:

  • /wizard-ci basic-integration/android
  • /wizard-ci basic-integration/angular
  • /wizard-ci basic-integration/astro
  • /wizard-ci basic-integration/django
  • /wizard-ci basic-integration/fastapi
  • /wizard-ci basic-integration/flask
  • /wizard-ci basic-integration/javascript-node
  • /wizard-ci basic-integration/javascript-web
  • /wizard-ci basic-integration/laravel
  • /wizard-ci basic-integration/next-js
  • /wizard-ci basic-integration/nuxt
  • /wizard-ci basic-integration/python
  • /wizard-ci basic-integration/rails
  • /wizard-ci basic-integration/react-native
  • /wizard-ci basic-integration/react-router
  • /wizard-ci basic-integration/sveltekit
  • /wizard-ci basic-integration/swift
  • /wizard-ci basic-integration/tanstack-router
  • /wizard-ci basic-integration/tanstack-start
  • /wizard-ci basic-integration/vue
  • /wizard-ci error-tracking-upload-source-maps/android
  • /wizard-ci error-tracking-upload-source-maps/cicd-docker-node-raw
  • /wizard-ci error-tracking-upload-source-maps/cicd-github-actions-docker-node-raw
  • /wizard-ci error-tracking-upload-source-maps/cicd-github-actions-nested-docker-node-raw
  • /wizard-ci error-tracking-upload-source-maps/cicd-github-actions-node-raw
  • /wizard-ci error-tracking-upload-source-maps/cicd-gitlab-node-raw
  • /wizard-ci error-tracking-upload-source-maps/cicd-ssh-vps-node-raw
  • /wizard-ci error-tracking-upload-source-maps/flutter
  • /wizard-ci error-tracking-upload-source-maps/ios
  • /wizard-ci error-tracking-upload-source-maps/next
  • /wizard-ci error-tracking-upload-source-maps/next-no-posthog
  • /wizard-ci error-tracking-upload-source-maps/node-raw
  • /wizard-ci error-tracking-upload-source-maps/node-rollup
  • /wizard-ci error-tracking-upload-source-maps/node-rollup-typescript-plugin
  • /wizard-ci error-tracking-upload-source-maps/node-webpack
  • /wizard-ci error-tracking-upload-source-maps/nuxt-3-6
  • /wizard-ci error-tracking-upload-source-maps/nuxt-4-3
  • /wizard-ci error-tracking-upload-source-maps/react-native
  • /wizard-ci error-tracking-upload-source-maps/react-vite
  • /wizard-ci error-tracking-upload-source-maps/rust
  • /wizard-ci misc/quack-quack
  • /wizard-ci revenue/stripe

Results will be posted here when complete.

gewenyu99 and others added 2 commits June 21, 2026 11:11
The e2e UI-choices object moves out of index.ts into a co-located e2e.ts
(POSTHOG_INTEGRATION_E2E_PROFILE), keeping the program config lean and the
flow's test definition in its own file.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
gewenyu99 and others added 12 commits June 22, 2026 12:20
scripts/record-demo.no-jest.ts — produces a recording offline (no agent, no
network) by driving the integration flow with the e2e profile + a WizardRecorder,
so `replay-e2e.no-jest.ts` can be tried without a full run.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
scripts/README.md documents the manual control-plane + record/replay tools
(what each does, what it needs, how to run). Also commits ci-driver-live-agent.ts
(real gateway LLM drives the wizard-ci-tools MCP server) so the index is complete.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
main added two confirm-and-continue intro screens (WarehouseIntro,
SelfDrivingIntro, both call store.completeSetup()). The action-registry
exhaustiveness test flagged them as uncovered. Register both as confirm_setup
in ACTION_REGISTRY and in the e2e walk policy.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
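
The exhaustiveness requirement described in this commit can also be illustrated at the type level. This is a minimal sketch, assuming screen ids form a string union; `DemoScreenId`, `DEMO_ACTION_REGISTRY`, and the kebab-case ids are hypothetical stand-ins, and the commit itself describes a runtime test rather than this compile-time check.

```typescript
// Hypothetical sketch: a registry keyed by a screen-id union.
// Adding a new id to DemoScreenId without registering it fails to compile.
type DemoScreenId = 'intro' | 'warehouse-intro' | 'self-driving-intro' | 'auth';

const DEMO_ACTION_REGISTRY: Record<DemoScreenId, readonly string[]> = {
  intro: ['confirm_setup'],
  'warehouse-intro': ['confirm_setup'],
  'self-driving-intro': ['confirm_setup'],
  auth: [], // a screen with no driver action
};

function demoActionsFor(screenId: DemoScreenId): readonly string[] {
  return DEMO_ACTION_REGISTRY[screenId];
}

console.log(demoActionsFor('warehouse-intro'));
```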
…l refs

Move DRIVING-E2E-FROM-AN-AGENT.md → ARCHITECTURE.md to match the co-located
subsystem-doc convention (cf. programs/self-driving/ARCHITECTURE.md). Remove
content that shouldn't ship in the public repo: the internal test project id +
team name, the workbench test-api-key.txt secret file, and pointers to
workbench-only scratch files. Keep the architecture, profiles, record/replay, and
MCP-loop guidance; generalize the run instructions. Update the scripts/README link.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
scripts/render-snapshots.no-jest.ts renders every key-moment frame of a recording
to a real-Ink ANSI snapshot (one <seq>-<screen>.ans per frame), via replay's
renderFrame under tsx. These feed the workbench visual-regression flow.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
None of the control-plane / recording / e2e machinery belongs in the wizard's
production source. Relocate src/lib/ci-driver/ → e2e-harness/ at the repo root
(next to e2e-tests/), and sever every prod coupling:

- Remove the ProgramConfig.e2e field (program-step.ts) and the on-program profile
  (delete posthog-integration/e2e.ts, unwire index.ts). Per-program profiles now
  live in the harness — e2e-harness/profiles.ts, profileFor(programId).
- Add an @e2e-harness/* path alias (tsconfig.build.json + jest moduleNameMapper);
  repoint scripts/tests off @lib/ci-driver.

Result: src/ has ZERO references to the harness, and the published tsdown bundle
contains none of it (previously the ~90-byte profile object shipped). Full suite
(1045 tests, 3 snapshots) passes; real-recording render verified under tsx.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ARCHITECTURE.md now documents the wizard-ci-snapshots visual-regression flow
(real run → render → diff → side-by-side report) and the env it needs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…gram

A test/ README documents this program's e2e test definition — the path the
headless run walks and the option it auto-takes at each screen (confirm intro,
dismiss outage, first setup option, skip mcp/slack, delete skills). It's the
human description; the runnable profile stays in e2e-harness/profiles.ts. No e2e
machinery returns to prod src — this is documentation only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…oads

Each program declares its e2e test path as src/lib/programs/<program>/test/e2e.json
— a `profile` (the options the headless run auto-takes) plus a documented `path`
of every screen. The harness imports the `profile` in e2e-harness/profiles.ts
(single source of truth, no prose duplication). Matches the repo's existing
JSON-data pattern (mcp-role-prompts.copy.json); resolveJsonModule already on.

It's data, imported only by the harness — zero prod imports, absent from the
tsdown bundle. Full harness suite + runtime load verified.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add the end-to-end trace (agent → perform_action → driver → action-registry →
store.completeSetup → emitChange → router re-resolve → readState) as a comment at
the perform_action tool, with cross-referenced breadcrumbs at the driver hop
(one committed mutation per call) and the action-registry hop (the store setter +
flag-flip the screen sequence reacts to). Harness-only; prod store.ts untouched.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…dule

Add a header note to wizard-ci-tools / wizard-ci-driver / action-registry /
recorder / replay: each lives in e2e-harness/, is imported only by scripts/tests,
and is absent from the tsdown bundle (bin.ts is the only entry). Addresses the
"this looks shippable" worry right where a reader meets the code (esp. the MCP
server + SDK import). Verified: no e2e symbols in dist/.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@gewenyu99 gewenyu99 marked this pull request as ready for review June 22, 2026 20:57

Collaborator Author


This is a test of snapshotting, not a snapshot.

Collaborator Author


Instruments the interactivity. We can basically build branching CI on every path we care about.

Moving the trace / never-ships / credentials notes to PR review comments anchored
to the lines instead — keep the source uncluttered.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread e2e-harness/wizard-ci-tools.ts Outdated
createSdkMcpServer: (opts: unknown) => unknown;
}> {
if (!_sdkModule) {
_sdkModule = await import('@anthropic-ai/claude-agent-sdk');

Collaborator Author


Doesn't pollute prod. This imports the agent SDK and the module builds an MCP server — but the whole harness lives in e2e-harness/, out of src/. No production code imports it, and bin.ts is the only tsdown entry, so it's absent from the published bundle. Verified by grepping every dist/*.js for wizard-ci-tools / WizardCiDriver / read_state → zero hits. (The SDK is dynamically imported so the module also loads where the SDK is jest-mocked.)

Comment thread e2e-harness/wizard-ci-tools.ts Outdated
}),
);

const performAction = tool(

Collaborator Author


End to end, one perform_action is a single committed store mutation that re-derives the screen:

agent → mcp__wizard-ci-tools__perform_action {action:"confirm_setup"}
      → driver.performAction("confirm_setup", {})
      → actionsForScreen("intro") finds confirm_setup
      → apply → store.completeSetup()
              → $session.setKey("setupConfirmed", true); emitChange()
              → $version 0→1 → router.resolve(session) skips intro
                (isComplete) → returns "health-check"
      → driver.readState() → { currentScreen:"health-check", actions:[dismiss_outage], … }

The caller then calls read_state and picks the next action. The screen is re-derived from session state, never navigated to.
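
The trace above can be reduced to a small, self-contained sketch. Everything here (`SketchStore`, `resolveScreen`, `SKETCH_ACTIONS`, the `outageDismissed` flag) is a hypothetical stand-in for the real store, router, and action registry; it only illustrates "one action is one committed mutation, and the screen is re-derived from state".

```typescript
// Hypothetical sketch of "one action = one committed mutation, screen re-derived".
interface SketchSession {
  setupConfirmed: boolean;
  outageDismissed: boolean; // assumed flag, for illustration only
}

type SketchScreen = 'intro' | 'health-check' | 'auth';

// The screen is a pure function of session state; nothing navigates to it.
function resolveScreen(s: SketchSession): SketchScreen {
  if (!s.setupConfirmed) return 'intro';
  if (!s.outageDismissed) return 'health-check';
  return 'auth';
}

class SketchStore {
  version = 0;
  session: SketchSession = { setupConfirmed: false, outageDismissed: false };
  completeSetup(): void {
    this.session = { ...this.session, setupConfirmed: true };
    this.version++; // stands in for emitChange()
  }
  dismissOutage(): void {
    this.session = { ...this.session, outageDismissed: true };
    this.version++;
  }
}

// Screen -> legal actions, each bound to exactly one store setter.
const SKETCH_ACTIONS: Record<SketchScreen, Record<string, (st: SketchStore) => void>> = {
  intro: { confirm_setup: (st) => st.completeSetup() },
  'health-check': { dismiss_outage: (st) => st.dismissOutage() },
  auth: {},
};

function readSketchState(store: SketchStore) {
  const currentScreen = resolveScreen(store.session);
  return { currentScreen, version: store.version, actions: Object.keys(SKETCH_ACTIONS[currentScreen]) };
}

function performSketchAction(store: SketchStore, id: string) {
  const apply = SKETCH_ACTIONS[resolveScreen(store.session)][id];
  if (!apply) throw new Error(`action "${id}" is not legal on this screen`);
  apply(store); // exactly one committed mutation
  return readSketchState(store);
}

const sketchStore = new SketchStore();
const afterConfirm = performSketchAction(sketchStore, 'confirm_setup');
console.log(afterConfirm);
```

Calling `confirm_setup` a second time throws in this sketch, because the action is only registered for the intro screen and the state has already moved past it.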

detectedFrameworkLabel: s.detectedFrameworkLabel,
detectionComplete: s.detectionComplete,
setupConfirmed: s.setupConfirmed,
hasCredentials: s.credentials !== null,

Collaborator Author


Secrets never reach a driver LLM. Credentials are reduced to hasCredentials + projectId right here — the accessToken is never serialized into read_state. So the whole state snapshot is safe to hand an external model.
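
A minimal sketch of that projection, assuming a session shape with nullable credentials. `projectSessionState`, `SketchCredentials`, and the numeric `projectId` type are assumptions for illustration; only the `hasCredentials` and `projectId` fields come from the comment above.

```typescript
// Hypothetical sketch: project session state without serializing the token.
interface SketchCredentials {
  accessToken: string;
  projectId: number; // assumed numeric; the real type is not shown here
}

interface ProjectedState {
  setupConfirmed: boolean;
  hasCredentials: boolean;
  projectId: number | null;
}

function projectSessionState(s: {
  setupConfirmed: boolean;
  credentials: SketchCredentials | null;
}): ProjectedState {
  return {
    setupConfirmed: s.setupConfirmed,
    hasCredentials: s.credentials !== null,
    projectId: s.credentials ? s.credentials.projectId : null,
    // accessToken is deliberately not copied.
  };
}

const projected = projectSessionState({
  setupConfirmed: true,
  credentials: { accessToken: 'phx_example_not_a_real_key', projectId: 1 },
});
console.log(JSON.stringify(projected));
```

Building the projection field by field, rather than spreading the session and deleting the token, means a newly added secret field stays out of the snapshot unless someone copies it on purpose.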

Collaborator Author


Important for safety in CI: no leaked keys.

const confirmSetupAction: DriverAction = {
id: 'confirm_setup',
description: 'Confirm the intro and continue (sets setupConfirmed).',
apply: (store) => store.completeSetup(),

Collaborator Author


Actuation, not keystrokes. apply calls the exact store setter the Ink key handler would: completeSetup() does setKey('setupConfirmed', true) + emitChange(). One commit per action; router.resolve then treats the intro as complete and renders the next screen. The driver names an action — it never injects a keystroke or sees in-progress React-local input.

Comment thread e2e-harness/recorder.ts Outdated
if (!session.credentials) return session;
return {
...session,
credentials: { ...session.credentials, accessToken: 'phx_***redacted***' },

Collaborator Author


Recordings redact the token too. Every captured frame runs through redactSession, so accessToken becomes phx_***redacted***. Combined with read_state never serializing it, recordings are safe to share as artifacts.
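
For reference, the redaction step quoted in this thread can be written out as a complete function. This is a sketch under assumed types: `RecordedSession` and `RecordedCredentials` are hypothetical shapes, and `redactSessionSketch` mirrors the excerpt above rather than reproducing the harness file.

```typescript
// Hypothetical sketch of per-frame token redaction (types are assumed).
interface RecordedCredentials {
  accessToken: string;
  projectId: number;
}

interface RecordedSession {
  setupConfirmed: boolean;
  credentials: RecordedCredentials | null;
}

const REDACTED_TOKEN = 'phx_***redacted***';

function redactSessionSketch(session: RecordedSession): RecordedSession {
  if (!session.credentials) return session; // nothing to redact
  // Copy instead of mutating, so the live session keeps its real token.
  return {
    ...session,
    credentials: { ...session.credentials, accessToken: REDACTED_TOKEN },
  };
}

const liveSession: RecordedSession = {
  setupConfirmed: true,
  credentials: { accessToken: 'phx_example_not_a_real_key', projectId: 1 },
};
const recordedFrame = redactSessionSketch(liveSession);
console.log(recordedFrame.credentials);
```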

@gewenyu99 gewenyu99 requested a review from a team June 22, 2026 21:06
Comment thread scripts/ci-driver-demo.ts Outdated

@gewenyu99 gewenyu99 Jun 22, 2026

Collaborator Author


I will remove this before merging, along with the other demo files in this directory.

Drop the three scripts that were scaffolding while building, not part of the
shipped feature:
- ci-driver-demo.ts        offline no-agent control-loop demo (covered by tests)
- ci-driver-live-agent.ts  manual LLM-drives-MCP proof (needs a key)
- record-demo.no-jest.ts   offline sample-recording generator (real --e2e records)

Keep the three the workbench actually orchestrates: e2e-full-run, render-snapshots,
replay-e2e. Update scripts/README.md + ARCHITECTURE.md accordingly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
gewenyu99 and others added 17 commits June 22, 2026 17:35
EXPLORING-AS-AN-AGENT.md — a runbook for an agent that wants to run/drive/explore
the wizard headlessly: ask the user for a key file path + set env, then either a
full `wizard-ci --e2e` run or a hand-driven read_state→perform_action loop, with
renderFrame to snapshot the TUI for itself to view. Gives wizard-ci-tools its
documented use (agentic exploration). Recipe smoke-tested (intro → health-check,
renders the real screen). ARCHITECTURE.md points at it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…nt behavior

- README: add "Explore with an agent" under Running locally → Testing (was wrongly
  placed in the workbench README).
- scripts/README: drop the cross-PR pointer to the #703 repro scripts.
- Trim header/inline comments across the harness + scripts to concise descriptions
  of what the code does now — no history, no change-rationale.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move e2e-harness/EXPLORING-AS-AN-AGENT.md into .claude/skills/exploring-the-wizard/
so an agent auto-discovers it. Repoint the README + ARCHITECTURE links and list it
in AGENTS.md. ARCHITECTURE.md stays co-located as the how-it-works reference.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…-by-turn

scripts/wizard-ci-mcp.no-jest.ts is a stdio MCP server over one live WizardStore:
read_state / list_actions / perform_action / render_screen / run_agent. An agent
registers it and makes every decision live, instead of the static scripted run.
Rewrite the exploring-the-wizard skill to lead with this. Bump zod ^3.24→^3.25
(the MCP SDK needs the zod/v3 subpath; non-breaking) and add the SDK as a dep.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Same resolved version; just the package.json floor, so #701 and #702 don't
conflict on the zod line.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
read_state already returns the legal actions, so the separate tool is noise.
Keeps the server's surface minimal: read_state, perform_action, render_screen,
run_agent.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…hange

Running prettier on these (not in lint-staged) reflowed the whole files — pure
diff noise. Restore them to main and re-apply just the intended edits: the
"Explore with an agent" section + the exploring-the-wizard skill row.
…d runbook

EXPLORING-AS-AN-AGENT.md was promoted to .claude/skills/exploring-the-wizard/;
this pointer fix was left uncommitted, so HEAD still linked the deleted file.
…ion start

The skill told agents to `claude mcp add` then immediately call the tools, which
is impossible (MCP servers load at session start), so agents fell back to a
script. Lead with the in-session way that actually works — a WizardCiDriver
script (read_state → perform_action → renderFrame), tested — and document the MCP
server as the interactive option that needs registering before a fresh session.
…with it

Connect the stdio transport first and build the store lazily on the first tool
call — detection + the networked health probe used to run before connect(), which
could stall the MCP handshake so Claude Code saw the server as broken. Verified
end-to-end: `claude mcp add` → `claude mcp list` shows ✔ Connected → a headless
session drove read_state → perform_action(confirm_setup) → auth → render_screen.

Skill now leads with the two-phase MCP flow (register, then drive in a fresh
session, since MCP tools bind at session start); the driver script is the fallback.
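
The "connect first, build the store lazily" ordering can be sketched with a memoized initializer. `lazyOnce` and `getSketchStore` are hypothetical names, and the builder body only stands in for detection plus the networked health probe; this is not the server's actual code.

```typescript
// Hypothetical sketch: defer expensive setup until the first tool call.
function lazyOnce<T>(build: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (!pending) pending = build(); // concurrent callers share one build
    return pending;
  };
}

let storeBuilds = 0;
const getSketchStore = lazyOnce(async () => {
  storeBuilds++; // stands in for detection + the networked health probe
  return { ready: true };
});

// Nothing has been built yet, so a transport could connect at this point
// without waiting on the probe. Each tool handler would then call
// getSketchStore() before touching state.
console.log('builds before first tool call:', storeBuilds);
```

Caching the promise rather than the resolved value means two tool calls that arrive during startup still trigger a single build.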
…drives in one session

Register wizard-ci in .mcp.json so its tools are bound in every session in this
repo. An agent following the exploring-the-wizard skill now drives the wizard over
MCP (open_app -> read_state -> perform_action -> render_screen -> run_agent)
without registering anything or starting a fresh session. The server boots
app-agnostic; open_app picks the app + key at call time, so the committed config
holds no secrets. Skill + README rewritten to the one-session MCP flow.

Verified: a fresh headless agent given only the skill drove the wizard with four
MCP calls and wrote zero scripts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Just say to point appDir at the directory that has the package.json.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
appDir is just the throwaway copy of the app; let the agent find the path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
auth (and run) are NO_ACTION screens: session.credentials is set only inside
bootstrapProgram, which runs via run_agent. So nothing advances past auth without
run_agent — but the tool description said "call when currentScreen=run" and the
skill walk skipped auth, so an agent landed on auth and polled instead of calling
run_agent. Fix the run_agent description and the skill walk/key-facts to say
run_agent bootstraps creds and advances auth+run; don't poll those screens.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ves the run

A real run_agent call blocked the stdio MCP server for ~3 minutes; the client
treated the server as unhealthy, reconnected, and the restarted process lost its
in-memory store ("No app open", runPhase reset to idle). run_agent now starts the
integration in the background and returns immediately; read_state stays responsive
and reports runPhase running -> completed plus an integration status, so the agent
polls instead of blocking. Skill + tool descriptions updated to the poll model;
noted that run_agent creates real PostHog resources each run.

Proven: run_agent returns in 0.0s; read_state during the run answers in 1-2ms with
runPhase=running.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
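
The poll model described in this commit can be sketched as follows. `BackgroundRun` and its members are hypothetical; the sketch assumes the run is a promise-returning function and shows only the shape "start returns immediately, phase is read separately".

```typescript
// Hypothetical sketch: start long work in the background and expose a pollable phase.
type SketchRunPhase = 'idle' | 'running' | 'completed' | 'error';

class BackgroundRun {
  runPhase: SketchRunPhase = 'idle';
  private done: Promise<void> | undefined;

  // Starts the work and returns immediately; callers poll runPhase.
  start(work: () => Promise<void>): { started: boolean } {
    if (this.runPhase === 'running') return { started: false };
    this.runPhase = 'running';
    this.done = work().then(
      () => {
        this.runPhase = 'completed';
      },
      () => {
        this.runPhase = 'error';
      },
    );
    return { started: true };
  }

  // For callers that do want to wait, such as a test.
  settled(): Promise<void> {
    return this.done ?? Promise.resolve();
  }
}

const bgRun = new BackgroundRun();
const bgStart = bgRun.start(async () => {
  // stands in for the real integration run
});
console.log(bgStart, bgRun.runPhase);
```

Because `start` never awaits the work, a tool handler built this way returns to the client right away, and a separate state read can report the phase while the run is still in flight.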
…or both routes

Both e2e routes run the real wizard TUI (startTUI) driven by store state
manipulation — no keystrokes — and capture the real rendered screen from a PTY.
Auth is satisfied by setCredentials with the phx key (same bearer as an OAuth
token), so the TUI advances with no browser.

- e2e-harness/tui-capture.ts — run a command in a PTY (node-pty), read its screen
  via @xterm/headless.
- scripts/tui-host.no-jest.ts — the real-TUI host. MODE=fixed self-drives the
  fixed e2e profile, signals each screen, writes a structured result JSON;
  MODE=serve takes drive commands over a unix socket.
- scripts/tui-snapshots.no-jest.ts — CI route: real-TUI text snapshot per screen.
- scripts/wizard-ci-mcp.no-jest.ts — agent route: MCP server proxying the host.
- scripts/wizard-ci-explore.no-jest.ts — drive the MCP route, print the real TUI.
- scripts/tui-replay.no-jest.ts — replay captured snapshots in the terminal.

Deletes the record-then-reconstruct machinery (recorder, replay, e2e-full-run,
render-snapshots, replay-e2e) and the in-process wizard-ci-tools server. Adds
node-pty + @xterm/headless.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@gewenyu99 gewenyu99 changed the title from "feat(ci-driver): wizard-ci-tools control plane for headless e2e + record/replay" to "feat(e2e-harness): drive and snapshot the real wizard TUI" Jun 23, 2026
gewenyu99 and others added 5 commits June 22, 2026 22:15
…sition

Snapshot on key moments — a screen change, a task-list update, or a runPhase
change — via a store subscription, and snap each screen before the driver acts on
it. The run screen (the agent working) is captured as it progresses, and fast
transitions (intro/auth/outro/mcp/slack) are no longer skipped by throttling.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
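
A minimal sketch of key-moment detection as this commit describes it: snapshot only when the screen, the task list, or the run phase changes. `MomentState`, `momentKey`, and `createMomentRecorder` are hypothetical names, and the returned callback is assumed to be wired to a store subscription.

```typescript
// Hypothetical sketch: take a snapshot only when a key moment changes.
interface MomentState {
  screenId: string;
  tasks: string[];
  runPhase: string;
}

// A key moment is any change in screen, task list, or run phase.
function momentKey(s: MomentState): string {
  return JSON.stringify([s.screenId, s.tasks, s.runPhase]);
}

function createMomentRecorder(snap: (s: MomentState) => void) {
  let lastKey: string | undefined;
  // Intended to be called from a store subscription on every change.
  return (s: MomentState): void => {
    const key = momentKey(s);
    if (key === lastKey) return; // unrelated state change: no new snapshot
    lastKey = key;
    snap(s);
  };
}

const snappedScreens: string[] = [];
const onStoreChange = createMomentRecorder((s) => snappedScreens.push(`${s.screenId}:${s.runPhase}`));
onStoreChange({ screenId: 'intro', tasks: [], runPhase: 'idle' });
onStoreChange({ screenId: 'intro', tasks: [], runPhase: 'idle' }); // duplicate, skipped
onStoreChange({ screenId: 'run', tasks: ['install'], runPhase: 'running' });
onStoreChange({ screenId: 'run', tasks: ['install', 'configure'], runPhase: 'running' });
console.log(snappedScreens);
```

Keying on the change itself, rather than on elapsed time, is what lets fast transitions still produce a frame.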
…ed loop

Snapshot on every key-moment change (no throttle spacing, just a settle). And
don't await the driver loop at exit — on the cheap (no-agent) path it's parked in
waitForChange, so awaiting it hung the process and exited non-zero, which would
fail CI. The process now exits 0 cleanly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The fixed CI route always drives the full real agent run — a no-agent path was
pointless (and is what hung at exit). Removes the RUN_AGENT branch and the
auth-by-state shortcut it needed in fixed mode; auth is bootstrapped by the run.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
node-pty ships no linux-x64 prebuilt, so CI must compile it; pnpm 10 blocks build
scripts unless allowlisted.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ink renders non-interactively when it detects CI (CI / GITHUB_ACTIONS), leaving
the captured xterm buffer blank. Strip them from the spawned host's env. Verified
locally: with CI=true, render_screen now returns the real TUI instead of blank.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
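
Stripping the CI markers from the spawned host's environment can be sketched as a small pure function. `withoutCiMarkers` is a hypothetical helper; only the two variable names (`CI`, `GITHUB_ACTIONS`) come from the commit message above.

```typescript
// Hypothetical sketch: copy an environment without the CI markers.
const CI_MARKERS = ['CI', 'GITHUB_ACTIONS'] as const;

function withoutCiMarkers(
  env: Record<string, string | undefined>,
): Record<string, string | undefined> {
  const cleaned = { ...env }; // never mutate the caller's environment
  for (const marker of CI_MARKERS) delete cleaned[marker];
  return cleaned;
}

// In a harness this would typically be applied to the parent process
// environment before it is handed to the PTY spawn call.
const hostEnv = withoutCiMarkers({ CI: 'true', GITHUB_ACTIONS: 'true', TERM: 'xterm-256color' });
console.log(Object.keys(hostEnv));
```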

1 participant