feat(e2e-harness): drive and snapshot the real wizard TUI by gewenyu99 · Pull Request #702 · PostHog/wizard

gewenyu99 · 2026-06-21T15:09:00Z

How to test

Agent route — drive the wizard yourself. In a fresh session in this repo, run the exploring-the-wizard skill. wizard-ci is registered in .mcp.json, so the tools are already bound: open_app boots the real TUI on an app, then read_state / perform_action / render_screen (which returns the real rendered screen).

CI snapshots — real-TUI visual regression. From a wizard-workbench checkout next to this repo (PostHog creds in its .env):

cd ../wizard-workbench && pnpm wizard-ci-snapshots

Runs the full real agent flow against express-todo through the real TUI, captures each key moment, diffs the committed baseline, and writes report.html. Or comment /wizard-ci on a PR — same run, posted back as a comment. (Pairs with PostHog/wizard-workbench#2012.)

What this is

A headless e2e control plane that drives the real wizard TUI and captures what it renders. Both routes share one primitive:

Host (scripts/tui-host.no-jest.ts) runs the real startTUI and drives its store by state manipulation — no keystrokes. Auth uses the phx key (same bearer as an OAuth token), so the TUI advances with no browser.
Capture (e2e-harness/tui-capture.ts) runs the host in a PTY (node-pty) and reads the real rendered screen via @xterm/headless.

Routes:

CI snapshots (tui-snapshots): the fixed e2e profile self-drives the host through the real agent run → one real-TUI text snapshot per key moment (including the run screen's progression), diffed against a committed baseline.
Agent (wizard-ci-mcp): an MCP server proxies the host so an agent decides each screen; render_screen returns the real frame. The exploring-the-wizard skill is the how-to.

None of it ships — it lives in e2e-harness/ + scripts/, out of src/.

…ord/replay

A control plane over the TUI store that drives the wizard end-to-end with no terminal and no browser, for CI/e2e and agent-driven testing. The render is a pure function of the nanostore, so driving committed state == driving the UI.

Core files (src/lib/ci-driver/):
- wizard-ci-driver.ts: read_state / list_actions / perform_action over a live WizardStore. read_state is a truthful, secret-free projection of committed state (+ derived currentScreen); perform_action commits via the exact store setter the Ink screen's key handler calls.
- action-registry.ts: declarative screen -> commit-action map (exhaustive over ScreenId/Overlay). The actuation surface: name an action, not a keystroke.
- wizard-ci-tools.ts: in-process MCP server exposing the three tools, so an external harness or LLM can drive a real run.
- e2e-profile.ts: WizardE2eProfile, a program's declarative e2e test definition (the UI choices). decideE2eAction(state, profile) maps screen -> commit, so the harness is generic and the choices live on the program.
- recorder.ts: captures a frame at each key moment (route/task/status/runPhase/overlay change) off the store's version counter; redacts the access token.
- replay.ts: reconstructs a throwaway store per frame and renders the REAL Ink screen back to ANSI, so a run replays in the terminal.
- DRIVING-E2E-FROM-AN-AGENT.md: how a future agent drives these.
- __tests__/: control-plane walk, flow snapshot (TUI-snapshot analog), recorder.

Programs declare their flow's UI choices:
- programs/program-step.ts: ProgramConfig.e2e?: WizardE2eProfile.
- programs/posthog-integration/index.ts: the integration program's e2e profile.

Harness/entry scripts:
- scripts/e2e-full-run.no-jest.ts: headless full run with a real WizardStore + InkUI (never rendered) + concurrent driver + real runAgent; emits a structured result + a recording.
- scripts/replay-e2e.no-jest.ts: replay a recording in the terminal.
- scripts/ci-driver-demo.ts: offline control-plane demo (no agent).

Additive; no core wizard behavior changed. The workbench `wizard-ci --e2e` (PostHog/wizard-workbench) orchestrates these against real test apps.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
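
For readers who want a concrete picture of the screen -> commit mapping that decideE2eAction(state, profile) is described as doing, here is a minimal sketch. The type shapes, the `decideAction` name, and the profile contents are assumptions made for illustration; the real WizardE2eProfile is not shown in this PR text and may look quite different.

```typescript
// Hypothetical shapes: stand-ins for the harness types, not the real ones.
interface DriverState {
  currentScreen: string;
  actions: readonly string[]; // ids of the actions legal on the current screen
}

// Maps a screen id to the action id the headless run should take there.
type E2eProfile = Readonly<Record<string, string>>;

// Returns the action to commit next, or null when the profile has no
// legal choice for the current screen.
function decideAction(state: DriverState, profile: E2eProfile): string | null {
  const wanted = profile[state.currentScreen];
  if (wanted === undefined) return null;
  return state.actions.includes(wanted) ? wanted : null;
}

// Illustrative profile; the action ids are the ones named in this PR.
const exampleProfile: E2eProfile = {
  intro: 'confirm_setup',
  'health-check': 'dismiss_outage',
};
```

Keeping the choices in a plain data object is what lets the harness stay generic: the loop only ever asks "what does the profile say for this screen?".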

github-actions · 2026-06-21T15:09:11Z

🧙 Wizard CI

Run the Wizard CI and test your changes against wizard-workbench example apps by replying with a GitHub comment using one of the following commands:

Test all apps:

/wizard-ci all

Test all apps in a directory:

/wizard-ci basic-integration
/wizard-ci error-tracking-upload-source-maps
/wizard-ci misc
/wizard-ci revenue

Test an individual app:

/wizard-ci basic-integration/android
/wizard-ci basic-integration/angular
/wizard-ci basic-integration/astro

/wizard-ci basic-integration/django
/wizard-ci basic-integration/fastapi
/wizard-ci basic-integration/flask
/wizard-ci basic-integration/javascript-node
/wizard-ci basic-integration/javascript-web
/wizard-ci basic-integration/laravel
/wizard-ci basic-integration/next-js
/wizard-ci basic-integration/nuxt
/wizard-ci basic-integration/python
/wizard-ci basic-integration/rails
/wizard-ci basic-integration/react-native
/wizard-ci basic-integration/react-router
/wizard-ci basic-integration/sveltekit
/wizard-ci basic-integration/swift
/wizard-ci basic-integration/tanstack-router
/wizard-ci basic-integration/tanstack-start
/wizard-ci basic-integration/vue
/wizard-ci error-tracking-upload-source-maps/android
/wizard-ci error-tracking-upload-source-maps/cicd-docker-node-raw
/wizard-ci error-tracking-upload-source-maps/cicd-github-actions-docker-node-raw
/wizard-ci error-tracking-upload-source-maps/cicd-github-actions-nested-docker-node-raw
/wizard-ci error-tracking-upload-source-maps/cicd-github-actions-node-raw
/wizard-ci error-tracking-upload-source-maps/cicd-gitlab-node-raw
/wizard-ci error-tracking-upload-source-maps/cicd-ssh-vps-node-raw
/wizard-ci error-tracking-upload-source-maps/flutter
/wizard-ci error-tracking-upload-source-maps/ios
/wizard-ci error-tracking-upload-source-maps/next
/wizard-ci error-tracking-upload-source-maps/next-no-posthog
/wizard-ci error-tracking-upload-source-maps/node-raw
/wizard-ci error-tracking-upload-source-maps/node-rollup
/wizard-ci error-tracking-upload-source-maps/node-rollup-typescript-plugin
/wizard-ci error-tracking-upload-source-maps/node-webpack
/wizard-ci error-tracking-upload-source-maps/nuxt-3-6
/wizard-ci error-tracking-upload-source-maps/nuxt-4-3
/wizard-ci error-tracking-upload-source-maps/react-native
/wizard-ci error-tracking-upload-source-maps/react-vite
/wizard-ci error-tracking-upload-source-maps/rust
/wizard-ci misc/quack-quack
/wizard-ci revenue/stripe

Results will be posted here when complete.

The e2e UI-choices object moves out of index.ts into a co-located e2e.ts (POSTHOG_INTEGRATION_E2E_PROFILE), keeping the program config lean and the flow's test definition in its own file. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

scripts/record-demo.no-jest.ts — produces a recording offline (no agent, no network) by driving the integration flow with the e2e profile + a WizardRecorder, so `replay-e2e.no-jest.ts` can be tried without a full run. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

scripts/README.md documents the manual control-plane + record/replay tools (what each does, what it needs, how to run). Also commits ci-driver-live-agent.ts (real gateway LLM drives the wizard-ci-tools MCP server) so the index is complete. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

main added two confirm-and-continue intro screens (WarehouseIntro, SelfDrivingIntro, both call store.completeSetup()). The action-registry exhaustiveness test flagged them as uncovered. Register both as confirm_setup in ACTION_REGISTRY and in the e2e walk policy. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
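
One way to get this kind of exhaustiveness is to key the registry by the screen union, so a newly added screen cannot be left out. The sketch below is an assumption-laden illustration: the screen ids, the `SessionFlags` shape, and the compile-time `Record` check are hypothetical, and the PR itself describes a test (not the type system) catching the gap.

```typescript
// Hypothetical screen ids; the real ScreenId union lives in the wizard source.
type ScreenId =
  | 'intro'
  | 'warehouse-intro'
  | 'self-driving-intro'
  | 'health-check'
  | 'auth'
  | 'run';

// Hypothetical stand-in for the committed session state.
interface SessionFlags {
  setupConfirmed: boolean;
  outageDismissed: boolean;
}

interface DriverAction {
  id: string;
  apply: (flags: SessionFlags) => void;
}

const confirmSetup: DriverAction = {
  id: 'confirm_setup',
  apply: (flags) => { flags.setupConfirmed = true; },
};

const dismissOutage: DriverAction = {
  id: 'dismiss_outage',
  apply: (flags) => { flags.outageDismissed = true; },
};

// Record<ScreenId, ...> makes the compiler reject a registry that omits a screen.
// Screens with no driver action (auth, run) are listed explicitly as empty.
const ACTION_REGISTRY: Record<ScreenId, readonly DriverAction[]> = {
  intro: [confirmSetup],
  'warehouse-intro': [confirmSetup],
  'self-driving-intro': [confirmSetup],
  'health-check': [dismissOutage],
  auth: [],
  run: [],
};

function actionsForScreen(screen: ScreenId): readonly DriverAction[] {
  return ACTION_REGISTRY[screen];
}
```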

…l refs Move DRIVING-E2E-FROM-AN-AGENT.md → ARCHITECTURE.md to match the co-located subsystem-doc convention (cf. programs/self-driving/ARCHITECTURE.md). Remove content that shouldn't ship in the public repo: the internal test project id + team name, the workbench test-api-key.txt secret file, and pointers to workbench-only scratch files. Keep the architecture, profiles, record/replay, and MCP-loop guidance; generalize the run instructions. Update the scripts/README link. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

scripts/render-snapshots.no-jest.ts renders every key-moment frame of a recording to a real-Ink ANSI snapshot (one <seq>-<screen>.ans per frame), via replay's renderFrame under tsx. These feed the workbench visual-regression flow. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

None of the control-plane / recording / e2e machinery belongs in the wizard's production source. Relocate src/lib/ci-driver/ → e2e-harness/ at the repo root (next to e2e-tests/), and sever every prod coupling: - Remove the ProgramConfig.e2e field (program-step.ts) and the on-program profile (delete posthog-integration/e2e.ts, unwire index.ts). Per-program profiles now live in the harness — e2e-harness/profiles.ts, profileFor(programId). - Add an @e2e-harness/* path alias (tsconfig.build.json + jest moduleNameMapper); repoint scripts/tests off @lib/ci-driver. Result: src/ has ZERO references to the harness, and the published tsdown bundle contains none of it (previously the ~90-byte profile object shipped). Full suite (1045 tests, 3 snapshots) passes; real-recording render verified under tsx. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

ARCHITECTURE.md now documents the wizard-ci-snapshots visual-regression flow (real run → render → diff → side-by-side report) and the env it needs. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…gram A test/ README documents this program's e2e test definition — the path the headless run walks and the option it auto-takes at each screen (confirm intro, dismiss outage, first setup option, skip mcp/slack, delete skills). It's the human description; the runnable profile stays in e2e-harness/profiles.ts. No e2e machinery returns to prod src — this is documentation only. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…oads Each program declares its e2e test path as src/lib/programs/<program>/test/e2e.json — a `profile` (the options the headless run auto-takes) plus a documented `path` of every screen. The harness imports the `profile` in e2e-harness/profiles.ts (single source of truth, no prose duplication). Matches the repo's existing JSON-data pattern (mcp-role-prompts.copy.json); resolveJsonModule already on. It's data, imported only by the harness — zero prod imports, absent from the tsdown bundle. Full harness suite + runtime load verified. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Add the end-to-end trace (agent → perform_action → driver → action-registry → store.completeSetup → emitChange → router re-resolve → readState) as a comment at the perform_action tool, with cross-referenced breadcrumbs at the driver hop (one committed mutation per call) and the action-registry hop (the store setter + flag-flip the screen sequence reacts to). Harness-only; prod store.ts untouched. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…dule Add a header note to wizard-ci-tools / wizard-ci-driver / action-registry / recorder / replay: each lives in e2e-harness/, is imported only by scripts/tests, and is absent from the tsdown bundle (bin.ts is the only entry). Addresses the "this looks shippable" worry right where a reader meets the code (esp. the MCP server + SDK import). Verified: no e2e symbols in dist/. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

gewenyu99 · 2026-06-22T20:58:50Z

This is a test of snapshotting, not a snapshot.

gewenyu99 · 2026-06-22T21:00:09Z

Instruments the interactivity. We can basically build branching CI on every path we care about.

Moving the trace / never-ships / credentials notes to PR review comments anchored to the lines instead — keep the source uncluttered. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

gewenyu99 · 2026-06-22T21:04:41Z

+  createSdkMcpServer: (opts: unknown) => unknown;
+}> {
+  if (!_sdkModule) {
+    _sdkModule = await import('@anthropic-ai/claude-agent-sdk');


Doesn't pollute prod. This imports the agent SDK and the module builds an MCP server — but the whole harness lives in e2e-harness/, out of src/. No production code imports it, and bin.ts is the only tsdown entry, so it's absent from the published bundle. Verified by grepping every dist/*.js for wizard-ci-tools / WizardCiDriver / read_state → zero hits. (The SDK is dynamically imported so the module also loads where the SDK is jest-mocked.)

gewenyu99 · 2026-06-22T21:04:41Z

+      }),
+  );
+
+  const performAction = tool(


End to end, one perform_action is a single committed store mutation that re-derives the screen:

agent → mcp__wizard-ci-tools__perform_action {action:"confirm_setup"} → driver.performAction("confirm_setup", {}) → actionsForScreen("intro") finds confirm_setup → apply → store.completeSetup() → $session.setKey("setupConfirmed", true); emitChange() → $version 0→1 → router.resolve(session) skips intro (isComplete) → returns "health-check" → driver.readState() → { currentScreen:"health-check", actions:[dismiss_outage], … }

The caller then calls read_state and picks the next action. The screen is re-derived from session state, never navigated to.
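
The trace above can be reduced to a toy model to show why the screen is derived rather than navigated to. Everything here (`MiniStore`, `MiniDriver`, `resolveScreen`, the two-screen flow) is a hypothetical stand-in for the real store, driver, and router, kept only as close as the trace itself allows.

```typescript
type Screen = 'intro' | 'health-check';

interface Session {
  setupConfirmed: boolean;
}

// Toy store: one committed flag plus a version counter.
class MiniStore {
  session: Session = { setupConfirmed: false };
  version = 0;

  completeSetup(): void {
    this.session = { ...this.session, setupConfirmed: true };
    this.version += 1; // stands in for emitChange()
  }
}

// Toy router: the screen is a pure function of session state.
function resolveScreen(session: Session): Screen {
  return session.setupConfirmed ? 'health-check' : 'intro';
}

type Apply = (store: MiniStore) => void;

const ACTIONS: Record<Screen, Record<string, Apply>> = {
  intro: { confirm_setup: (store) => store.completeSetup() },
  'health-check': { dismiss_outage: () => undefined }, // not modeled here
};

interface StateView {
  currentScreen: Screen;
  actions: string[];
  version: number;
}

class MiniDriver {
  store: MiniStore;

  constructor(store: MiniStore) {
    this.store = store;
  }

  readState(): StateView {
    const currentScreen = resolveScreen(this.store.session);
    return {
      currentScreen,
      actions: Object.keys(ACTIONS[currentScreen]),
      version: this.store.version,
    };
  }

  // One call = one committed mutation; the next screen falls out of readState().
  performAction(id: string): StateView {
    const screen = resolveScreen(this.store.session);
    const apply = ACTIONS[screen][id];
    if (apply === undefined) {
      throw new Error(`"${id}" is not a legal action on screen "${screen}"`);
    }
    apply(this.store);
    return this.readState();
  }
}
```

Calling `performAction('confirm_setup')` on a fresh driver bumps the version from 0 to 1 and the returned view reports `health-check`, with no explicit "go to screen" step anywhere.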

gewenyu99 · 2026-06-22T21:04:42Z

+        detectedFrameworkLabel: s.detectedFrameworkLabel,
+        detectionComplete: s.detectionComplete,
+        setupConfirmed: s.setupConfirmed,
+        hasCredentials: s.credentials !== null,


Secrets never reach a driver LLM. Credentials are reduced to hasCredentials + projectId right here — the accessToken is never serialized into read_state. So the whole state snapshot is safe to hand an external model.

Important for safety in CI: no leaked keys.
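
A minimal sketch of that projection, assuming a simplified session shape. The `Session`, `Credentials`, and `PublicState` interfaces and the `projectState` name are hypothetical; only the field names visible in the diff fragment above come from the PR.

```typescript
interface Credentials {
  accessToken: string;
  projectId: number;
}

interface Session {
  detectedFrameworkLabel: string | null;
  detectionComplete: boolean;
  setupConfirmed: boolean;
  credentials: Credentials | null;
}

// What a driver (possibly an external LLM) is allowed to see.
interface PublicState {
  detectedFrameworkLabel: string | null;
  detectionComplete: boolean;
  setupConfirmed: boolean;
  hasCredentials: boolean;
  projectId: number | null;
}

// Build the view field by field instead of spreading the session,
// so a newly added secret field cannot leak by accident.
function projectState(s: Session): PublicState {
  return {
    detectedFrameworkLabel: s.detectedFrameworkLabel,
    detectionComplete: s.detectionComplete,
    setupConfirmed: s.setupConfirmed,
    hasCredentials: s.credentials !== null,
    projectId: s.credentials ? s.credentials.projectId : null,
  };
}
```

The allowlist style is the design point: the token is absent because it was never copied, not because something removed it afterwards.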

gewenyu99 · 2026-06-22T21:04:43Z

+const confirmSetupAction: DriverAction = {
+  id: 'confirm_setup',
+  description: 'Confirm the intro and continue (sets setupConfirmed).',
+  apply: (store) => store.completeSetup(),


Actuation, not keystrokes. apply calls the exact store setter the Ink key handler would: completeSetup() does setKey('setupConfirmed', true) + emitChange(). One commit per action; router.resolve then treats the intro as complete and renders the next screen. The driver names an action — it never injects a keystroke or sees in-progress React-local input.

gewenyu99 · 2026-06-22T21:04:44Z

+  if (!session.credentials) return session;
+  return {
+    ...session,
+    credentials: { ...session.credentials, accessToken: 'phx_***redacted***' },


Recordings redact the token too. Every captured frame runs through redactSession, so accessToken becomes phx_***redacted***. Combined with read_state never serializing it, recordings are safe to share as artifacts.
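
As a rough illustration of the capture path, the sketch below completes the fragment above into a standalone function and routes every captured frame through it. The `Session` shape and the `FrameLog` class are hypothetical; the real recorder is not shown in this PR text.

```typescript
interface Credentials {
  accessToken: string;
  projectId: number;
}

interface Session {
  currentScreen: string;
  credentials: Credentials | null;
}

const REDACTED = 'phx_***redacted***';

// Returns a copy with the token masked; the live session is left untouched.
function redactSession(session: Session): Session {
  if (!session.credentials) return session;
  return {
    ...session,
    credentials: { ...session.credentials, accessToken: REDACTED },
  };
}

// Hypothetical recorder: frames only ever enter the log already redacted.
class FrameLog {
  frames: Session[] = [];

  capture(session: Session): void {
    this.frames.push(redactSession(session));
  }
}
```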

gewenyu99 · 2026-06-22T21:24:03Z

I will remove this before merging, along with the other demo files in this dir.

Drop the three scripts that were scaffolding while building, not part of the shipped feature: - ci-driver-demo.ts offline no-agent control-loop demo (covered by tests) - ci-driver-live-agent.ts manual LLM-drives-MCP proof (needs a key) - record-demo.no-jest.ts offline sample-recording generator (real --e2e records) Keep the three the workbench actually orchestrates: e2e-full-run, render-snapshots, replay-e2e. Update scripts/README.md + ARCHITECTURE.md accordingly. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

EXPLORING-AS-AN-AGENT.md — a runbook for an agent that wants to run/drive/explore the wizard headlessly: ask the user for a key file path + set env, then either a full `wizard-ci --e2e` run or a hand-driven read_state→perform_action loop, with renderFrame to snapshot the TUI for itself to view. Gives wizard-ci-tools its documented use (agentic exploration). Recipe smoke-tested (intro → health-check, renders the real screen). ARCHITECTURE.md points at it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…nt behavior - README: add "Explore with an agent" under Running locally → Testing (was wrongly placed in the workbench README). - scripts/README: drop the cross-PR pointer to the #703 repro scripts. - Trim header/inline comments across the harness + scripts to concise descriptions of what the code does now — no history, no change-rationale. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Move e2e-harness/EXPLORING-AS-AN-AGENT.md into .claude/skills/exploring-the-wizard/ so an agent auto-discovers it. Repoint the README + ARCHITECTURE links and list it in AGENTS.md. ARCHITECTURE.md stays co-located as the how-it-works reference. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…-by-turn scripts/wizard-ci-mcp.no-jest.ts is a stdio MCP server over one live WizardStore: read_state / list_actions / perform_action / render_screen / run_agent. An agent registers it and makes every decision live, instead of the static scripted run. Rewrite the exploring-the-wizard skill to lead with this. Bump zod ^3.24→^3.25 (the MCP SDK needs the zod/v3 subpath; non-breaking) and add the SDK as a dep. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Same resolved version; just the package.json floor, so #701 and #702 don't conflict on the zod line. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

read_state already returns the legal actions, so the separate tool is noise. Keeps the server's surface minimal: read_state, perform_action, render_screen, run_agent. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…hange Running prettier on these (not in lint-staged) reflowed the whole files — pure diff noise. Restore them to main and re-apply just the intended edits: the "Explore with an agent" section + the exploring-the-wizard skill row.

…d runbook EXPLORING-AS-AN-AGENT.md was promoted to .claude/skills/exploring-the-wizard/; this pointer fix was left uncommitted, so HEAD still linked the deleted file.

…ion start The skill told agents to `claude mcp add` then immediately call the tools, which is impossible (MCP servers load at session start), so agents fell back to a script. Lead with the in-session way that actually works — a WizardCiDriver script (read_state → perform_action → renderFrame), tested — and document the MCP server as the interactive option that needs registering before a fresh session.

…with it Connect the stdio transport first and build the store lazily on the first tool call — detection + the networked health probe used to run before connect(), which could stall the MCP handshake so Claude Code saw the server as broken. Verified end-to-end: `claude mcp add` → `claude mcp list` shows ✔ Connected → a headless session drove read_state → perform_action(confirm_setup) → auth → render_screen. Skill now leads with the two-phase MCP flow (register, then drive in a fresh session, since MCP tools bind at session start); the driver script is the fallback.
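
The "connect first, build the store lazily" ordering can be sketched with a memoized factory. This is a generic pattern under assumed names (`lazy`, `getStore`, `handleReadState`, the stand-in store object), not the server's actual code.

```typescript
// Wrap an expensive async factory so it runs at most once, on first use.
function lazy<T>(build: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) pending = build();
    return pending;
  };
}

let buildCount = 0;

// Stand-in for the slow part (framework detection + networked health probe).
const getStore = lazy(async () => {
  buildCount += 1;
  return { currentScreen: 'intro' };
});

// A tool handler awaits the store only when it is actually called,
// so the transport handshake never waits on the build.
async function handleReadState(): Promise<{ currentScreen: string }> {
  const store = await getStore();
  return { currentScreen: store.currentScreen };
}
```

Because concurrent callers share the same pending promise, two tool calls arriving together still trigger a single build.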

…drives in one session Register wizard-ci in .mcp.json so its tools are bound in every session in this repo. An agent following the exploring-the-wizard skill now drives the wizard over MCP (open_app -> read_state -> perform_action -> render_screen -> run_agent) without registering anything or starting a fresh session. The server boots app-agnostic; open_app picks the app + key at call time, so the committed config holds no secrets. Skill + README rewritten to the one-session MCP flow. Verified: a fresh headless agent given only the skill drove the wizard with four MCP calls and wrote zero scripts. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Just say to point appDir at the directory that has the package.json. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

appDir is just the throwaway copy of the app; let the agent find the path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

auth (and run) are NO_ACTION screens: session.credentials is set only inside bootstrapProgram, which runs via run_agent. So nothing advances past auth without run_agent — but the tool description said "call when currentScreen=run" and the skill walk skipped auth, so an agent landed on auth and polled instead of calling run_agent. Fix the run_agent description and the skill walk/key-facts to say run_agent bootstraps creds and advances auth+run; don't poll those screens. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ves the run A real run_agent call blocked the stdio MCP server for ~3 minutes; the client treated the server as unhealthy, reconnected, and the restarted process lost its in-memory store ("No app open", runPhase reset to idle). run_agent now starts the integration in the background and returns immediately; read_state stays responsive and reports runPhase running -> completed plus an integration status, so the agent polls instead of blocking. Skill + tool descriptions updated to the poll model; noted that run_agent creates real PostHog resources each run. Proven: run_agent returns in 0.0s; read_state during the run answers in 1-2ms with runPhase=running. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
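
The start-in-background, poll-for-status model might look like the sketch below. `RunController` and the `'error'` phase are assumptions added for illustration; the PR only names the running -> completed progression.

```typescript
// 'error' is an assumed extra phase; the PR text mentions idle, running, completed.
type RunPhase = 'idle' | 'running' | 'completed' | 'error';

class RunController {
  runPhase: RunPhase = 'idle';

  // Kick off the long-running work and return immediately.
  // Callers poll runPhase (via read_state) instead of awaiting the run.
  start(work: () => Promise<void>): { started: boolean } {
    if (this.runPhase === 'running') return { started: false };
    this.runPhase = 'running';
    void work().then(
      () => { this.runPhase = 'completed'; },
      () => { this.runPhase = 'error'; },
    );
    return { started: true };
  }
}
```

Handling both the fulfilled and rejected branches matters here: an unhandled rejection from the background work would otherwise crash the server process and lose the in-memory store again.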

…or both routes Both e2e routes run the real wizard TUI (startTUI) driven by store state manipulation — no keystrokes — and capture the real rendered screen from a PTY. Auth is satisfied by setCredentials with the phx key (same bearer as an OAuth token), so the TUI advances with no browser. - e2e-harness/tui-capture.ts — run a command in a PTY (node-pty), read its screen via @xterm/headless. - scripts/tui-host.no-jest.ts — the real-TUI host. MODE=fixed self-drives the fixed e2e profile, signals each screen, writes a structured result JSON; MODE=serve takes drive commands over a unix socket. - scripts/tui-snapshots.no-jest.ts — CI route: real-TUI text snapshot per screen. - scripts/wizard-ci-mcp.no-jest.ts — agent route: MCP server proxying the host. - scripts/wizard-ci-explore.no-jest.ts — drive the MCP route, print the real TUI. - scripts/tui-replay.no-jest.ts — replay captured snapshots in the terminal. Deletes the record-then-reconstruct machinery (recorder, replay, e2e-full-run, render-snapshots, replay-e2e) and the in-process wizard-ci-tools server. Adds node-pty + @xterm/headless. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…sition Snapshot on key moments — a screen change, a task-list update, or a runPhase change — via a store subscription, and snap each screen before the driver acts on it. The run screen (the agent working) is captured as it progresses, and fast transitions (intro/auth/outro/mcp/slack) are no longer skipped by throttling. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
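
A minimal sketch of key-moment detection, assuming the three triggers named above (screen, task list, runPhase). The `UiState` shape and `KeyMomentRecorder` are hypothetical, and the settle delay the host applies before capturing is not modeled.

```typescript
// Hypothetical slice of store state that defines a "key moment".
interface UiState {
  currentScreen: string;
  tasks: readonly string[];
  runPhase: string;
}

// Two states belong to the same moment when this key is equal.
function momentKey(s: UiState): string {
  return JSON.stringify([s.currentScreen, s.tasks, s.runPhase]);
}

class KeyMomentRecorder {
  moments: UiState[] = [];
  private lastKey: string | undefined;

  // Call from a store subscription; returns true when a snapshot was taken.
  observe(state: UiState): boolean {
    const key = momentKey(state);
    if (key === this.lastKey) return false;
    this.lastKey = key;
    this.moments.push({ ...state, tasks: [...state.tasks] });
    return true;
  }
}
```

Comparing a derived key rather than throttling by time is what keeps fast transitions from being dropped: every distinct moment is captured once, however briefly it was on screen.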

…ed loop Snapshot on every key-moment change (no throttle spacing, just a settle). And don't await the driver loop at exit — on the cheap (no-agent) path it's parked in waitForChange, so awaiting it hung the process and exited non-zero, which would fail CI. The process now exits 0 cleanly. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The fixed CI route always drives the full real agent run — a no-agent path was pointless (and is what hung at exit). Removes the RUN_AGENT branch and the auth-by-state shortcut it needed in fixed mode; auth is bootstrapped by the run. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

node-pty ships no linux-x64 prebuilt, so CI must compile it; pnpm 10 blocks build scripts unless allowlisted. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

ink renders non-interactively when it detects CI (CI / GITHUB_ACTIONS), leaving the captured xterm buffer blank. Strip them from the spawned host's env. Verified locally: with CI=true, render_screen now returns the real TUI instead of blank. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
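
Stripping those variables before spawning the host could be done along these lines. The `hostEnv` helper is a hypothetical name; the two variable names are the ones this commit message mentions.

```typescript
// Variables that make ink fall back to non-interactive rendering.
const CI_MARKERS: readonly string[] = ['CI', 'GITHUB_ACTIONS'];

// Copy the parent environment, dropping the CI markers and unset entries.
function hostEnv(parent: Record<string, string | undefined>): Record<string, string> {
  const env: Record<string, string> = {};
  for (const [name, value] of Object.entries(parent)) {
    if (value === undefined) continue;
    if (CI_MARKERS.includes(name)) continue;
    env[name] = value;
  }
  return env;
}
```

The result would be passed as the `env` option when spawning the host in the PTY, for example `hostEnv(process.env)`, so the parent CI process keeps its own markers while the child TUI does not see them.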

gewenyu99 and others added 2 commits June 21, 2026 11:11

docs(ci-driver): point the agent guide at the extracted e2e profile file

1c2dca8

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

gewenyu99 mentioned this pull request Jun 21, 2026

fix(security): stop ANTHROPIC_BASE_URL settings overrides redirecting the agent off the PostHog gateway #703

Draft

gewenyu99 and others added 12 commits June 22, 2026 12:20

Merge remote-tracking branch 'origin/main' into e2e-control-plane

b0f4e53

docs(e2e-harness): cross-link the workbench visual-snapshots flow + env

e44fe55

gewenyu99 marked this pull request as ready for review June 22, 2026 20:57

Merge branch 'main' into e2e-control-plane

0e63a30

gewenyu99 commented Jun 22, 2026

revert: drop the explanatory comments from source

cb439ff

gewenyu99 commented Jun 22, 2026

gewenyu99 requested a review from a team June 22, 2026 21:06

gewenyu99 commented Jun 22, 2026

gewenyu99 and others added 17 commits June 22, 2026 17:35

chore: align zod spec to ^3.25.76 (matches the pi stack #701)

59ffdc1

docs: fix dead link — point ARCHITECTURE at the skill, not the delete…

119dba0

docs(e2e-harness): drop "monorepo" wording from open_app guidance

d50da8d

docs(e2e-harness): drop the app-dir hand-holding from open_app

332d9d6

docs: drop stale e2e-full-run reference from a comment

e120b2a

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

gewenyu99 changed the title ~~feat(ci-driver): wizard-ci-tools control plane for headless e2e + record/replay~~ feat(e2e-harness): drive and snapshot the real wizard TUI Jun 23, 2026

gewenyu99 mentioned this pull request Jun 23, 2026

feat(wizard-ci): real-TUI e2e + snapshot review PostHog/wizard-workbench#2012

Draft

gewenyu99 and others added 5 commits June 22, 2026 22:15

build: allow node-pty's build script (compiles pty.node on Linux CI)

4ed8691

gewenyu99 wants to merge 40 commits into main from e2e-control-plane
