fix(plugin)!: invoke lumen via node launcher with lazy binary fetch#154
fix(plugin)!: invoke lumen via node launcher with lazy binary fetch#154
Conversation
Adds a Go integration test that mirrors the consumer invocation path used by Claude Code, Cursor, OpenCode, and Codex MCP hosts: exec.Command on Unix calls posix_spawn(2); on Windows it calls CreateProcessW. No shell sits in the loop, so any byte-0 / shebang / extension regression in the launcher chain fails the test deterministically. The bash-fallback masking that lets the existing scripts/test_run*.sh tests pass against broken polyglots no longer hides the failure. The gate is Phase 1 of the run.cmd polyglot elimination plan: lock the gate in place before any artifact change, so subsequent migrations are caught by CI rather than reaching users (the regression class behind ory#152 and the obra/superpowers v4.3.0 → v4.3.1 → v4.3.2 cycle referenced in anthropics/claude-plugins-official#1692). Phase 1 ships: - launcher.mjs at repo root — minimal Node ESM (~50 lines, no deps) that resolves bin/lumen[.exe] from the plugin root or LUMEN_BIN_PATH override and spawns it with passed args. Phase 2 will enrich this with lazy GitHub Releases fetch + SHA-256 verification + per-host cache; Phase 1 keeps it minimal to land the test gate in isolation. - internal/integration/launcher_spawn_test.go (build tag `integration`): - TestLauncherSpawnStdioHandshake drives a real MCP initialize handshake via mcp.CommandTransport against `node launcher.mjs stdio`, asserting a clean JSON-RPC stream. - TestLauncherSpawnHookSessionStart (claude + cursor subtests) runs the SessionStart hook subcommand and verifies parseable JSON output with clean exit. - TestLauncherSpawnHookPreToolUse exercises the second hook path. - internal/integration/helpers_test.go: repoRoot, launcherPath, lookupBuiltBinary, nodePath, sandboxEnv. sandboxTempDir tolerates Windows file-locking errors caused by the SessionStart hook's detached background indexer (cmd/hook_spawn_windows.go) keeping debug.log / index.db open past test end — the test logic completes successfully, only t.TempDir's strict cleanup was reporting the residual handle as a test failure. CI additions (.github/workflows/ci.yml): - build job uploads bin/lumen as artifact lumen-${matrix.os} for the integration job to consume. - build-windows-cross delegates to `make build` so the cross-compile recipe lives only in .goreleaser.yml (invoked through the oryd/xgoreleaser docker image already used by every release). ci.yml does not duplicate apt-get/cp/CC overrides. Extracts the windows-amd64 binary from `dist/` and uploads as artifact lumen-windows-amd64. - integration job (matrix [ubuntu-latest, macos-latest, windows-latest]) depends on both build jobs, downloads the matching artifact, sets up Node, restores the execute bit on Unix, and runs `go test -tags=integration`. Existing polyglot script-test steps remain in place — defensive coverage during the Phase 2 → Phase 3 soak window in case any phase needs revert.
📝 WalkthroughWalkthroughReplaces platform-specific launcher scripts with a Node.js launcher ( ChangesLauncher & Integration
Sequence DiagramsequenceDiagram
actor Client as Client
participant Launcher as launcher.mjs
participant Cache as Local\ Cache
participant GitHub as GitHub\ Releases
participant Binary as Native\ Binary
rect rgba(200,150,255,0.5)
Note over Launcher,Binary: First run / cache miss
Client->>Launcher: spawn(stdio | hook)
Launcher->>Cache: check for cached binary
Cache-->>Launcher: not found
Launcher->>GitHub: fetch checksums.txt
GitHub-->>Launcher: checksums.txt
Launcher->>GitHub: download asset (stream)
GitHub-->>Launcher: binary stream
Launcher->>Launcher: verify SHA-256 and write .partial
Launcher->>Cache: atomic rename to final path
Cache-->>Launcher: cached binary path
Launcher->>Binary: spawn with inherited stdio
Binary-->>Client: stdio/response
end
rect rgba(150,200,150,0.5)
Note over Launcher,Cache: Cached run (cache hit)
Client->>Launcher: spawn(stdio | hook)
Launcher->>Cache: check for cached binary
Cache-->>Launcher: cached binary path
Launcher->>Binary: spawn with inherited stdio
Binary-->>Client: stdio/response
end
rect rgba(200,200,150,0.5)
Note over Client,Launcher: Concurrent prefetch (lock safety)
Client->>Launcher: prefetch
Client->>Launcher: prefetch
Client->>Launcher: prefetch
Launcher->>Launcher: acquire .lock (winner)
par Waiters
Launcher->>GitHub: fetch (waiters sleep/retry)
Launcher->>GitHub: fetch (waiters sleep/retry)
Launcher->>GitHub: fetch (winner proceeds)
end
Launcher->>Cache: write binary (winner)
Launcher->>Launcher: release .lock
Launcher-->>Client: prefetch OK
Launcher-->>Client: prefetch OK
Launcher-->>Client: prefetch OK
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
A couple of things I'd like to flag before review: The breaking-change marker ( Also, this PR intentionally leaves |
|
|
||
| - [Codex CLI](https://developers.openai.com/codex/cli) | ||
| - Git | ||
| - **Node.js 18+** — Lumen's MCP entry point is `launcher.mjs`, a small Node |
There was a problem hiding this comment.
Node 18 is very old, you should not recommend or require that.
There was a problem hiding this comment.
Good catch — Node 18 EOL'd in April 2025; recommending it as the floor was wrong. Bumped both .codex/INSTALL.md and .opencode/INSTALL.md to Node.js 22+ (current Active LTS) in the latest force-push. Thanks for the review.
0a64464 to
ea32ba9
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (4)
hooks/hooks-cursor.json (1)
4-9: 💤 Low valueOptional: consider mirroring the Claude
prefetchstep for cold-cache first-run experience.
hooks/hooks.jsonruns aprefetchhook before the actualsession-startto avoid the first session paying the ~30 MB GitHub Releases download cost in the user-visible startup window. Cursor's hook system does not appear to have a documented hard timeout the way OpenCode (5 s) and Codex (10 s default) do, so this may be unnecessary — but a first-run user with a slow link will still see a multi-second pause beforesemantic_searchbecomes available. If symmetry with Claude's flow is desired, add aprefetchstep before this hook.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@hooks/hooks-cursor.json` around lines 4 - 9, The session-start hook defined under "sessionStart" currently launches the Cursor plugin via the command node "${CURSOR_PLUGIN_ROOT}/launcher.mjs" (hook session-start lumen --host cursor); to mirror Claude's cold-cache prefetch behavior add a preceding "prefetch" hook entry that runs the same launcher in prefetch mode (i.e., invoke the launcher with the prefetch action or flag your launcher supports) so the heavy download happens before user-visible startup; update the hooks-cursor.json to include this "prefetch" entry before "sessionStart" so semantic_search is primed for first-run users.launcher.mjs (2)
162-186: 💤 Low valueClean up partial file and writable stream on download errors.
The
tmpPathcleanup is only run on the SHA-256 mismatch path. Ifreserrors mid-stream (network reset, idle timeout fires aftersetTimeoutat line 130-132), the promise rejects buttmpPathis left on disk and theoutwritable is not explicitly destroyed. Later runs reusetmpPath(truncated bycreateWriteStream), so it is bounded — but explicit cleanup is cheap and avoids stale partials surviving when the destination dir is cache-shared.♻️ Proposed cleanup on stream error
async function downloadAndVerify(url, expectedSha256, destPath) { const tmpPath = `${destPath}.partial`; fs.mkdirSync(path.dirname(destPath), { recursive: true }); const res = await httpsGet(url); const hash = createHash("sha256"); - await new Promise((resolve, reject) => { - const out = fs.createWriteStream(tmpPath); - res.on("data", (chunk) => hash.update(chunk)); - res.on("error", reject); - out.on("error", reject); - out.on("finish", resolve); - res.pipe(out); - }); + try { + await new Promise((resolve, reject) => { + const out = fs.createWriteStream(tmpPath); + const fail = (err) => { + res.destroy(); + out.destroy(); + reject(err); + }; + res.on("data", (chunk) => hash.update(chunk)); + res.on("error", fail); + out.on("error", fail); + out.on("finish", resolve); + res.pipe(out); + }); + } catch (e) { + try { fs.unlinkSync(tmpPath); } catch {} + throw e; + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@launcher.mjs` around lines 162 - 186, The downloadAndVerify function leaves tmpPath and the writable stream out around network or stream errors; modify the code so that the writable stream (out) is properly destroyed/closed and tmpPath is removed whenever the download promise rejects (not just on SHA mismatch). Concretely, ensure the Promise that wraps res.pipe(out) sets up error handlers to destroy out and unlink tmpPath on res or out errors, or wrap the await new Promise in try/catch/finally to call out.destroy() (if created) and fs.unlinkSync(tmpPath) on error, keeping the SHA-256 check logic intact; reference downloadAndVerify, tmpPath, res, out and the existing httpsGet call to locate where to add this cleanup.
289-300: ⚡ Quick winForward signals to the child to avoid orphaning the lumen process.
With
stdio: "inherit", terminal-driven SIGINT reaches both processes via the foreground process group, so Ctrl+C works. But signals delivered only to the launcher PID (e.g.,kill <pid>, host process supervisor sending SIGTERM, IDE shutting down its hook) are not forwarded — the spawnedlumenkeeps running until it notices stdin closed, leaking a process and its sqlite file lock. Also,child.on("error")andchild.on("exit")can both fire;fatal()afterprocess.exit()is harmless but worth guarding for clarity.♻️ Proposed signal forwarding
const child = spawn(bin, args, { stdio: "inherit" }); + const forward = (sig) => { + try { child.kill(sig); } catch {} + }; + for (const sig of ["SIGINT", "SIGTERM", "SIGHUP", "SIGQUIT"]) { + process.on(sig, () => forward(sig)); + } child.on("exit", (code, signal) => { if (signal && process.platform !== "win32") { process.kill(process.pid, signal); } else { process.exit(code ?? (signal ? 1 : 0)); } }); child.on("error", (err) => { fatal(`failed to spawn binary: ${err.message}`, 126); });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@launcher.mjs` around lines 289 - 300, The launcher currently spawns the child (const child = spawn(...)) and only handles child.on("exit")/child.on("error"); add signal forwarding so signals sent to the launcher are forwarded to the child to avoid orphaning the lumen process: register handlers for SIGINT, SIGTERM, SIGHUP (and SIGQUIT if desired) that call child.kill(signal) (with Windows-safe behavior: avoid unsupported signals on win32 or fallback to child.kill() for termination), and remove those handlers once child emits "exit" to avoid leaks; also guard the fatal(...) call in child.on("error")/child.on("exit") by ensuring you don't call fatal after process.exit (use an exited flag or return early in handlers) so the process flow is clear..opencode/plugins/lumen.js (1)
21-36: ⚡ Quick winEscalate to SIGKILL after the timeout, otherwise a wedged child can hang OpenCode startup forever.
child.kill()sends SIGTERM by default. Iflauncher.mjs(or the lumen binary it eventually spawns) is hung in a state that ignores or delays SIGTERM, theexit/errorevents never fire and the awaited promise never resolves — defeating the very 90s cap this code is designed to enforce. Resolve immediately on the timer (the goal is "don't block startup") or chain a SIGKILL after a short grace window.♻️ Proposed timeout escalation
async function prefetch() { await new Promise((resolve) => { const child = spawn(process.execPath, [launcher, "prefetch"], { stdio: "inherit", }); - const timer = setTimeout(() => { - child.kill(); - }, PREFETCH_TIMEOUT_MS); + let killTimer; + const timer = setTimeout(() => { + child.kill("SIGTERM"); + killTimer = setTimeout(() => { + try { child.kill("SIGKILL"); } catch {} + resolve(); // unblock startup regardless + }, 5000); + }, PREFETCH_TIMEOUT_MS); const done = () => { clearTimeout(timer); + if (killTimer) clearTimeout(killTimer); resolve(); }; child.on("exit", done); child.on("error", done); // best-effort: don't block opencode startup on fetch failure }); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.opencode/plugins/lumen.js around lines 21 - 36, The prefetch promise can hang because child.kill() sends SIGTERM and the child may ignore it; change the timeout handler to (a) resolve the outer promise immediately to avoid blocking startup, and (b) attempt a graceful kill followed by a short escalation window that sends SIGKILL if the process hasn’t exited; update the timeout closure that uses PREFETCH_TIMEOUT_MS to call child.kill() (SIGTERM), call resolve(), and set a secondary timer (e.g., 200–500ms) that calls child.kill("SIGKILL"); ensure you still clear both timers and remove listeners (the done handler on child.exit and child.error) when the child finally exits to avoid leaks; target the prefetch function and the child/timer variables in your change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/ci.yml:
- Around line 164-170: The workflow step "Stage windows binary for upload"
currently copies every lumen.exe found by the find command (find dist -type f
-name 'lumen.exe' -exec cp {} bin/lumen.exe \;) which can select the wrong arch;
change the selection to only the amd64 build by matching the windows amd64
output (e.g., replace the find pattern with one that targets the
dist/*_windows_amd64/lumen.exe path or use find with -path
'*/windows_amd64/lumen.exe') so mkdir -p bin then a single deterministic copy of
the amd64 artifact to bin/lumen.exe is performed and leave the final test -x
bin/lumen.exe as-is.
In `@internal/integration/launcher_prefetch_test.go`:
- Around line 44-45: The test currently ignores errors from
http.NewRequestWithContext and io.ReadAll which can lead to nil requests or
truncated bodies and obscure failures; update the test to capture and check the
error returned by http.NewRequestWithContext (the calls that feed
http.DefaultClient.Do) and fail the test immediately (e.g., t.Fatalf or
require.NoError) if request construction fails, and likewise capture and check
the error from io.ReadAll before using the body (failing the test if reading the
response body errors); apply the same checks to the other occurrences flagged
(the similar NewRequestWithContext and io.ReadAll calls around lines 133-134 and
142-143) so no errors are silently ignored.
---
Nitpick comments:
In @.opencode/plugins/lumen.js:
- Around line 21-36: The prefetch promise can hang because child.kill() sends
SIGTERM and the child may ignore it; change the timeout handler to (a) resolve
the outer promise immediately to avoid blocking startup, and (b) attempt a
graceful kill followed by a short escalation window that sends SIGKILL if the
process hasn’t exited; update the timeout closure that uses PREFETCH_TIMEOUT_MS
to call child.kill() (SIGTERM), call resolve(), and set a secondary timer (e.g.,
200–500ms) that calls child.kill("SIGKILL"); ensure you still clear both timers
and remove listeners (the done handler on child.exit and child.error) when the
child finally exits to avoid leaks; target the prefetch function and the
child/timer variables in your change.
In `@hooks/hooks-cursor.json`:
- Around line 4-9: The session-start hook defined under "sessionStart" currently
launches the Cursor plugin via the command node
"${CURSOR_PLUGIN_ROOT}/launcher.mjs" (hook session-start lumen --host cursor);
to mirror Claude's cold-cache prefetch behavior add a preceding "prefetch" hook
entry that runs the same launcher in prefetch mode (i.e., invoke the launcher
with the prefetch action or flag your launcher supports) so the heavy download
happens before user-visible startup; update the hooks-cursor.json to include
this "prefetch" entry before "sessionStart" so semantic_search is primed for
first-run users.
In `@launcher.mjs`:
- Around line 162-186: The downloadAndVerify function leaves tmpPath and the
writable stream out around network or stream errors; modify the code so that the
writable stream (out) is properly destroyed/closed and tmpPath is removed
whenever the download promise rejects (not just on SHA mismatch). Concretely,
ensure the Promise that wraps res.pipe(out) sets up error handlers to destroy
out and unlink tmpPath on res or out errors, or wrap the await new Promise in
try/catch/finally to call out.destroy() (if created) and fs.unlinkSync(tmpPath)
on error, keeping the SHA-256 check logic intact; reference downloadAndVerify,
tmpPath, res, out and the existing httpsGet call to locate where to add this
cleanup.
- Around line 289-300: The launcher currently spawns the child (const child =
spawn(...)) and only handles child.on("exit")/child.on("error"); add signal
forwarding so signals sent to the launcher are forwarded to the child to avoid
orphaning the lumen process: register handlers for SIGINT, SIGTERM, SIGHUP (and
SIGQUIT if desired) that call child.kill(signal) (with Windows-safe behavior:
avoid unsupported signals on win32 or fallback to child.kill() for termination),
and remove those handlers once child emits "exit" to avoid leaks; also guard the
fatal(...) call in child.on("error")/child.on("exit") by ensuring you don't call
fatal after process.exit (use an exited flag or return early in handlers) so the
process flow is clear.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 183d05cb-fe3b-4ff4-9dfb-7c1bc24a5d7c
📒 Files selected for processing (14)
.claude-plugin/plugin.json.codex/INSTALL.md.cursor-plugin/INSTALL.md.cursor/mcp.json.github/workflows/ci.yml.opencode/INSTALL.md.opencode/plugins/lumen.jsREADME.mdhooks/hooks-cursor.jsonhooks/hooks.jsoninternal/integration/helpers_test.gointernal/integration/launcher_prefetch_test.gointernal/integration/launcher_spawn_test.golauncher.mjs
Migrates every host's mcpServers / hooks command from \`scripts/run.cmd\` (the polyglot whose byte-0 conflict caused ory#152 and the same regression class superpowers cycled through three patches for) to \`node \${PLUGIN_ROOT}/launcher.mjs\`. Phase 2 of the run.cmd polyglot elimination plan, building on the Phase 1 native-spawn integration test gate from the preceding commit. After this commit no live code path reaches \`scripts/run.cmd\`; the polyglot file remains in the tree as dead code through the soak window before Phase 3 deletes it. BREAKING CHANGE: the MCP server command and hook commands now invoke \`node\` instead of a shell-mediated launcher script. Users on Claude Code, Cursor, OpenCode, and Codex must have Node 18+ on the spawn PATH. Most users already do (Claude Code, Cursor, OpenCode all ship Node; Codex's official MCP examples assume \`npx\`). Codex users on a pure Homebrew install with no Node should follow the new .codex/INSTALL.md troubleshooting note. What launcher.mjs does now: - Resolution chain: \$LUMEN_BIN_PATH → <pluginRoot>/bin/lumen[.exe] (developer fast path) → <cacheDir>/bin/<version>/lumen-<version>- <goos>-<goarch>[.exe] (lazy fetch). - Cache dir: \$CLAUDE_PLUGIN_DATA → \$CURSOR_PLUGIN_DATA → \$XDG_DATA_HOME (Unix) / \$LOCALAPPDATA (Windows) → ~/.local/share / os.tmpdir(). - Lazy fetch from GitHub Releases on cache miss. checksums.txt parsed, SHA-256 verified before atomic rename — non-zero exit on mismatch. - File lock at <cacheDir>/.lock (15-min stale recovery) serializes concurrent first-runs. - Per-socket idle timeout (60s) on both request and response streams, so hung connections / mid-stream stalls error out instead of blocking the launcher and any host waiting on its handshake. - Node platform/arch ↔ Go GOOS/GOARCH mapping (darwin/x64 → darwin/amd64 etc.) so asset names match goreleaser. - \`prefetch\` subcommand for cache-warming hooks; \$LUMEN_LAUNCHER_VERBOSE enables stderr diagnostics. - ~280 lines, Node stdlib only, zero third-party dependencies. Per-host migration: - .claude-plugin/plugin.json: command → \${user_config.node_command} (default \"node\"), args → [\${PLUGIN_ROOT}/launcher.mjs, stdio]. New userConfig.node_command lets users with PATH issues set an absolute path without editing the manifest. - hooks/hooks.json: SessionStart now runs prefetch then session-start, warming the cache before the first MCP server invocation. Both hook commands use \${user_config.node_command}. - .cursor/mcp.json + hooks/hooks-cursor.json: command → \"node\", args → launcher.mjs path. Cursor's manifest doesn't support \${user_config.*} substitution; the absolute-path workaround is inline in the docs. - .opencode/plugins/lumen.js: spawn target swapped to [process.execPath, launcher, ...]. New prefetch() runs synchronously before config.mcp.lumen registration with a 90s hard timeout cap so a wedged launcher can't hang opencode startup. OpenCode's MCP startup window (5s documented, 15-30s observed) is too tight for cold-cache fetches; pre-warming during config load bypasses it. - .codex/INSTALL.md: register with \`--startup-timeout-sec 60\` (Codex default 10s is borderline tight). Documents Node 18+ prerequisite, PATH-visibility troubleshooting (codex#6243 absolute-path workaround), config.toml persistence, manual prefetch warm-up, and cache directory location for clean uninstalls. - README.md, .opencode/INSTALL.md, .cursor-plugin/INSTALL.md: install snippets updated to launcher.mjs invocation. New integration tests (build tag \`integration\`): - TestLauncherPrefetch: downloads the matching release asset, verifies SHA-256 against checksums.txt, asserts executable bit, asserts lock cleanup; re-runs and asserts cache hit by mtime. - TestLauncherConcurrentPrefetch: three parallel prefetch processes against a shared cache, asserts only one downloads, no .partial remnants, all exit 0. - isolatedLauncher helper copies launcher.mjs + manifest into a fresh temp dir without a bin/ subdir so the developer fast path doesn't bypass the cache logic the test wants to exercise. - ensureNetwork helper skips network-dependent tests gracefully when github.com is unreachable. What this commit does NOT do: - Phase 3 (delete scripts/run.{cmd,sh,bat} + their tests) — waits for the soak window (two tagged releases or 14 days). - npm \`optionalDependencies\` migration for @ory/lumen-opencode — separate effort, requires distribution-packaging decisions. Closes the structural cause of ory#152.
ea32ba9 to
93e2e55
Compare
|
Addressed in the latest force-push ( Actionable — both applied:
Nitpicks — applied:
Nitpick declined — I'd skip this one. Cursor's MCP startup window (using the upstream MCP TypeScript SDK default |
|
Tip For best results, initiate chat on the files or code changes.
The reasoning for declining the Cursor prefetch nitpick is technically well-founded — Cursor's Triggering a re-review of the force-pushed changes. [review] ✏️ Learnings added
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
.opencode/plugins/lumen.js (1)
21-58: 💤 Low valueOptional:
unref()the child and consider piping its stdio.The prefetch flow is solid (idempotent resolve, SIGTERM→SIGKILL escalation, error/exit both resolve best-effort). Two small, optional refinements:
- After timeout-resolves, the surviving child plus the 500ms
escalateTimerkeep the event loop alive. If opencode shuts down mid-prefetch, this can briefly delay shutdown or leak a child.child.unref()after spawn limits that exposure.stdio: "inherit"surfaces every prefetch log line into opencode's terminal during plugin load. On cold cache that's a download progress dump on first run; on warm cache it's silent. If the launcher already gates verbosity on$LUMEN_LAUNCHER_VERBOSE, you could pipe and discard by default to keep load quiet.Both are deferable — current behavior is correct and bounded.
♻️ Sketch of the unref + quieter stdio change
- const child = spawn(process.execPath, [launcher, "prefetch"], { - stdio: "inherit", - }); + const verbose = !!process.env.LUMEN_LAUNCHER_VERBOSE; + const child = spawn(process.execPath, [launcher, "prefetch"], { + stdio: verbose ? "inherit" : "ignore", + }); + child.unref();🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.opencode/plugins/lumen.js around lines 21 - 58, The prefetch child process keeps the event loop alive and prints directly to the parent terminal; after spawning the child in prefetch() (variable child from spawn(..., { stdio: "inherit" })), call child.unref() so the child won't block shutdown after the timeout/escalation timers (PREFETCH_TIMEOUT_MS and escalateTimer), and change stdio from "inherit" to a piped option (e.g., "ignore" or "pipe") so prefetch logs don't always surface during plugin load—adjust any needed listeners if you choose "pipe" to preserve debug verbosity via LUMEN_LAUNCHER_VERBOSE.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In @.opencode/plugins/lumen.js:
- Around line 21-58: The prefetch child process keeps the event loop alive and
prints directly to the parent terminal; after spawning the child in prefetch()
(variable child from spawn(..., { stdio: "inherit" })), call child.unref() so
the child won't block shutdown after the timeout/escalation timers
(PREFETCH_TIMEOUT_MS and escalateTimer), and change stdio from "inherit" to a
piped option (e.g., "ignore" or "pipe") so prefetch logs don't always surface
during plugin load—adjust any needed listeners if you choose "pipe" to preserve
debug verbosity via LUMEN_LAUNCHER_VERBOSE.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 2a0dc7a4-747a-4390-b0eb-deebdf157646
📒 Files selected for processing (12)
.claude-plugin/plugin.json.codex/INSTALL.md.cursor-plugin/INSTALL.md.cursor/mcp.json.github/workflows/ci.yml.opencode/INSTALL.md.opencode/plugins/lumen.jsREADME.mdhooks/hooks-cursor.jsonhooks/hooks.jsoninternal/integration/launcher_prefetch_test.golauncher.mjs
✅ Files skipped from review due to trivial changes (5)
- hooks/hooks-cursor.json
- .cursor-plugin/INSTALL.md
- hooks/hooks.json
- .opencode/INSTALL.md
- .claude-plugin/plugin.json
🚧 Files skipped from review as they are similar to previous changes (6)
- README.md
- .cursor/mcp.json
- .github/workflows/ci.yml
- launcher.mjs
- internal/integration/launcher_prefetch_test.go
- .codex/INSTALL.md
|
@hypn4 @jkoenig134 Seems we were both working on the same issue and I forgot to check if anyone else had pushed today. Please see put a comparison and my recommendation here #155 (comment) (yes to many of @hypn4's changes) My changes are smaller for an initial fix to get MacOS/Unix working again. However, I do think this PR has a lot of value, just many of the extra additions should be split up outside of the main fix PR. @hypn4 added a strong regression test in #153 ( Good work! |
Summary
Migrates every host's
mcpServers/ hooks command fromscripts/run.cmd(the polyglot whose byte-0 conflict caused #152 and the same regression class superpowers cycled through three patches for) tonode ${PLUGIN_ROOT}/launcher.mjs.This is Phase 2, building on the Phase 1 native-spawn integration test gate in #153. After this PR no live code path reaches
scripts/run.cmd; the polyglot remains in the tree as dead code through the soak window before a separate follow-up PR deletes it.Fixes #152.
Why a Node launcher (industry context)
Three data points show the polyglot pattern is structurally fragile:
The byte 0 of a single file can be exactly one of
#,:, or@— mutually exclusive. No polyglot byte 0 satisfies POSIXposix_spawn, Windowscmd.exeextension dispatch, and shell parsing all at once. Eliminating the polyglot is the only structural fix.A small Node launcher resolves the OS dispatch in JavaScript and spawns the matching native binary. JS handles platform branching; no shell sits in the loop; the byte-0 conflict goes away. This shares the launcher-resolves-OS-and-spawns-binary shape with
@anthropic-ai/claude-codeitself, which preinstalls per-platform binaries via npmoptionalDependenciesrather than lazy-fetching them at first run — the launcher half is the structural-fix common ground; the distribution mechanism (lazy fetch vs preinstall) is the part lumen could migrate to later if you'd prefer.What launcher.mjs does
$LUMEN_BIN_PATH→<pluginRoot>/bin/lumen[.exe](developer fast path frommake build-local) →<cacheDir>/bin/<version>/lumen-<version>-<goos>-<goarch>[.exe](lazy fetch).$CLAUDE_PLUGIN_DATA→$CURSOR_PLUGIN_DATA→$XDG_DATA_HOME(Unix) /$LOCALAPPDATA(Windows) →~/.local/share/os.tmpdir()last resort.checksums.txtis parsed and SHA-256 verified before atomic rename — non-zero exit on mismatch.<cacheDir>/.lock(15-min stale recovery) serializes concurrent first-runs. Tolerates WindowsEPERMtransient state during another process's unlink.darwin/x64→darwin/amd64etc.) so asset names match goreleaser.prefetchsubcommand for cache-warming hooks.$LUMEN_LAUNCHER_VERBOSEenables stderr diagnostics.Per-host migration
mcpServers+ hooks)command: "node", args →[${CLAUDE_PLUGIN_ROOT}/launcher.mjs, …]. SessionStart hook now runsprefetchthensession-start, warming the cache before the MCP first invocation.mcp.json+ hooks)command: "node", args → launcher path..opencode/plugins/lumen.js)[process.execPath, launcher, …]. Newprefetch()runs synchronously beforeconfig.mcp.lumenregistration with a 90 s hard timeout cap, sidestepping OpenCode's tight 5 s MCP startup window for cold-cache fetches..codex/INSTALL.md)codex mcp add lumen --startup-timeout-sec 60 -- node "$CODEX_HOME/lumen/launcher.mjs" stdio. Documents Node 18+ prerequisite, PATH-visibility troubleshooting (codex#6243), config.toml persistence, manual prefetch warm-up, and cache directory location for clean uninstalls.README.md,.opencode/INSTALL.md,.cursor-plugin/INSTALL.mdupdated likewise.What this PR does NOT do
scripts/run.{cmd,sh,bat}and their tests — waits for soak window (separate follow-up PR).optionalDependenciesmigration for@ory/lumen-opencode— separate effort, requires distribution-packaging decisions.Verification
CI
Both fork PRs (Phase 1 mirror and this PR's mirror) green across the full matrix: Build (ubuntu / macos / windows-amd64 cross), Test, Lint, E2E, Integration (ubuntu / macos / windows), plus the existing polyglot Script tests as defensive soak coverage.
Local
Tested via
claude --plugin-dir <repo>on:MCP execution worked correctly on all three; the SessionStart hook ran cleanly, MCP
initializehandshake completed, and lumen tools were callable.Cold-cache lazy-fetch path
Manually verified by removing the developer binary and the existing cache, then running
node launcher.mjs prefetch. Launcher fetched the v0.0.39 release asset, SHA-256-verified againstchecksums.txt, and placed the binary at<cacheDir>/bin/0.0.39/.... Re-invocation was a sub-100 ms cache hit.Depends on
main. If test(integration): add native-spawn handshake gate #153 is merged via squash-and-merge, the resulting commit hash differs from0581c5band this PR will conflict onlauncher.mjs/internal/integration//.github/workflows/ci.yml— happy to rebase once that's landed; just flag your preferred merge style and I'll align.Fixes #152.
Summary by CodeRabbit
New Features
Documentation
CI
Tests