Skip to content

fix(#4): fail closed when background forward has no listener#6

Open
ghost wants to merge 1 commit into
mainfrom
agent/4-forward-listener-probe
Open

fix(#4): fail closed when background forward has no listener#6
ghost wants to merge 1 commit into
mainfrom
agent/4-forward-listener-probe

Conversation

@ghost

@ghost ghost commented Jun 17, 2026

Copy link
Copy Markdown

sandbox_forward() previously returned Ok(()) in background mode even when find_ssh_forward_pid() could not discover the forked SSH process and no local TCP listener was reachable on the forwarded port. This left users with a reported forward that could not be stopped by OpenShell and might not be serving traffic.

Changes:

  • Return an error when PID discovery fails in background mode instead
    of printing a warning and succeeding.
  • Add probe_listener() in openshell-core::forward that attempts a TCP
    connect with retries to confirm the local listener is reachable
    before writing the PID file.
  • If the listener probe fails but a PID was found, kill the orphaned
    SSH process and return an error with a descriptive message.
  • Add kill_process() helper for best-effort PID cleanup.
  • Add integration test reproducing the issue with the fake SSH binary.
  • Update existing sandbox_create_keeps_sandbox_with_forwarding test
    to expect the forward to fail closed (sandbox is still kept).

Note: Rust toolchain was not available in the sandbox environment. Unit and integration tests could not be run locally. Manual verification of the test suite is required.


Closes #4

Post-script verification

  • Branch is not main/master (agent/4-forward-listener-probe)
  • Secret scan passed (gitleaks — 09bd32f35a3b539e6aec895d4dec87d4ec030369..HEAD)
  • Pre-commit hooks passed (authoritative run on runner)
  • Tests ran inside sandbox

sandbox_forward() previously returned Ok(()) in background mode even
when find_ssh_forward_pid() could not discover the forked SSH process
and no local TCP listener was reachable on the forwarded port. This
left users with a reported forward that could not be stopped by
OpenShell and might not be serving traffic.

Changes:
- Return an error when PID discovery fails in background mode instead
  of printing a warning and succeeding.
- Add probe_listener() in openshell-core::forward that attempts a TCP
  connect with retries to confirm the local listener is reachable
  before writing the PID file.
- If the listener probe fails but a PID was found, kill the orphaned
  SSH process and return an error with a descriptive message.
- Add kill_process() helper for best-effort PID cleanup.
- Add integration test reproducing the issue with the fake SSH binary.
- Update existing sandbox_create_keeps_sandbox_with_forwarding test
  to expect the forward to fail closed (sandbox is still kept).

Note: Rust toolchain was not available in the sandbox environment.
Unit and integration tests could not be run locally. Manual
verification of the test suite is required.

Closes #4

Signed-off-by: fullsend-code <289857995+devtest-coder[bot]@users.noreply.github.com>
@ghost

ghost commented Jun 17, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 8:56 PM UTC · Completed 9:07 PM UTC
Commit: 3462faa · View workflow run →

@ghost

ghost commented Jun 17, 2026

Copy link
Copy Markdown

Review

Findings

Medium

  • [fail-open] crates/openshell-core/src/forward.rs:462kill_process() accepts an arbitrary u32 PID and sends SIGTERM without verifying the target process is still the expected SSH forward. The existing stop_forward() function validates ownership via pid_matches_forward() before killing (line 163), but the new kill_process() call path in ssh.rs bypasses this validation entirely. A TOCTOU race between find_ssh_forward_pid() and kill_process() could terminate an unrelated process if the PID is recycled.
    Remediation: Call pid_matches_forward(pid, port, Some(&session.sandbox_id)) before kill_process(pid), mirroring the safety check in stop_forward().

  • [blocking-in-async] crates/openshell-core/src/forward.rs:443probe_listener() uses std::thread::sleep() in a synchronous function called directly from the async sandbox_forward(). This blocks the tokio runtime thread for up to 1 second (10 × 100ms). Other blocking operations in this codebase (e.g., exec_or_wait at ssh.rs:264) use tokio::task::spawn_blocking().
    Remediation: Either make probe_listener async with tokio::time::sleep, or wrap the call site with tokio::task::spawn_blocking().

Low

  • [data-exposure] crates/openshell-core/src/forward.rs:449probe_listener() connects to a user-supplied bind_addr:port from ForwardSpec without validating the address against a known-safe set. Impact is limited (local CLI tool, TCP SYN only), but worth noting.

  • [error-message-style] crates/openshell-cli/src/ssh.rs — New error messages use capitalized first letters ("Could not discover...", "SSH process (pid...") while the established miette::miette!() pattern in this file consistently uses lowercase ("sandbox not found", "ssh exited with status", "no command provided").

  • [function-visibility] crates/openshell-core/src/forward.rskill_process is pub because it is called cross-crate, which is correct. However, its generic name and lack of ownership validation make it a footgun as a public API. Consider a more specific name (e.g., kill_forward_process) or encapsulating the validate-then-kill pattern.

Info

  • [fail-open] crates/openshell-cli/src/ssh.rs:371 — The change from warn-and-continue to return-error is a correct fail-closed improvement that addresses the core issue.

  • [inconsistent-fail-closed] crates/openshell-tui/src/lib.rs:1559 — The TUI forward flow still silently ignores PID discovery failure (the pre-existing behavior this PR fixes in the CLI). Not a regression introduced by this PR, but the new probe_listener/kill_process APIs are not consumed by the TUI.

  • [error-context] crates/openshell-cli/src/ssh.rs:371 — The error message "may not be running" is adequate given that SSH exited successfully but pgrep found no matching background process.

  • [code-organization] crates/openshell-core/src/forward.rs — New functions are well-placed under their own // Listener probe section divider between lsof_listeners and the SSH utility section.

@ghost ghost added the requires-manual-review Review requires human judgment label Jun 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

requires-manual-review Review requires human judgment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Sandbox forward reports success without a reachable local listener

0 participants