feat: mxc-sdk lib crate#524
Conversation
Add a new `core/mxc` library crate — the Rust analogue of the SDK's `spawnSandboxFromConfig` with `usePty: false`. It parses the same JSON config the executor binaries consume, selects the host containment backend, runs the sandboxed process without ever allocating a pty, and returns captured stdout/stderr in a `ScriptResponse`. Supported backends: Bubblewrap (Linux), Seatbelt (macOS), and ProcessContainer — AppContainer plus the BaseContainer fallback — (Windows). Other backends return `MxcError::unsupported_containment`. Output capture is gated behind the new `ExecutionRequest::capture_output` flag (default `false`, preserving the binaries' streaming behaviour; the library forces it `true`): - Seatbelt gains a non-pty captured exec path (mirrors bubblewrap). - AppContainer/BaseContainer gain a capture path using CreatePipe + reader threads, reusing the existing `process_util` helpers. - Bubblewrap already captures. Backend selection is centralised in `mxc::dispatch::select_runner`, and the `wxc`, `lxc`, and `mxc_darwin` binaries now delegate their shared backend arms to it so the selection logic has a single home. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
Add a streaming, handle-based spawn API alongside the run-to-completion entrypoint, so importers can drive a sandboxed process while it runs: persistent bidirectional stdio plus termination. No pty is allocated; the streams are ordinary pipes. - `wxc_common::sandbox_process`: new `SandboxProcess` handle trait (`take_stdin`/`take_stdout`/`take_stderr` -> boxed Write/Read, `try_wait`, `kill`, `wait`) and the `StreamingRunner` spawn trait. Interfaces live in `wxc_common`; impls live in the backend crates. - Seatbelt: `SeatbeltSandboxProcess` + `StreamingRunner` impl — spawns the sandboxed shell with piped stdio, `kill()` does graceful SIGTERM then SIGKILL after a grace period, and `wait()` drains any stream the caller did not take into the `ScriptResponse`. - `mxc::spawn_sandbox` + `dispatch::spawn_runner` return the handle (cfg-split; Seatbelt implemented, other backends report `unsupported_containment` for now). Re-exports `SandboxProcess`. Verified on macOS (bidirectional stdio against `cat`, kill of `sleep`, and untaken-stream capture via `wait`). Linux/Windows streaming impls are follow-ups. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
Two Seatbelt backend bugs: - Non-zero exits reported `failurePhase: None`. The runner never set `failure_phase`, so every result defaulted to `None`. Add an `exit_response` helper that tags `ProcessExited` on a non-zero exit (and `None` on success), and route all run-completed paths (exec/pty, captured, open, and the streaming `wait()`) through it. - With no `working_directory` set, the child inherited the host cwd; under the deny-by-default profile that directory (or its parents) can be unreadable, so the child's `getcwd()` walked parent dirs and leaked "getcwd: ... Operation not permitted" to stderr. Default the cwd to a sandbox-allowed path (first readwrite, else readonly, else `/`) and export a matching `PWD` so `getcwd()` uses its fast stat path instead of enumerating unreadable parents. Adds regression tests for both (failure phase on non-zero/zero exit and no getcwd leak when no working directory is set). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
Implement the handle-based streaming API (live stdio + kill) for the Windows AppContainer backend, the counterpart to the Seatbelt support. - Add Win32 `PipeReader`/`PipeWriter` adapters in `process_util` so the child's pipe ends can be exposed as `Read`/`Write`. - Refactor `AppContainerScriptRunner` so the monolithic run path is split into a shared `spawn_suspended` (setup + CreateProcess, suspended) and a `SpawnedChild` that either runs to completion (existing blocking path, behaviour preserved) or is wrapped in a streaming handle. Per-run firewall/BFS policy setup/teardown is factored into `prepare`/`teardown` so both paths share it. - Add `AppContainerSandboxProcess` (impl `SandboxProcess`): take_stdin/ stdout/stderr, try_wait, kill (TerminateProcess), and wait (drains untaken streams, runs teardown, adds exit diagnostics). Tears down firewall/BFS policy on wait or drop. - Wire `mxc::dispatch::spawn_runner` for ProcessContainer on Windows (AppContainer fast path). The BaseContainer fallback returns `unsupported_containment` for now (streaming pending). Cross-checked for x86_64-pc-windows-msvc; the in-process Windows path is not runtime-verified here and needs Windows CI. macOS run path and tests unaffected. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
Complete Windows streaming by adding the handle-based path for the BaseContainer backend (the OS sandbox API used for experimental / newer-schema configs), the common case for the library on modern schemas. - Split BaseContainerRunner::execute() into a shared `spawn_base` (proxy/spec/identity setup + CreateProcessInSandbox, returning a `BaseChild`) and a `run_to_completion` for the blocking path; behaviour of the run path is preserved. - `BaseChild` owns the process handle, parent-side pipes, and the per-run proxy/sandbox-tracking state, and performs `run_sandbox_cleanup` + proxy stop + exit diagnostics after the child exits. - Add `BaseContainerSandboxProcess` (impl `SandboxProcess`): live stdio, kill (TerminateProcess), try_wait, and wait (drains untaken streams, tears down sandbox/proxy, shapes the response). Drop runs teardown. - `mxc::dispatch::spawn_runner` now selects AppContainer vs BaseContainer for ProcessContainer streaming, mirroring the run-to-completion choice. Cross-checked for x86_64-pc-windows-msvc; like the AppContainer path the in-process Windows behaviour (and resource teardown ordering) is not runtime-verified here and needs Windows CI. Linux/Bubblewrap streaming remains a follow-up. macOS tests unaffected. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
Before backend selection, `select_runner` and `spawn_runner` now check the host OS and return a clear message on platforms with no MXC backend: "the mxc library has no sandbox backend for this host OS (supported: Windows, Linux, macOS)". Previously an unsupported host (e.g. FreeBSD) fell through to the default ProcessContainer arm and surfaced the misleading "...only available on Windows" error. Supported hosts are unaffected. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
Callers cancelling/timing out a sandboxed command need to take its whole process tree down (pipelines, `&`, servers), not just the root. Add the pid and make kill() tree-kill on every streaming backend: - `SandboxProcess::id() -> u32` exposes the child's OS process id for external monitoring or a caller-driven tree kill. - Seatbelt: spawn the streaming child with `setsid()` so it leads its own process group; kill() now signals the whole group (negative-pid SIGTERM then SIGKILL), which is safe even if setsid failed (a negative pid only targets that pid's group, never the host's). - Windows AppContainer: kill() terminates the job object (new `UiJobObject::terminate`) instead of just the root process. - Windows BaseContainer: best-effort-assign the child to a job object after creation so kill() can terminate the tree (falls back to the root process if assignment fails). Adds a macOS test that backgrounds a descendant and asserts kill() reaps it. Windows paths cross-checked only (need Windows CI). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
Add the handle-based streaming API (live stdio, kill, id) for the Bubblewrap (Linux) backend, completing streaming across every backend the library supports. - Split BubblewrapScriptRunner::execute() into a shared `spawn_bwrap` (proxy + iptables setup + bwrap spawn, returning a `BwrapChild`) and a `run_to_completion` for the blocking path; run-path behaviour preserved. - Streaming spawns bwrap with piped stdin and `process_group(0)` so the handle can tree-kill it: since bwrap is PID 1 of the sandbox pid namespace (`--unshare-pid`), a `killpg` on its group tears the whole sandbox down. `BubblewrapSandboxProcess` (impl `SandboxProcess`) owns the network proxy/iptables state and tears it down on wait or drop. - Wire `mxc::dispatch::spawn_runner` for the Bubblewrap arm on Linux. Cross-checked + clippy-clean for aarch64-unknown-linux-gnu (via the stable toolchain, since the pinned toolchain's x86_64-linux std is unavailable here); runtime still needs Linux CI with bwrap. macOS run path/tests unaffected. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
The PermissionDenied-vs-generic `Command::spawn` failure message was copy-pasted across the GUI, captured, and streaming exec paths. Extract a private `spawn_error(&io::Error) -> String` helper; behaviour unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
Port the 5 `@microsoft/mxc-sdk` config-building helpers to Rust so callers no longer need the TypeScript module to build a spawnable config. - `mxc::policy`: `SandboxPolicy` (mirrors the SDK type) and `build_request` — the port of `createConfigFromPolicy`, restricted to the backends the crate runs (`Containment::Process` resolves to Seatbelt / Bubblewrap / ProcessContainer per host; `Containment::Bubblewrap` forces Bubblewrap). It mirrors the SDK field mapping and network proxy / host-filtering validation, building the same wire config internally and running it through the shared parser (with allow_missing_command, since the caller fills script_code), so validation and the wire->model mapping match production. Output is an `ExecutionRequest`. - `mxc::policy` discovery helpers `available_tools_policy`, `user_profile_policy`, `temporary_files_policy` (ports of policy.ts) — PATH + tool/SDK env dirs, user-profile dirs, and the host temp dir. - `mxc::platform::platform_support` — port of `getPlatformSupport`; on Windows it reads the isolation tier / UI capabilities from the in-process fallback probe instead of shelling out to `wxc-exec --probe`. - Add `spawn_streaming_from_request` to round out the request-based API. - Derive `Clone` on the probe's `UiCapabilitySupport` so it can be carried on the cloneable `PlatformSupport`. Verified on macOS (9 new tests incl. policy -> build_request -> real Seatbelt run, plus the existing suite; clippy + fmt clean). Cross-checked for x86_64-pc-windows-msvc and aarch64-unknown-linux-gnu; their runtime behaviour still needs platform CI. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
…ignment `spawn_base` wrapped `pi.hProcess` in a temporary `OwnedHandle` just to pass it to `job.assign_process(...)`. `OwnedHandle::get()` returns a copy (HANDLE is Copy) and the temporary's `Drop` then closed `pi.hProcess` — so the handle stored on `BaseChild` pointed at a closed (and possibly reused) handle. That broke every BaseContainer run/stream: `WaitForSingleObject`/`GetExitCodeProcess` on a dead handle (garbage/-1 exit codes), wait/terminate potentially acting on an unrelated reused handle, and a double-close on drop. `assign_process` only borrows the handle (it calls `AssignProcessToJobObject` and neither stores nor closes it), so pass the raw `pi.hProcess` and keep sole ownership on the `BaseChild.process` field. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
Both timeout arms (run-to-completion SpawnedChild::wait_exit and the streaming AppContainerSandboxProcess::wait) terminated only the direct child. The job has no KILL_ON_JOB_CLOSE, so descendants survived holding the inherited stdout/stderr write-ends open — the capture reader threads then blocked in read_to_string forever, join() never returned, and the timeout was silently not enforced (the exact runaway case it exists for), leaving orphaned processes behind. Terminate the job object (as kill() already does) in both WAIT_TIMEOUT arms so the whole tree dies and the pipe write-ends are released. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
spawn_process_container claimed it "mirrors the run-to-completion selection", but the streaming path does not route through dispatch_with_fallback: there is no AppContainer-BFS / AppContainer-DACL fallback for streaming (that dispatcher returns a run-to-completion runner plus a DaclManager guard, neither of which fits a streaming handle). The behaviour is fail-closed and safe — no containment weakening and the BaseContainer tier applies no host DACLs — but a config that runs via spawn_sandbox_from_config on a host lacking the native BaseContainer API would fail via spawn_sandbox. Replace the inaccurate comment with the real contract and document the divergence in the crate README. No behaviour change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
SeatbeltSandboxProcess::wait()'s timeout branch called self.child.kill(), which signals only the direct child and orphans sandboxed descendants — even though kill() already does the correct setsid + killpg(SIGTERM->SIGKILL) group kill. Reuse self.kill() in the timeout branch so the whole process group dies (and the captured pipe write-ends are released), then reap the child. Adds a macOS test that backgrounds a descendant and asserts it is killed when the streaming wait times out. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
… on Drop Dropping a streaming handle while the sandboxed process was still running ran the teardown (iptables/proxy on Bubblewrap; firewall/BFS on AppContainer; proxy/sandbox state on BaseContainer) but never terminated the child. An abandoned-but-running sandbox would then lose its host enforcement (firewall mode -> unrestricted egress) and leak as a zombie/orphan. Drop now terminates and reaps the child (reusing each handle's kill()) before removing enforcement, across all three streaming backends. The post-wait Drop path stays a no-op (try_wait early-return + teardown_done guard). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
ClipboardPolicy::wire() emitted "readOnly"/"readWrite", but the config parser only accepts "none"/"read"/"write"/"all" — so every non-None clipboard policy built via build_request was silently dropped to None (fail-closed but wrong). The enum was also missing write-only and read+write(all) levels. Redefine ClipboardPolicy as None/Read/Write/All with wire strings none/read/write/all, matching the SDK type and the parser. Adds a regression test asserting all four levels survive build_request -> ExecutionRequest. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
…r works available_tools_policy resolved paths via Path::canonicalize, which on Windows yields a \\?\ verbatim prefix (and resolves symlinks). is_system_critical_path compares against C:\Windows, so the prefixed form never matched and System32 (on PATH) was not filtered out of readonly_paths — weakening the protective filter. canonicalize also hits the filesystem, unlike the SDK's lexical path.resolve. Replace canonicalize with lexical absolutization (join with cwd if relative, then collapse ./.. via components, preserving the prefix/root). This matches path.resolve, keeps the plain C:\... form, and touches no filesystem. Adds a test that a system-critical dir on PATH is filtered. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
The deadlock-avoidance rule was unstated, so every backend author re-derived it. Document on the SandboxProcess trait that stdout and stderr are independent bounded pipes and must be consumed concurrently: wait() implementors must drain not-taken streams on separate threads, and callers taking both streams must read them concurrently (taking only one is always safe). No code change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
- dispatch: support LXC on the run-to-completion path (select_runner); reject dry_run on the streaming path (no process to stream) - dispatch: restore wxc-exec diagnostic parity for the BaseContainer fallback (the "Using BaseContainer-fallback dispatcher (reason)" line and the "warning: " prefix) inside select_process_container - lib: correct the env-injection doc (replaces, vs the SDK's append), and the spawn_sandbox doc (dry_run is rejected, not ignored); drop the logger buffer from error strings (may contain config/env); remove the dead capture_output assignment on the streaming entrypoint - policy: read SystemDrive from the process environment (SDK parity); drop the logger buffer from the build_request error string - sandbox_process: clarify id() is only meaningful while the child is alive (Unix PID reuse after reap) - process_util: add SAFETY comments to the ReadFile/WriteFile blocks - README: fix proc.wait() (returns ScriptResponse, not Result) and mark the live spawning examples no_run - lxc: drop the now-unused bwrap_common dependency Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
…hrough Add the missing coverage called out in review: - run path: dry_run validates without executing; finite scriptTimeout fires; working_directory override; env override replaces a config value; is_base64 input; stderr-only capture; version-out-of-range and malformed JSON rejection - streaming: take_* second call returns None; try_wait reports the exit code after completion - sdk helpers: build_request maps allowed/blocked hosts and local-network; platform_support assertions gated for Linux and Windows - Windows ProcessContainer integration tests (AppContainer + BaseContainer capture, AppContainer finite-timeout-with-descendant, streaming stdio) as the regression guards for #1 (closed process handle) and microsoft#2 (timeout only reaching the direct child). They run a real sandbox, so they require an elevated, host-prepped Windows host and are #[ignore]d. Also split the macOS-only test imports behind cfg so the test crate compiles cleanly when cross-targeting Windows and Linux. Verified: cargo test -p mxc (macOS, 37 run + 4 ignored Windows), fmt, clippy -D warnings; cargo check -p mxc --tests cross-targeted to Windows and Linux. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a new in-process mxc Rust library plus a cross-backend streaming (“handle-based”) sandbox API, while refactoring executor binaries to share backend-selection logic and introducing an opt-in output-capture switch in ExecutionRequest.
Changes:
- Introduce
wxc_common::sandbox_processtraits (SandboxProcess,StreamingRunner) for streaming sandbox execution. - Add
ExecutionRequest::capture_outputand implement capture/streaming across Seatbelt, Bubblewrap, and Windows ProcessContainer (AppContainer/BaseContainer). - Add new
mxclibrary crate (dispatch, policy builder, platform support) and updatewxc/lxc/mxc_darwinbinaries to reuse shared dispatch.
Show a summary per file
| File | Description |
|---|---|
| src/core/wxc_common/src/sandbox_process.rs | Defines streaming handle traits and deadlock contract for concurrent drain. |
| src/core/wxc_common/src/process_util.rs | Adds Windows pipe wrapper types implementing Read/Write over Win32 handles. |
| src/core/wxc_common/src/models.rs | Adds ExecutionRequest::capture_output flag. |
| src/core/wxc_common/src/lib.rs | Exposes the new sandbox_process module. |
| src/core/wxc_common/src/config_parser.rs | Sets default capture_output: false when building requests from config. |
| src/core/wxc/src/main.rs | Uses mxc::select_runner for ProcessContainer backend selection/guarding. |
| src/core/wxc/Cargo.toml | Adds dependency on the new mxc crate. |
| src/core/mxc_darwin/src/main.rs | Uses mxc::select_runner for shared backend selection. |
| src/core/mxc_darwin/Cargo.toml | Switches to mxc dependency instead of direct Seatbelt dependency. |
| src/core/mxc/tests/streaming.rs | Adds streaming API tests (stdio, kill, wait capture). |
| src/core/mxc/tests/sdk_helpers.rs | Adds tests for policy helpers, platform support, and request builder. |
| src/core/mxc/tests/sandbox.rs | Adds end-to-end mxc library tests (config parsing, capture, timeout, Windows ignores). |
| src/core/mxc/src/policy.rs | Implements SDK-like policy discovery and SandboxPolicy→ExecutionRequest builder. |
| src/core/mxc/src/platform.rs | Implements host platform support detection (SDK-like). |
| src/core/mxc/src/lib.rs | Provides public mxc library API: spawn (capture) + spawn (streaming) + helpers. |
| src/core/mxc/src/dispatch.rs | Centralizes backend runner selection and streaming spawn for library/binaries. |
| src/core/mxc/README.md | Documents mxc usage, policy builder, and streaming API. |
| src/core/mxc/Cargo.toml | Defines the new mxc crate and OS-specific backend deps. |
| src/core/lxc/src/main.rs | Routes Bubblewrap selection through mxc::select_runner. |
| src/core/lxc/Cargo.toml | Adds mxc dependency and removes direct bwrap_common dep. |
| src/backends/seatbelt/common/src/seatbelt_runner.rs | Adds capture-mode execution, fixes cwd behavior, implements StreamingRunner + SandboxProcess. |
| src/backends/bubblewrap/common/src/bwrap_runner.rs | Refactors into spawnable child + adds StreamingRunner + SandboxProcess. |
| src/backends/appcontainer/common/src/probe.rs | Makes UiCapabilitySupport clonable for reuse in mxc::platform. |
| src/backends/appcontainer/common/src/job_object.rs | Adds UiJobObject::terminate helper for process-tree kill. |
| src/backends/appcontainer/common/src/base_container_runner.rs | Refactors to spawnable child + adds capture pipes and StreamingRunner + SandboxProcess. |
| src/backends/appcontainer/common/src/appcontainer_runner.rs | Refactors to suspended spawn + adds capture/streaming handle + shared prepare/teardown. |
| src/Cargo.toml | Adds core/mxc to workspace and workspace dependency entries. |
| .github/copilot-instructions.md | Documents new mxc crate and shared-dispatch architecture. |
Copilot's findings
- Files reviewed: 28/29 changed files
- Comments generated: 6
# Conflicts: # src/backends/appcontainer/common/src/base_container_runner.rs
LXC was never a good fit for the in-process library: the LXC backend has no non-pty capture path (attach_run always runs via run_with_pty and returns empty stdout/stderr while streaming to the host terminal), so the run path violated the crate's "capture output, no pty" contract, and it had no streaming/SandboxProcess impl at all. Rather than half-support it, the crate now rejects LXC with unsupported_containment like the other unsupported backends; Bubblewrap remains the fully-supported Linux backend (captured run + streaming + tree-kill). The lxc-exec binary keeps its native LXC support. - dispatch: remove the ContainmentBackend::Lxc arm and select_lxc - Cargo: drop the now-unused lxc_common dependency - platform_support: report only bubblewrap on Linux (no lxc-ls probe) - tests: assert Linux reports bubblewrap only Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
Resolve the six inline review comments on microsoft#524: - base_container streaming wait(): drop an untaken stdin before draining, and on timeout tree-terminate via the job object (reuse kill()) instead of only the root process, so descendants release the captured pipe write-ends and the drain threads can finish. - base_container capture wait_exit(): tree-terminate via the job object on timeout (fall back to the root only when no job exists), same deadlock fix for the non-streaming capture path. - appcontainer streaming wait(): drop an untaken stdin so interactive children observe EOF and exit reliably. - bubblewrap streaming wait(): on timeout use the process-group terminate (reuse kill()) instead of a single-process signal, matching the SandboxProcess contract. - dispatch: return the BaseContainer-fallback diagnostics (fallback notice, dispatcher warnings, selected tier) via Selection.warnings instead of writing them into a logger the library drops; select_runner no longer takes a logger and callers surface the warnings themselves. - policy normalize_lexically(): never pop past a root/prefix on "..", so "/a/../../b" stays "/b" and "C:\\.." stays "C:\\" (keeps the system-critical-path filter and dedup correct). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
The shared config parser treats an empty schema version as "unset" and
accepts it, but the SDK requires a version ("Policy version is required").
build_request now rejects an empty `SandboxPolicy.version` with
MalformedRequest for parity. Added a test.
This was the only still-valid item from the latest review pass; the other
findings (LXC dispatch contract, bwrap streaming timeout group-kill, dead
Selection.warnings field, LXC doc drift) were already resolved by the
earlier LXC removal and review-feedback commits.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
The externally-tagged derive only accepted the bare string
"builtinTestServer" for the unit variant, but the SDK wire shape is
`{ "builtinTestServer": true }`. Replace the derive with a custom
Deserialize over an untagged intermediate so all three proxy forms
(`{builtinTestServer:true}` / `{localhost:n}` / `{url:s}`) round-trip.
Serialize (proxy_to_wire) was already correct. (review O8)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
- pid_alive: use signal-0 `libc::kill(pid, 0)` instead of spawning `ps`, removing the PID-reuse race and no longer treating a probe failure as "dead" (only ESRCH means gone; EPERM etc. means alive). Adds a macOS dev-dependency on libc. - available_tools_policy_filters_nonexistent_and_dedups: assert the full resolved cwd is discovered rather than the near-tautological basename substring. (review O11) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
The capture drains were unsafe for large or binary output: - Unix `read_to_string` was uncapped (OOM on unbounded child output) and failed on the first invalid UTF-8 byte, discarding the whole stream. - Windows `read_from_pipe` stopped draining once it hit the 1 MB char cap, so a child emitting more would block forever on a full pipe (and with the library forcing capture_output + INFINITE default timeout, the call hung). It also decoded per 4 KB chunk, corrupting multibyte UTF-8 at boundaries. Add a shared `wxc_common::capture_io::read_capped_lossy` that reads to EOF (discarding past a 1 MiB byte cap so the child never stalls) and decodes once with from_utf8_lossy. Both Unix runners delegate to it; `read_from_pipe` is rewritten to mirror the behaviour over its HANDLE. (review O1, O6, O7) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
std::process::Command inherits the parent env by default, and the env was only cleared when request.env was non-empty. The mxc library emits an empty process.env by default, so untrusted sandboxed code received the embedding host process's entire environment (cloud creds, API tokens) and could read and exfiltrate it. Always env_clear() and set only a baseline PATH plus the request's vars, matching bubblewrap (--clearenv) and AppContainer. Applied to both the run and streaming paths via a shared helper; adds a regression test. (review O2) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
| // returning and `assign_process` completing could still escape the job. | ||
| // In practice the child is a shell that has not yet run the user | ||
| // command, so that window is empty; fully closing it would require a | ||
| // create-suspended path verified on a host-prepped build. |
There was a problem hiding this comment.
This (creating SUSPENDED) should work (at least from looking at the OS code). Can we at least amend the code above to create suspended and then resume after the job add (where the resume will be a no-op if CPIS is not doing the right thing).
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
The Unix streaming handles' kill() group path called group_kill() unconditionally, while the Drop comment claimed kill() was idempotent via a try_wait guard the group path did not have. After wait()/try_wait() reaped the child, a later kill() (e.g. from Drop) would SIGKILL the child's now-stale pid and pgid -- which the OS may have recycled onto an unrelated process group. Guard kill() at the top with try_wait() for both the group and single-process paths in the Seatbelt and Bubblewrap handles: a reaped std::process::Child returns its cached exit status here without a syscall, so kill()/Drop after reap is a clean no-op. Add a macOS regression test that waits a child to completion, then asserts kill() stays Ok without re-signaling. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
env_get compared env-var names case-sensitively, but Windows environment variable names are case-insensitive -- and the TS SDK this port mirrors relies on Node's case-insensitive process.env on Windows. A caller that passes a custom env (or a host whose canonical casing differs from the hard-coded lookup names) could silently drop USERPROFILE / PATH / TEMP-derived filesystem grants. Match case-insensitively on Windows; Unix names stay case-sensitive. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
The SandboxProcess::kill() trait doc and the mxc-sdk README described the
Unix kill as "graceful SIGTERM, escalating to SIGKILL after a grace
period", but the implementation (group_kill) sends an immediate SIGKILL
with no SIGTERM -- as its own doc comment states ("No graceful SIGTERM
first"). Align the docs with the implementation.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
The README table and lib.rs called the Windows backend "AppContainer + BaseContainer fallback" and claimed the crate reuses appcontainer_common::dispatcher::dispatch_with_fallback. The streaming path does neither: dispatch.rs selects BaseContainer (experimental / newer-schema) or AppContainer directly and fails closed with no BFS/DACL fallback, as the README's own "Windows note" already states. Drop the "fallback" labels and the false dispatch_with_fallback reuse claim. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
build_request hard-coded targets_host_filtering_backend = cfg!(linux), so it rejected allowedHosts/blockedHosts without allowOutbound on macOS, diverging from the TS SDK. The SDK's resolvesToHostFilteringBackend (sdk/src/sandbox.ts) treats Seatbelt (and process-on-darwin) as a host-filtering backend and accepts those rules without allowOutbound; only Windows ProcessContainer requires it. Match the SDK: treat Linux and macOS as host-filtering backends so the two ports stay reconciled. Seatbelt still can't actually enforce hostnames (it degrades to allow-all outbound), but cross-implementation consistency matters more than being stricter here. Windows behavior is unchanged. Updated the macOS unit test and the sdk_helpers integration test that previously asserted the macOS rejection. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
…ment The v1 policy spec said allowedHosts/blockedHosts "Error if allowOutbound is not set", but host-filtering backends (Linux, macOS) accept them without allowOutbound; only Windows ProcessContainer requires it (matching the SDK and the mxc-sdk port after fe7dd08). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
set_script takes the raw command string and maps it to script_code, exactly like the SDK's spawnSandbox(script) / process.commandLine -- the same backend then wraps it (/bin/sh -c on Unix, command line on Windows). The platform-dependent semantics is the deliberate, SDK-consistent contract, not a divergence; this documents that parity (review CR-22). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
wait_with_timeout polled every fixed 50ms, so a quick child paid up to a full 50ms of exit-detection latency on every short run. Start the poll at 1ms and double up to a 50ms cap: a fast child is now detected within ~a millisecond, while a long run still settles to the cheap 50ms cadence. Timeout semantics are unchanged -- each sleep is still clamped to the time remaining, so the deadline fires exactly as before. Adds unit tests for prompt quick-exit detection, deadline firing, and the no-timeout path (review CR-08). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
Broaden the kill-after-reap coverage beyond the single wait-then-kill case: add streaming_kill_after_try_wait_reap_is_a_noop (the exact race -- try_wait() reaps the child, a later kill() must not signal the recycled pid/pgid) and streaming_double_kill_before_wait_completes_promptly (kill() twice while running stays Ok and wait() then completes promptly, not hanging). Exercises the kill() reaped-guard from 019bb04 (review CR-20). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
Sandbox::wait returned io::Result<i32> and encoded a timeout as an io::Error with ErrorKind::TimedOut, so callers could not cleanly tell a timeout from a genuine OS/wait failure. Introduce a public WaitOutcome enum (Exited(i32) | TimedOut) and return io::Result<WaitOutcome>: an ordinary exit and a timeout are both Ok outcomes, and Err is now reserved for an actual OS/wait failure (review CR-23). BREAKING CHANGE: Sandbox::wait now returns io::Result<WaitOutcome> instead of io::Result<i32>; match on WaitOutcome::Exited(code) / TimedOut. The inner wxc_common SandboxProcess::wait contract is unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
|
Superseded by a 3-PR stack that splits this change into reviewable, compile-ordered layers (per the proportionality feedback in review):
Review in order (#554 → #555 → #556); #555/#556 are stacked, so their diffs shrink to just their own layer once the base PR merges. Closing this in favor of the stack. |
|
(will keep it open for now) |
…fely
Taking both take_stdout() and take_stderr() and reading them sequentially
can deadlock an output-heavy child (one pipe fills while the reader is
blocked on the other). Add wait_with_output(): it consumes the handle,
drains stdout and stderr concurrently on separate threads, and returns
Output { outcome, stdout, stderr } -- the safe, convenient default,
mirroring std::process::Child::wait_with_output. Review CR-24.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
…ent/host The streaming dispatch in `dispatch.rs` only had direct tests for the `dry_run` and macOS `guiAccess` rejection branches. Add two more so the remaining guardrails are exercised in CI: - `streaming_rejects_unsupported_containment`: drives the internal model with `containment = Lxc` and asserts `UnsupportedContainment` plus the backend name in the message. - `host_support_ok_on_supported_platforms`: cfg-gated to Windows / Linux / macOS, guards against the `ensure_host_supported` cfg list dropping a supported platform. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
…set_env
set_env(Vec<String>) made the caller hand-format raw KEY=VALUE strings,
which diverged from the SDK's env channel -- injectEnvIntoConfig
(sdk/src/sandbox.ts) takes a structured { key: value } map and joins it to
the KEY=VALUE wire form internally.
Accept (key, value) pairs instead and do the formatting in the setter, so
the crate matches the SDK surface and callers can't forget the '='. The wire
representation (Vec<String> of KEY=VALUE) is unchanged, and iteration order
is preserved so a later duplicate key still wins downstream -- same as the
SDK. No eager validation is added: the SDK doesn't validate either, and
structured input already removes the malformed-entry foot-gun.
Adds a unit test asserting the pair-to-KEY=VALUE ordered mapping.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
The self-pipe used to wake `InterruptibleReader`'s `poll` was created with `libc::pipe`, which does not set close-on-exec. Both wake fds would then leak into any process the thread later forks+execs (e.g. another sandbox child) and keep the wake pipe alive unexpectedly. Mark both ends `FD_CLOEXEC` after `pipe()`, mirroring the fixup `mxc_pty` already does for PTY fds. The data pipe is unaffected -- Rust already sets CLOEXEC on `Child` stdio. Addresses a Copilot review comment on microsoft#555. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
`group_kill` discarded both `kill(2)` results and always returned `Ok(())`, so `SandboxProcess::kill()` reported success even when it never signalled the process group. Route both signals through a `send_sigkill` helper that returns the error instead, treating only the "already gone" outcomes as success: `ESRCH`, and `EPERM` -- which on macOS a redundant kill of an exited-but-unreaped child's group reports in place of `ESRCH` (observed via the double-kill test). The caller guards with `try_wait()` first, so the pid/pgid can't be recycled and `EPERM` here can only be that benign race, never a real permission failure. Addresses a Copilot review comment on microsoft#556. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
… file `tests/streaming.rs` is `#![cfg(target_os = "macos")]`, so the `#[cfg(target_os = "windows")]` ProcessContainer streaming test it contained could never compile -- the intended Windows coverage was silently missing. Move that test into its own `tests/streaming_processcontainer.rs`, gated `#![cfg(target_os = "windows")]`, so it actually builds on Windows. Addresses a Copilot review comment on microsoft#556. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
📖 Description
Adds
mxc-sdk(library namemxc_sdk), an importable Rust library crate for starting sandboxes in-process, without a pty. Callers can stream sandboxed processes without shelling out to the executor binaries or depending on the@microsoft/mxc-sdkTypeScript module.Supported backends: Bubblewrap (Linux), Seatbelt (macOS), and Windows ProcessContainer (AppContainer + BaseContainer fallback).
The crate's public surface is decoupled from the internal execution model:
build_request/spawn_sandboxtake and return the crate's own types (SandboxRequest,Sandbox, andError/ErrorCode) rather than the internalwxc_commontypes they map to, so the internals can be refactored without breaking callers.To share one execution path between the new library (no-pty, caller-owns-stdio) and the existing executor binaries (run-to-completion), the three library backends were unified on a single
SandboxBackendtrait. That part is a refactor of the shared backend layer — not purely additive; see "Unified backend execution model" below.What's included
spawn_sandboxtakes aSandboxRequest(built bybuild_request, see below) and returns aSandboxhandle for persistent bidirectional stdio (take_stdin/take_stdout/take_stderr),try_wait(),id(),kill(), andwait(). No pty is allocated;wait()drains and discards any untaken stdout/stderr (so the child can't block on a full pipe) and returns the exit code asio::Result<i32>(ErrorKind::TimedOuton timeout). Streaming is implemented for Seatbelt, Bubblewrap, and Windows ProcessContainer. The pipe-deadlock contract (drain stdout/stderr concurrently) is documented on the handle.SandboxBackendtrait (validate+spawn(request, logger, StdioMode) -> Box<dyn SandboxProcess>+ adiagnose_exithook that preserves AppContainer launch diagnostics).StdioMode::Pipesis what the library uses (the caller takes stdin/stdout/stderr);StdioMode::Inheritis what the executor binaries use (the child inherits the binary's own stdio, preserving the TTY when launched under a pty). A single genericRunner<B>adapter inwxc_common::sandbox_processprovides the run-to-completionScriptRunnerthe binaries dispatch on, by callingspawn(StdioMode::Inherit)thenwait()— so there is no longer any per-backend run-to-completion logic and the oldStreamingRunnertrait is removed. Thewxc-exec/lxc-exec/mxc-exec-macbinaries wrap their backends inRunnerand otherwise behave identically (LXC keeps its own native pty path).kill()and timeouts take down the whole tree (Unix process-group signal with an unconditionalSIGKILLsweep so a racily-forked descendant can't survive; Windows job-object terminate), so descendants can't keep inherited pipe write-ends open and hangwait(). A timeout surfaces uniformly aswait()returningio::ErrorKind::TimedOutacross all backends.env_clear()s, matching Bubblewrap's--clearenvand the AppContainer clean block), so an embedding host app's secrets aren't exposed to untrusted code.mxc_sdk::policy(SandboxPolicy,build_request, plusavailable_tools_policy/user_profile_policy/temporary_files_policydiscovery) andmxc_sdk::platform_support, so callers no longer need the TypeScript SDK for policy construction or platform discovery.build_requestmaps aSandboxPolicyto aSandboxRequest; set the command withset_script(...), and optionallyset_working_directory(...)/set_env(...), before spawning. On macOS the request also exposes the Seatbelt knobs a host may need —seatbelt_extra_mach_lookups()/set_seatbelt_extra_mach_lookups(...)/set_seatbelt_keychain_access(...).Limitations
unsupported_containment. Thelxc-execbinary keeps its native LXC support.dry_runis rejected by the streaming spawn (there is no process to stream).docs/host-prep.md) and are#[ignore]d.🔭 Follow-ups (separate PRs)
mxcengine crate — the crate is namedmxc-sdkbecause a follow-up will introduce a lower-levelmxccrate that bothmxc-sdkand the existing executor binaries call into. Until then, the in-crate backend dispatch (dispatch.rs/platform.rs) inmxc-sdkis provisional and moves into that engine crate.The unified
SandboxBackend/Runnermodel can be extended to the remaining backends incrementally; the cost is set by each backend's I/O model, not by trait plumbing:SandboxProcessimpl that would also bring the backend into themxc-sdklibrary (Linux-via-LXC; capable Windows hosts).ScriptRunneruntil a concrete need arises.platform_supportUI / isolation reporting — the SDKgetPlatformSupportparity fields (isolation tier, tier-degradation warnings, and the Windows UI-capability probe results) were trimmed fromPlatformSupportsince no consumer reads them yet. Restore them (re-wiring the AppContainer probe) when a caller needs richer host-capability discovery.🔗 References
Follow-up to the TypeScript SDK's one-shot spawn surface; this is its in-process Rust counterpart.
🔍 Validation
cargo test -p mxc-sdkon macOS — 41 tests pass (12 lib-unit,sandbox7,sdk_helpers10,streaming11, plus 1 doc-test); the Windows-host ProcessContainer integration tests are#[ignore]d (they spawn a real sandbox and need an elevated, host-prepped Windows host).cargo fmt --all -- --checkandcargo clippy --all-targets -- -D warningsclean (macOS + Linux + Windows cross-check).cargo checkagainstx86_64-pc-windows-msvcandaarch64-unknown-linux-gnu, plus the per-backend crates on each target.guiAccessGUI path,launchMethod: open(Terminal), and timeout tree-kill.0.7.0-alpha) streams stdout and propagates a non-zero exit code (exit /b 7→EXIT=7); a 2000 msscriptTimeoutfires and tree-kills a process tree blocked on a modal dialog (beforestreamed,afternever ran). The direct-AppContainer/BFS path needsbfscfg.exe, absent on the test host, so it was covered by compile checks only.io::ErrorKind::TimedOutfromwait()), theguiAccessstreaming rejection,dry_runrejection, version-out-of-range / empty-version rejection, thebuild_requestfilesystem/network/clipboard/proxy wire mapping, and the macOS Seatbeltextra_mach_lookups/keychain_accessround-trip.✅ Checklist
📋 Issue Type