Make the Windows process-container test harness capability-driven #547

Open
MGudgin wants to merge 1 commit into main from user/gudge/win-container-tests-squashed

@MGudgin MGudgin commented Jun 22, 2026

Member

📖 Description

Reworks the Windows process-container test harness to be capability-driven:
it runs the correct checks on any Windows build (Nickel 23H2 / Germanium
24H2+25H2) by deriving the expected isolation tier and per-limit expectations
from wxc-exec --probe at runtime, instead of branching on the OS version.
Renames the harness accordingly, adds a working
JOB_OBJECT_UILIMIT_INJECTION probe, and makes the result output unambiguous.

Highlights:

  • Capability-driven harness + rename. Tier and per-limit expectations come
    from --probe; renamed Win25H2Safe-Tests.ps1 → WinProcessContainer-Tests.ps1
    and updated stale references (T3-Workloads.ps1, wxc main.rs,
    filesystem_dacl.rs).
  • BaseContainer (Tier 1) support. Tier-aware telemetry/assertions and a new
    probes.baseContainerSupportsDenyPaths fact (SANDBOX_CAP_DENY_PATHS) so
    deniedPaths tests auto-skip where the BaseContainer tier can't yet enforce
    them and auto-enable when the capability ships. Fixed a T1 New-Config
    $null-array-collapse crash under StrictMode.
  • HANDLES probe switched from GetWindowTextW to GetWindowThreadProcessId
    so it reads window-manager state directly and isn't confounded by UIPI or the
    target pumping messages.
  • INJECTION probe (JOB_OBJECT_UILIMIT_INJECTION, build 26100+) now creates
    and foregrounds its own top-level window before SendInput. The kernel's
    DoInputCheck evaluates the foreground-accessible check before the injection
    job-limit check and silently skips the input (returning success) when the
    foreground belongs to another inaccessible process, so a contained process on
    an interactive desktop would otherwise misread enforcement as "not enforced".
    Owning the foreground (the kernel permits injecting into one's own window)
    makes the limit actually evaluate. New INJECTION=INCONCLUSIVE outcome →
    harness SKIP when the foreground can't be owned, rather than a false verdict.
  • Result reporting. Record-Result gains skip/warn statuses
    ([SKIP]/[WARN], non-failing, yellow) so a not-applicable or not-enforced
    check is never shown as a green [PASS]; the summary and JSON carry
    skipped/warnings counts. Probe PASS/FAIL tokens are translated to semantic
    verbs in output (blocked/allowed for UI/atom probes, allowed/denied for the
    filesystem matrix) so a green [PASS] line never contains the word FAIL.
  • Dropped stale "T3 forced" labels from the Phase 4b/4c headers (those phases run
    on the host's baseline tier).
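
To illustrate the capability-driven idea in the first two highlights, here is a minimal Rust sketch of deciding test applicability from probe facts instead of the OS build. It is not the harness code (the harness is PowerShell): the `ProbeFacts` struct, the `Applicability` enum, the function name, and the `"BaseContainer"` tier string are all assumptions made for illustration; only the `baseContainerSupportsDenyPaths` fact itself comes from this PR.

```rust
/// Hypothetical subset of the facts reported by `wxc-exec --probe`.
/// The field names mirror facts mentioned in this PR; the struct is illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct ProbeFacts {
    /// Isolation tier reported by the probe (string value assumed).
    pub tier: String,
    /// Mirrors probes.baseContainerSupportsDenyPaths (SANDBOX_CAP_DENY_PATHS).
    pub base_container_supports_deny_paths: bool,
}

/// Whether a group of tests should run or be reported as skipped.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Applicability {
    Run,
    Skip,
}

/// Decide whether the deniedPaths tests apply, using probe facts only.
/// Assumption: only the BaseContainer tier depends on the deny-paths capability;
/// every other tier is taken to enforce deniedPaths already.
pub fn denied_paths_applicability(facts: &ProbeFacts) -> Applicability {
    let is_base_container = facts.tier.eq_ignore_ascii_case("BaseContainer");
    if is_base_container && !facts.base_container_supports_deny_paths {
        Applicability::Skip
    } else {
        Applicability::Run
    }
}
```

Because the decision reads only the probe facts, the same check would start returning `Run` on a BaseContainer host as soon as the capability is reported, with no change to the test code.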

Note: This PR targets user/gudge/job_limit_test_fixes (the head of
#544). It should be rebased onto main and retargeted once #544 merges.

🔗 References

🔍 Validation

  • cargo fmt --all -- --check, cargo check --workspace --all-targets, and
    cargo clippy --workspace --all-targets -- -D warnings: all clean.
  • cargo test: appcontainer_common 126 passed, wxc_common 338 passed, 0
    failures.
  • Direct verification on build 26200: SendInput from a process self-assigned
    to a job with JOB_OBJECT_UILIMIT_INJECTION is blocked
    (gle=ERROR_ACCESS_DENIED), confirming the OS enforces the limit; the
    reworked probe, run contained via wxc-exec, reports INJECTION=PASS
    ("owns foreground", injected 0/1 gle=5) where the old probe falsely reported
    not-enforced.
  • Harness parse-checked via [Parser]::ParseFile; unit-tested the INJECTION
    verdict gate (enforced / not-enforced / inconclusive / build-gated) and the
    verdict-translation helpers (UI / filesystem / atom token sets, plus DIAG and
    <missing> pass-through).
  • Validated end-to-end on interactive 23H2 (22631), 25H2 (26200), and 25H2+
    (26634) machines — INJECTION correctly skips when the foreground is stolen
    mid-run and passes when left alone. Full Phase 4b/4c not run on the shared dev
    host (EXITWINDOWS carries a logoff risk).
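
The four cases of the INJECTION verdict gate mentioned above (enforced / not-enforced / inconclusive / build-gated) can be sketched as a small pure function. This is a hedged Rust illustration, not the harness implementation: the enum and function names are hypothetical, and mapping "not enforced" to WARN is an assumption inferred from the result-reporting highlight. The 26100 threshold and the INCONCLUSIVE → SKIP mapping are taken from the description.

```rust
/// Token reported by the probe for the INJECTION check.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProbeOutcome {
    /// INJECTION=PASS: SendInput was blocked while the probe owned the foreground.
    Pass,
    /// INJECTION=FAIL: the input was injected, so the limit was not enforced.
    Fail,
    /// INJECTION=INCONCLUSIVE: the probe could not own the foreground.
    Inconclusive,
}

/// Status recorded by the harness for the check.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Pass,
    Warn,
    Skip,
}

/// First build on which the limit is expected, per the PR text ("build 26100+").
pub const MIN_INJECTION_BUILD: u32 = 26100;

/// Map (OS build, probe outcome) to a harness status.
pub fn injection_status(build: u32, outcome: ProbeOutcome) -> Status {
    if build < MIN_INJECTION_BUILD {
        // Build-gated: the limit does not exist here, so the check is not applicable.
        return Status::Skip;
    }
    match outcome {
        ProbeOutcome::Pass => Status::Pass,
        // Assumption: a not-enforced limit is surfaced as a non-failing WARN.
        ProbeOutcome::Fail => Status::Warn,
        // A probe that could not own the foreground yields no verdict at all.
        ProbeOutcome::Inconclusive => Status::Skip,
    }
}
```

Keeping the gate free of I/O is what makes it unit-testable without a desktop session: each of the four cases is a plain function call.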

@MGudgin MGudgin requested a review from a team as a code owner June 22, 2026 13:45
Comment thread src/backends/appcontainer/common/src/base_container_runner.rs
Comment thread src/backends/appcontainer/common/src/base_container_runner.rs Outdated
Comment thread src/testing/wxc_ui_probe/src/main.rs
Comment thread src/testing/wxc_ui_probe/src/main.rs
jsidewhite previously approved these changes Jun 23, 2026

@jsidewhite jsidewhite left a comment

Member

:shipit:

Base automatically changed from user/gudge/job_limit_test_fixes to main June 24, 2026 02:58
@MGudgin MGudgin dismissed jsidewhite’s stale review June 24, 2026 02:58

The base branch was changed.

Copilot AI review requested due to automatic review settings June 24, 2026 03:36
@MGudgin MGudgin force-pushed the user/gudge/win-container-tests-squashed branch from 8f13a24 to 54a50fc Compare June 24, 2026 03:36

Copilot AI left a comment

Contributor

Pull request overview

Reworks the Windows process-container test harness to be capability-driven by deriving tier/feature expectations from wxc-exec --probe at runtime, and extends the probing surface to cover BaseContainer deniedPaths capability plus a working UI INJECTION probe.

Changes:

  • Renames and refactors the Windows harness to compute expected tier/capabilities once (from --probe) and adapt assertions/phase applicability accordingly.
  • Updates wxc-ui-probe to use GetWindowThreadProcessId for HANDLES and adds the INJECTION probe that creates/foregrounds its own window before calling SendInput.
  • Extends wxc-exec --probe facts with baseContainerSupportsDenyPaths (derived from Experimental_QuerySandboxSupport) and updates call sites/comments.
| File | Description |
| --- | --- |
| tests/scripts/WinProcessContainer-Tests.ps1 | Capability-driven harness refactor, tier-derived expectations, SKIP/WARN reporting, and INJECTION handling. |
| tests/scripts/T3-Workloads.ps1 | Updates harness reference after rename. |
| src/testing/wxc_ui_probe/src/main.rs | HANDLES probe update + new INJECTION probe implementation. |
| src/core/wxc/src/main.rs | Updates comment reference to renamed harness. |
| src/core/wxc_common/src/filesystem_dacl.rs | Updates comment reference to renamed harness. |
| src/backends/appcontainer/common/src/probe.rs | Adds base_container_supports_deny_paths to --probe facts. |
| src/backends/appcontainer/common/src/base_container_runner.rs | Adds SANDBOX_CAP_DENY_PATHS capability decode and probe helper. |

Copilot's findings

Comments suppressed due to low confidence (1)

tests/scripts/WinProcessContainer-Tests.ps1:469

  • Get-HostCapabilities casts $p.tier to [string] before validating it. If the probe returns a detector error (tier is null), $tier becomes "" and later helpers like Test-SelectedTier can match trivially (regex escape of empty string), producing false PASS results. Fail fast when --probe omits tier (or returns an empty tier) and include the probe error/warnings in the exception so the harness can’t silently proceed with an unknown tier.
  • Files reviewed: 7/7 changed files
  • Comments generated: 1
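
The suppressed finding above describes an empty tier string matching everything. The same pitfall exists in most languages; the Rust sketch below shows it with `str::contains` (an empty pattern matches any string) and a fail-fast validator of the kind the finding asks for. The function name `validate_tier`, its signature, and the error wording are hypothetical; the real harness is PowerShell.

```rust
/// Demonstrates the pitfall: an empty pattern matches any input,
/// so an unknown tier would "match" every expected tier.
pub fn tier_matches(selected: &str, reported: &str) -> bool {
    selected.contains(reported)
}

/// Fail fast when the probe omits the tier or returns an empty one,
/// carrying the probe's own error text into the failure message.
pub fn validate_tier(tier: Option<&str>, probe_error: Option<&str>) -> Result<String, String> {
    match tier.map(str::trim) {
        Some(t) if !t.is_empty() => Ok(t.to_string()),
        _ => Err(format!(
            "wxc-exec --probe returned no tier (probe error: {})",
            probe_error.unwrap_or("<none>")
        )),
    }
}
```

With validation in front, `tier_matches` is never called with an empty `reported` value, so a detector error surfaces as a hard failure rather than a trivially true match.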

Comment thread src/testing/wxc_ui_probe/src/main.rs
This PR reworks the Windows process-container test harness to run the correct
checks on any Windows build (Nickel 23H2 / Germanium 24H2+25H2) instead of
branching on OS version, renames it accordingly, and adds a working
JOB_OBJECT_UILIMIT_INJECTION probe plus clearer result reporting.

Details

* Capability-driven harness: derive the expected isolation tier and per-limit
  expectations from wxc-exec --probe at runtime rather than from the OS build
  number. Renamed Win25H2Safe-Tests.ps1 -> WinProcessContainer-Tests.ps1 and
  updated the stale doc/comment references (T3-Workloads.ps1, wxc main.rs,
  filesystem_dacl.rs).
* BaseContainer (Tier 1) support: tier-aware telemetry/assertions, a new
  probes.baseContainerSupportsDenyPaths fact (SANDBOX_CAP_DENY_PATHS) so
  deniedPaths tests auto-skip where the BaseContainer tier can't yet enforce
  them and auto-enable when the capability ships. Fixed a T1 New-Config
  $null-array-collapse crash under StrictMode.
* HANDLES probe switched from GetWindowTextW to GetWindowThreadProcessId so the
  check reads window-manager state directly and is not confounded by UIPI or
  the target pumping messages.
* INJECTION probe (JOB_OBJECT_UILIMIT_INJECTION, build 26100+): the probe now
  creates and foregrounds its OWN top-level window before SendInput. This is
  required because the kernel's DoInputCheck evaluates the foreground-accessible
  check before the injection job-limit check and silently skips the input
  (returning success) when the foreground belongs to another inaccessible
  process — so a contained process on an interactive desktop would otherwise
  read enforcement as "not enforced". Owning the foreground (the kernel allows
  injecting into one's own window) makes the limit actually evaluate. New
  INJECTION=INCONCLUSIVE outcome when the foreground can't be owned, mapped to a
  harness SKIP rather than a false verdict.
* Result reporting: Record-Result gains skip/warn statuses ([SKIP]/[WARN], both
  non-failing, yellow) so a not-applicable or not-enforced check is never shown
  as a green [PASS]; the summary and JSON carry skipped/warnings counts. Probe
  PASS/FAIL tokens are translated to semantic verbs in output (blocked/allowed
  for UI/atom probes, allowed/denied for the filesystem matrix) so a green
  [PASS] line never contains the word FAIL.
* Dropped stale "T3 forced" labels from the Phase 4b/4c headers (those phases
  run on the host's baseline tier, not a forced T3).
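
The verdict translation described under "Result reporting" can be sketched as a lookup keyed on probe kind and token. This Rust version is illustrative only: `ProbeKind` and `translate_verdict` are hypothetical names, and the direction of each mapping (PASS → blocked for UI/atom probes, PASS → allowed for the filesystem matrix) is an assumption read off the order in which the verbs are listed. Unrecognized tokens such as DIAG or `<missing>` pass through unchanged, as the Tests section says.

```rust
/// Which family of probe produced the token.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProbeKind {
    Ui,
    Atom,
    Filesystem,
}

/// Translate a raw PASS/FAIL probe token into a semantic verb for display.
/// Anything that is not exactly "PASS" or "FAIL" is returned unchanged.
pub fn translate_verdict(kind: ProbeKind, token: &str) -> String {
    match (kind, token) {
        // Assumed direction: a passing UI/atom probe means the action was blocked.
        (ProbeKind::Ui | ProbeKind::Atom, "PASS") => "blocked".to_string(),
        (ProbeKind::Ui | ProbeKind::Atom, "FAIL") => "allowed".to_string(),
        // Assumed direction: a passing filesystem probe means access was allowed.
        (ProbeKind::Filesystem, "PASS") => "allowed".to_string(),
        (ProbeKind::Filesystem, "FAIL") => "denied".to_string(),
        // DIAG, <missing>, and any other token pass through untouched.
        _ => token.to_string(),
    }
}
```

Translating at the output boundary keeps the raw tokens available for comparisons while ensuring a green [PASS] line never prints the word FAIL.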

Tests

* cargo build/clippy (-D warnings)/fmt --check on wxc_ui_probe and wxc: clean
  (debug + release).
* Direct verification on build 26200: SendInput from a process self-assigned to
  a job with JOB_OBJECT_UILIMIT_INJECTION is blocked (gle=ERROR_ACCESS_DENIED),
  confirming the OS enforces the limit; the reworked probe, run contained via
  wxc-exec, reports INJECTION=PASS ("owns foreground", injected 0/1 gle=5) where
  the old probe falsely reported not-enforced.
* Harness parse-checked via [Parser]::ParseFile; unit-tested the INJECTION
  verdict gate (enforced / not-enforced / inconclusive / build-gated) and the
  verdict-translation helpers (UI / filesystem / atom token sets, plus DIAG and
  <missing> pass-through).
* Full Phase 4b/4c not run on the shared dev host (EXITWINDOWS carries a logoff
  risk); validated end-to-end by the author on interactive 23H2 (22631), 25H2
  (26200) and 25H2+ (26634) machines — INJECTION correctly skips when the
  foreground is stolen mid-run and passes when left alone.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@MGudgin MGudgin force-pushed the user/gudge/win-container-tests-squashed branch from 54a50fc to 89c104d Compare June 24, 2026 03:44

@jsidewhite jsidewhite left a comment

Member

:shipit:
