Skip to content

fix(build): honor --ep; require explicit device+ep in resolve_device; support cross-compile#947

Merged
xieofxie merged 5 commits into
mainfrom
hualxie/build_forward_ep
Jun 24, 2026
Merged

fix(build): honor --ep; require explicit device+ep in resolve_device; support cross-compile#947
xieofxie merged 5 commits into
mainfrom
hualxie/build_forward_ep

Conversation

@xieofxie

@xieofxie xieofxie commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Summary

Several related EP/device-handling fixes in winml build, found after winml build --ep openvino --device npu --compile silently ignored --ep.

1. winml build now honors --ep

The command passed only device/precision to generate_build_config — never ep — so the requested EP was silently dropped (and, with an explicit absent --device, it failed for the wrong reason). Now ep is forwarded.

2. resolve_device requires explicit device + ep

Both are now required (no device="auto" default; ep was already required), so an omission is a call error instead of a silent ep=None / device="auto". Several callers relied on the old defaults (latent TypeErrors + 8 failing test_device.py tests) — fixed to pass both explicitly:

  • analyzer auto-device resolution
  • build's auto-EP (ep is None) block
  • eval device resolution
  • compile-stage device resolution

This closes the class of bug behind #931.

3. Cross-compile support via resolve_check_device_ep

winml build now resolves through resolve_check_device_ep, which validates an explicit device+ep statically (no availability cross-check). A build can target a device that isn't on the build machine; the outcome depends on whether compilation is actually needed:

Scenario Result
--no-compile, target device absent ✅ builds — portable, analyzed/quantized ONNX (cross-compile)
--compile, EP has no EPContext (e.g. MIGraphX, DML) ✅ builds — compile is a no-op
--compile, EP has EPContext, device absent (e.g. OpenVINO+NPU) ⚠️ fails at the compile stage (late) — tracked in #950

Verification

Empirical builds of microsoft/resnet-50 on a machine without an NPU confirm all three rows (exit 0 / 0 / 1, failing only at the compile stage for row 3). Unit suites green: tests/unit/{commands,sysinfo,compiler,analyze,config} (2500+ passed); test_build.py + test_device.py updated for the new resolver contract.

Follow-up

Related: #931 (fixed in #941).

…ilability

`winml build` dropped --ep when auto-generating a config — it passed only
device/precision to generate_build_config, so the requested EP never reached
config generation. With an explicit --device that isn't present, config gen then
resolved the device with ep=None and failed immediately with a device-availability
error, silently ignoring --ep
(e.g. `winml build -m microsoft/resnet-50 --ep openvino --device npu --compile`).

- Forward `ep` into generate_build_config so the EP shapes the generated config.
- Add an early, compile-gated EP-availability check after config validation: a
  compile build physically instantiates the EP, so an unavailable EP/device now
  fails fast (before export) instead of deep in the compile stage. A no-compile
  build only produces a portable, analyzed ONNX and may target a device absent on
  the build machine (cross-compile), so it is allowed to proceed.
- Stop the optimize-stage device resolution from availability-failing: it only
  needs a concrete device name for the analyzer's rule-data lookup, so it no
  longer passes ep or rejects an absent explicit device.

Verified end-to-end: `--compile --device npu --ep openvino` (no NPU) now fails
before export with the EP respected; `--no-compile` proceeds to export the
portable model.

Related to the EP-threading family (#931).
@xieofxie xieofxie requested a review from a team as a code owner June 23, 2026 08:34
@xieofxie xieofxie marked this pull request as draft June 23, 2026 09:18
resolve_device(device, *, ep) now takes both device and ep as required args, so
an omission is a call error instead of a silent ep=None / device="auto". Several
callers still relied on the old defaults (latent TypeErrors) — pass ep=None
explicitly where no EP filter is intended:

- analyzer: auto-device resolution for rule-file selection
- build: the ep-is-None auto-EP resolution
- eval: device resolution
- compile stage: device resolution (EP comes from ep_config)
- test_device.py: resolve_device call sites

Also keeps the build command forwarding --ep to generate_build_config, and drops
the compile-availability gate / optimize-stage tweak from earlier on this branch
— requiring explicit device+ep is the cleaner, centralized enforcement.
@xieofxie xieofxie changed the title fix(build): forward --ep to config generation; gate compile builds on EP availability fix(build): forward --ep; require explicit device + ep in resolve_device Jun 23, 2026
build resolved device/EP via resolve_device(), which availability-cross-checks:
an explicit device absent on the machine (e.g. --device npu on a non-NPU box)
raised even for --no-compile — which only produces a portable, analyzed ONNX and
never needs the device present.

Switch both call sites to resolve_check_device_ep(): with an explicit device+ep
it validates statically (no availability probe), so a cross-target --no-compile
build now runs export -> optimize -> analyze -> quantize to completion. The
auto-select (ep is None) block still availability-checks (resolve_check_device_ep
routes to resolve_device when ep is None) and now gets device + available EPs
from a single call.

Verified: `winml build -m microsoft/resnet-50 --ep openvino --device npu
--no-compile` (no NPU present) completes end-to-end.
@xieofxie xieofxie changed the title fix(build): forward --ep; require explicit device + ep in resolve_device fix(build): honor --ep; require explicit device+ep in resolve_device; support cross-compile Jun 24, 2026
Comment thread src/winml/modelkit/compiler/stages/compile.py Outdated
Comment thread src/winml/modelkit/commands/eval.py Outdated
@xieofxie xieofxie marked this pull request as ready for review June 24, 2026 06:28

@timenick timenick left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid, well-scoped fix. The root cause (build dropping --ep into generate_build_config) is addressed and the surrounding cleanup is consistent: making device + ep required on resolve_device turns the silent ep=None / device="auto" defaults into call errors, and every caller that relied on those defaults is updated to pass both explicitly. I checked all resolve_device / resolve_check_device_ep call sites at this head — they all forward ep, so there’s no latent TypeError from a missed caller. The auto-EP block’s available_eps[0] is safe since resolve_device only returns a device with ≥ 1 available EP (else it raises), and routing build through resolve_check_device_ep for the static device+ep validation is the right call for the cross-compile scenarios. Nice regression test for the --ep forwarding. LGTM.

🤖 Generated with GitHub Copilot CLI

@xieofxie xieofxie merged commit 6c7bef2 into main Jun 24, 2026
9 checks passed
@xieofxie xieofxie deleted the hualxie/build_forward_ep branch June 24, 2026 06:37

@DingmaomaoBJTU DingmaomaoBJTU left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this is a well-motivated three-part fix: (1) forward --ep into generate_build_config, (2) make both params required in resolve_device/resolve_check_device_ep to turn silent bugs into call errors, (3) route the optimize stage through resolve_check_device_ep so cross-target --no-compile builds work. All 138 tests pass.

General: In the static path of resolve_check_device_ep (explicit device + explicit ep), the second tuple element is list(EP_SUPPORTED_DEVICES[ep_name]) -- the EP's statically declared supported devices. In the dynamic path it returns runtime-filtered devices. All callers use _ so no active bug, but future callers will see inconsistent semantics. Consider documenting or unifying the two paths.


def resolve_check_device_ep(
*, device: str = "auto", ep: EPNameOrAlias | None = None
*, device: str, ep: EPNameOrAlias | None

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring says 'Ideal for commands that do not need the device + ep actually exists on the system.' But static validation (no availability probe) only applies when BOTH device is explicit (not auto) AND ep is non-None. When either is absent the function delegates to resolve_device which does a full availability cross-check. A caller passing --device npu without --ep on a machine without NPU will still get a ValueError. The docstring should clarify this for users relying on this for cross-compile.


try:
resolved_device, _ = _resolve_device(device=device)
resolved_device, _, available_eps = resolve_check_device_ep(device=device, ep=ep)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit: inside the 'if ep is None:' block, passing ep=ep (which is provably None here) is technically correct but slightly opaque. Writing ep=None explicitly would remove any ambiguity.

"""--ep forwarding into config generation + the compile EP-availability gate."""

def _base_args(self, cfg: str, tmp_path: Path) -> list[str]:
return ["-c", cfg, "-m", "microsoft/resnet-50", "-o", str(tmp_path / "out")]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_base_args is never called by any test in this class -- dead helper. If intended for future tests add a TODO comment; otherwise remove it. The args it builds also mix -c and -m which would not match the auto-config path the class is testing.

["-m", "microsoft/resnet-50", "--ep", "openvino", "-o", str(tmp_path / "out")]
)
assert result.exit_code == 0, result.output
assert mock_gen.call_args.kwargs["ep"] == "openvino"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing regression test for the cross-compile static path: the PR's headline feature (--device npu --ep openvino --no-compile completing on a machine without NPU) is verified empirically but has no unit test. A test that mocks resolve_check_device_ep to return a static result and confirms the optimize stage proceeds would guard against future regressions (e.g. accidentally restoring resolve_device in the optimize path).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants