[WIP][DO NOT MERGE] XNNPACK BYOC backend for Relax CPU inference by mshr-h · Pull Request #19580 · apache/tvm

mshr-h · 2026-05-17T07:52:01Z

Just experimenting.

Benchmark on NVIDIA DGX Spark:

| Model | Partitions | Baseline mean (ms) | XNNPACK mean (ms) | Speedup |
| --- | --- | --- | --- | --- |
| `xnnpack_tiny_cnn` | 4 | 0.002500 | 0.003049 | 0.820x |
| `xnnpack_static_qs8_tiny_cnn` | 2 | 0.002587 | 0.001711 | 1.512x |
| `xnnpack_large_cnn_fp32` | 5 | 0.412256 | 0.044388 | 9.288x |
| `xnnpack_large_mlp_fp32` | 4 | 0.148401 | 0.018079 | 8.209x |
| `xnnpack_large_qs8_cnn` | 2 | 0.003716 | 0.002309 | 1.609x |
| `torchvision:mobilenet_v2` | 17 | 90.415 | 91.823 | 0.985x |
| `torchvision:mobilenet_v3_small` | 62 | 16.420 | 17.165 | 0.957x |
| `torchvision:resnet18` | 9 | 845.077 | 845.838 | 0.999x |
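
The Speedup column looks consistent with the baseline mean divided by the XNNPACK mean, so values below 1.0x would mean the XNNPACK path was slower. A minimal sketch of that arithmetic, assuming this definition (the PR does not state the formula, and the `speedup` helper is hypothetical). Because the displayed means are rounded, a recomputed ratio may differ from the table in the last digit.

```python
def speedup(baseline_ms: float, xnnpack_ms: float) -> float:
    """Ratio of baseline latency to XNNPACK latency (assumed definition)."""
    if xnnpack_ms <= 0:
        raise ValueError("latency must be positive")
    return baseline_ms / xnnpack_ms


# (model, baseline mean ms, XNNPACK mean ms) copied from the table above
rows = [
    ("xnnpack_tiny_cnn", 0.002500, 0.003049),
    ("xnnpack_large_cnn_fp32", 0.412256, 0.044388),
    ("torchvision:mobilenet_v2", 90.415, 91.823),
]

for name, base, xnn in rows:
    print(f"{name}: {speedup(base, xnn):.3f}x")
```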

gemini-code-assist

Code Review

This pull request integrates XNNPACK as a Relax BYOC backend, enabling support for static-shape float32 and quantized CNN subgraphs. The changes encompass CMake build support, Relax pattern registration with a cost-based partitioning policy, TFLite frontend updates for QDQ models, and a JSON-based external codegen. Reviewers identified multiple typos in the Python implementation where tvm.tirx was incorrectly used instead of tvm.tir, and suggested replacing a hardcoded float literal in the C++ codegen with std::numeric_limits<float>::max() for improved robustness.

gemini-code-assist · 2026-05-17T07:56:49Z

+
+    shape = []
+    for dim in sinfo.shape.values:
+        if not isinstance(dim, (tvm.tirx.expr.IntImm, int)):


The submodule tvm.tirx does not exist in the standard TVM library. This appears to be a typo for tvm.tir.

Suggested change

if not isinstance(dim, (tvm.tirx.expr.IntImm, int)):

if not isinstance(dim, (tvm.tir.IntImm, int)):
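
Whether `tvm.tirx` is importable depends on the TVM build in use, so the claim can be checked directly rather than assumed. A minimal sketch using only the standard library; `module_available` is a hypothetical helper, not part of TVM.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located by the import system.

    For a dotted name, find_spec imports the parent package first.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. `tvm`) is missing or malformed.
        return False


for candidate in ("tvm.tir", "tvm.tirx"):
    print(candidate, module_available(candidate))
```

Running this against the branch's own TVM build would show which of the two spellings that build actually provides.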

gemini-code-assist · 2026-05-17T07:56:50Z

+
+
+def _symbol_name(dim) -> str | None:
+    if isinstance(dim, (tvm.tirx.expr.IntImm, int)):


The submodule tvm.tirx does not exist in the standard TVM library. This appears to be a typo for tvm.tir.

Suggested change

if isinstance(dim, (tvm.tirx.expr.IntImm, int)):

if isinstance(dim, (tvm.tir.IntImm, int)):

gemini-code-assist · 2026-05-17T07:56:50Z

+    result: list[int | None] = []
+    symbol: str | None = None
+    for index, dim in enumerate(dims):
+        if isinstance(dim, (tvm.tirx.expr.IntImm, int)):


The submodule tvm.tirx does not exist in the standard TVM library. This appears to be a typo for tvm.tir.

Suggested change

if isinstance(dim, (tvm.tirx.expr.IntImm, int)):

if isinstance(dim, (tvm.tir.IntImm, int)):

gemini-code-assist · 2026-05-17T07:56:50Z

+    if not isinstance(expr, relax.PrimValue):
+        return None
+    value = expr.value
+    if isinstance(value, tvm.tirx.expr.FloatImm):


The submodule tvm.tirx does not exist in the standard TVM library. This appears to be a typo for tvm.tir.

Suggested change

if isinstance(value, tvm.tirx.expr.FloatImm):

if isinstance(value, tvm.tir.FloatImm):

gemini-code-assist · 2026-05-17T07:56:50Z

+    value = expr.value
+    if isinstance(value, tvm.tirx.expr.FloatImm):
+        return float(value.value)
+    if isinstance(value, tvm.tirx.expr.IntImm):


The submodule tvm.tirx does not exist in the standard TVM library. This appears to be a typo for tvm.tir.

Suggested change

if isinstance(value, tvm.tirx.expr.IntImm):

if isinstance(value, tvm.tir.IntImm):

gemini-code-assist · 2026-05-17T07:56:50Z

+  }
+
+ private:
+  static constexpr double kXNNPACKInfinity = 3.4028234663852886e38;


Using a hardcoded literal for the maximum float value is less robust than using standard library constants. Consider using std::numeric_limits<float>::max().

Suggested change

static constexpr double kXNNPACKInfinity = 3.4028234663852886e38;

static constexpr double kXNNPACKInfinity = std::numeric_limits<float>::max();
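
For reference, the literal in the diff appears to be the largest finite IEEE 754 binary32 value widened to double, which is what `std::numeric_limits<float>::max()` (declared in `<limits>`) returns on platforms where `float` is binary32. On that assumption, the suggestion would change readability rather than the stored value. A small Python check of the equality, decoding the bit pattern `0x7F7FFFFF` with the standard library (`float32_from_bits` is a hypothetical helper):

```python
import struct

# Largest finite binary32 value: sign 0, exponent 0xFE, mantissa all ones.
FLT_MAX_BITS = 0x7F7FFFFF


def float32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 single, widened to a Python float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


flt_max = float32_from_bits(FLT_MAX_BITS)
# Compare against the literal used for kXNNPACKInfinity in the diff.
print(flt_max == 3.4028234663852886e38)
```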

mshr-h added 17 commits May 17, 2026 16:28

Add XNNPACK BYOC skeleton

f1c166e

Add minimal XNNPACK ReLU BYOC pipeline

c3ad896

Add minimal CNN BYOC operator support

e0cd433

Add end-to-end validation and benchmark script

fa646af

Harden runtime options and capability handling

917bda9

Add runtime FP16 precision policy options

3b819ef

Add partition cost policy and decision report

bab5c59

Add XNNPACK quantization metadata plumbing

30436ac

Add QS8 weighted op plumbing

866b1ed

Add signed-int8 TFLite QDQ import plumbing

cb45e36

Add QS8 QDQ island ops for pool/reshape/add

812286c

Add deployment hardening and benchmark matrix support

36214a4

Add opt-in dynamic-range fully-connected path

ea8bac9

Add opt-in dynamic batch support for Relax BYOC

1e25a89

Harden JSON runtime validation and size checks

ca88094

Add fp32 MLP GELU and softmax BYOC coverage

d0a18df

Add larger benchmark fixtures and reporting

a4ce7c1

gemini-code-assist Bot reviewed May 17, 2026


Prune unstable BYOC paths and add typed configs

e6446d4

mshr-h wants to merge 18 commits into apache:main from mshr-h:xnnpack-byoc


