feat(keypoint-detection): enable ViTPose config/build/perf #905
ViTPose keypoint-detection models could not pass the wmk pipeline:
1. Task resolution: Optimum registers the ViTPose ONNX export config but has
no task-to-class entry for keypoint-detection, and transformers'
AutoModelForKeypointDetection only recognizes SuperPoint. Add
MODEL_CLASS_MAPPING[(vitpose, keypoint-detection)] = VitPoseForPoseEstimation
(models/hf/vitpose.py) so the resolver loads the correct class.
2. MoE export: the vitpose-plus checkpoints use a Mixture-of-Experts backbone
whose patcher injects a constant dataset_index at export time. Optimum's
patch_model_for_export defaults model_kwargs to None, so the patcher crashed
on init. Pass an explicit model_kwargs={} in _get_optimum_patcher. Also wrap
the Step 3 hierarchy trace in the same patcher context (it previously ran
the model forward without the injected dataset_index, failing before export).
Verified config -> build -> perf on all 6 acceptance models in #284
(base-simple, plus-{small,base,large,huge}, synthpose-vitpose-huge-hf).
2007caf to 540ab05
# are traced with the same inputs they are exported with. The export
# in Step 4 re-enters the patcher; the contexts are sequential, not
# nested.
with self._get_optimum_patcher(model, task):
Change looks good and low-risk overall. The model_kwargs={} fix is effectively a no-op for non-MoE patchers (Optimum's ModelPatcher.__init__ already coerces None → {}), so that one's safe.
My one concern is wrapping Step 3 (_trace_model_hierarchy) in the patcher. Models that already resolve a real Optimum patcher today — e.g. CLIP (clip_text_model / clip_vision_model), SAM, T5, SigLIP, whisper, VED — will now have their hierarchy trace run through patched forward for the first time. The ONNX graph is unaffected (export already ran patched), but the traced module path can shift, which could change the hierarchy / tag coverage on those models.
You verified the 6 ViTPose models, but those weren't being traced-under-patch before. Could you also run a before/after on at least one already-patched non-ViTPose model (CLIP is a good pick) and confirm the tag coverage / hierarchy stats are unchanged?
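A minimal sketch of the before/after check requested above, assuming the hierarchy trace can be dumped as a mapping from ONNX node name to module-path tag. That dump format, the node names, and the helper names `tag_coverage` and `compare_hierarchy` are all hypothetical; nothing here comes from the wmk codebase.

```python
def tag_coverage(tags: dict[str, str]) -> float:
    """Fraction of nodes that carry a non-empty hierarchy tag."""
    if not tags:
        return 0.0
    return sum(1 for tag in tags.values() if tag) / len(tags)


def compare_hierarchy(before: dict[str, str], after: dict[str, str]) -> dict:
    """Report coverage for both runs plus every node whose tag differs."""
    nodes = sorted(set(before) | set(after))
    changed = {
        node: (before.get(node), after.get(node))
        for node in nodes
        if before.get(node) != after.get(node)
    }
    return {
        "coverage_before": tag_coverage(before),
        "coverage_after": tag_coverage(after),
        "changed": changed,
    }


# Hypothetical data: node name -> module-path tag from each trace run.
before = {
    "MatMul_0": "/text_model/encoder/layers.0",
    "Add_1": "/text_model/encoder/layers.0",
}
after = {
    "MatMul_0": "/text_model/encoder/layers.0",
    "Add_1": "/text_model/encoder",
}

report = compare_hierarchy(before, after)
```

In this made-up example the coverage ratio is identical in both runs while one node's tag still moves, which is why the check compares per-node tags and not only the aggregate stats.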
# (model_type, task) -> HuggingFace model class
MODEL_CLASS_MAPPING: dict[tuple[str, str], type] = {
    ("vitpose", "keypoint-detection"): VitPoseForPoseEstimation,
-    ("vitpose", "keypoint-detection"): VitPoseForPoseEstimation,
+    ("vitpose", "keypoint-detection"): VitPoseForPoseEstimation,
+    ("vitpose", None): VitPoseForPoseEstimation,
Could you test whether this line makes the command work when the task is omitted?
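One way to probe the suggestion in isolation, as a sketch only: it assumes the resolver does a plain dict lookup on `(model_type, task)` with `task=None` when `--task` is omitted, which may not match the real resolver. `resolve_model_class` is a hypothetical name and the class below is a placeholder for the transformers one.

```python
from typing import Optional


class VitPoseForPoseEstimation:
    """Placeholder standing in for the transformers class."""


# Note the key type: a None task needs Optional[str], not str.
MODEL_CLASS_MAPPING: dict[tuple[str, Optional[str]], type] = {
    ("vitpose", "keypoint-detection"): VitPoseForPoseEstimation,
    ("vitpose", None): VitPoseForPoseEstimation,
}


def resolve_model_class(model_type: str, task: Optional[str] = None) -> Optional[type]:
    """Hypothetical resolver: exact (model_type, task) lookup, None if absent."""
    return MODEL_CLASS_MAPPING.get((model_type, task))


explicit = resolve_model_class("vitpose", "keypoint-detection")
omitted = resolve_model_class("vitpose")
unknown = resolve_model_class("vitpose", "image-classification")
```

If the `None` entry is adopted, the existing annotation `dict[tuple[str, str], type]` would need widening as shown, otherwise a type checker will flag the new key.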
Enables the 6 ViTPose keypoint-detection models from #284 to pass wmk config -> build -> perf (CPU and OpenVINO). Eval isn't included here - I'll do the accuracy side in a follow-up since it needs a couple of design decisions first.

Two things were blocking all 6 models:
1. wmk config failed with "Task 'keypoint-detection' not supported by TasksManager". Optimum has the ViTPose ONNX export config but no task->class entry for keypoint-detection, and AutoModelForKeypointDetection only covers SuperPoint. Added the (vitpose, keypoint-detection) -> VitPoseForPoseEstimation mapping, same way we already do it for CLIP/SAM.
2. The plus checkpoints (MoE backbone) crashed during export with "dataset_index must be provided when using multiple experts". Optimum's VitPoseModelPatcher injects a constant dataset_index, but patch_model_for_export defaults model_kwargs to None so it crashed on init. Passing an explicit model_kwargs={} fixes that. The trace step (Step 3) was also running the model outside the patcher context, so I wrapped it the same way the export step already is.

The exporter change isn't ViTPose-specific - it helps any MoE model whose patcher injects forward args.
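A toy illustration of the second blocker, not Optimum's actual code: `ToyMoEModel` and `ToyPatcher` are hypothetical stand-ins, and the `TypeError` on a `None` default is only an assumed version of the failure mode described above. It shows the three behaviours at issue: the crash on init without `model_kwargs`, the forward failing outside the patcher, and the trace and export steps each entering the patcher in sequence.

```python
import functools


class ToyMoEModel:
    """Stand-in for a MoE backbone whose forward requires dataset_index."""

    def forward(self, pixel_values, dataset_index=None):
        if dataset_index is None:
            raise ValueError("dataset_index must be provided when using multiple experts")
        return [value + dataset_index for value in pixel_values]


class ToyPatcher:
    """Hypothetical patcher that injects a constant dataset_index while active."""

    def __init__(self, model, model_kwargs=None):
        # Assumed failure mode: item assignment on None raises TypeError.
        model_kwargs["dataset_index"] = 0
        self.model = model
        self.model_kwargs = model_kwargs
        self._orig_forward = model.forward

    def __enter__(self):
        # Bind the injected kwargs onto forward for the duration of the context.
        self.model.forward = functools.partial(self._orig_forward, **self.model_kwargs)
        return self

    def __exit__(self, exc_type, exc, tb):
        # Restore the unpatched forward on exit.
        self.model.forward = self._orig_forward
        return False


model = ToyMoEModel()

# Without explicit model_kwargs the toy patcher fails on init.
try:
    ToyPatcher(model)
    init_error = None
except TypeError as err:
    init_error = err

# Step 3 (trace) and Step 4 (export) enter the patcher one after the other.
with ToyPatcher(model, model_kwargs={}):
    trace_output = model.forward([1, 2])

with ToyPatcher(model, model_kwargs={}):
    export_output = model.forward([1, 2])
```

Because each context restores the original forward on exit, running the two steps in sequential contexts keeps them independent while still giving both the same injected argument.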
Verified config/build/perf on all 6: vitpose-base-simple, vitpose-plus-small/base/large/huge, and synthpose-vitpose-huge-hf. Added unit tests for the mapping and the patcher model_kwargs handling.
One note: you still need to pass --task keypoint-detection explicitly for now - the task isn't auto-detected from the config yet. I left auto-detection out of this PR to keep it small; can add it here or as a follow-up if you'd prefer.

Refs #284.