Support v1 raw multimodal image offload #2836

Draft
eligotts wants to merge 3 commits into feat/nano-as-v1 from feat/v1-raw-mm-offload

eligotts commented Jun 18, 2026

Contributor

Summary

Wires Prime-RL v1 training onto the generic raw multimodal image offload contract used by the companion Verifiers and Renderers PRs.

Companion PRs:

What changed:

  • Adds a prime_rl.multimodal adapter layer and strict raw multimodal descriptor schema for adapter-owned payloads.
  • Teaches /inference/v1/generate to consume renderer mmraw refs as generic RawMMRef objects, build RawMMItem from family/fingerprint/payload, and delegate materialization to the registered adapter.
  • Converts v1 traces into trainer-bound mm_refs carrying strict prime_raw_mm_item envelopes plus local file:// image URIs; legacy eager mm_kwargs/processed payloads are rejected.
  • Materializes raw image refs in the trainer with the trainer model's HF image processor and adapter-specific validation.
  • Keeps multimodal samples compatible with packing by carrying raw refs and image token type ids through TrainingSample and packed MicroBatch.
  • Keeps trainer.missing_mm_image_policy, defaulting to placeholder_zero_loss, with placeholder synthesis delegated to the owning multimodal adapter.
  • Adds adapter coverage for Qwen-VL and Kimi K2.5, including Qwen processor SizeDict.get(...) handling.
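The dispatch path described in the list above (a generic ref, an item built from family/fingerprint/payload, materialization delegated to a registered adapter, and a zero-loss placeholder when the image is missing) might look roughly like the sketch below. It is illustrative only and not the code in this PR: the field layout of `RawMMRef` and `RawMMItem`, the `uri` payload key, the `loss_weight` field, and the helpers `register_adapter`, `materialize_ref`, and `ToyAdapter` are all assumptions made for the example.

```python
import os
from dataclasses import dataclass
from typing import Any, Dict, Protocol


@dataclass(frozen=True)
class RawMMRef:
    # Assumed fields, mirroring "family/fingerprint/payload" from the summary.
    family: str
    fingerprint: str
    payload: Dict[str, Any]


@dataclass(frozen=True)
class RawMMItem:
    family: str
    fingerprint: str
    payload: Dict[str, Any]


class MultimodalAdapter(Protocol):
    """Hypothetical adapter interface; the real one may differ."""

    def materialize(self, item: RawMMItem) -> Dict[str, Any]: ...
    def placeholder(self, item: RawMMItem) -> Dict[str, Any]: ...


_ADAPTERS: Dict[str, MultimodalAdapter] = {}


def register_adapter(family: str, adapter: MultimodalAdapter) -> None:
    _ADAPTERS[family] = adapter


def materialize_ref(
    ref: RawMMRef, missing_policy: str = "placeholder_zero_loss"
) -> Dict[str, Any]:
    # Build the item from the generic ref, then hand it to the owning adapter.
    item = RawMMItem(ref.family, ref.fingerprint, ref.payload)
    if item.family not in _ADAPTERS:
        raise KeyError(f"no adapter registered for family {item.family!r}")
    adapter = _ADAPTERS[item.family]

    # Only local file:// URIs are accepted here; data URIs are rejected.
    uri = str(item.payload.get("uri", ""))
    if not uri.startswith("file://"):
        raise ValueError(f"expected a local file:// URI, got {uri!r}")
    path = uri[len("file://"):]

    if os.path.exists(path):
        return adapter.materialize(item)
    if missing_policy == "placeholder_zero_loss":
        # The adapter synthesizes the placeholder; the sample gets no loss.
        return {**adapter.placeholder(item), "loss_weight": 0.0}
    raise FileNotFoundError(path)


class ToyAdapter:
    """Stand-in adapter; a real one would run the model's HF image processor."""

    def materialize(self, item: RawMMItem) -> Dict[str, Any]:
        return {"source": item.payload["uri"], "loss_weight": 1.0}

    def placeholder(self, item: RawMMItem) -> Dict[str, Any]:
        return {"source": None}
```

Keeping placeholder synthesis on the adapter, as the summary describes, means the dispatch function never needs to know a model family's tensor shapes.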

Validation

  • Pre-commit during commit: ruff check and ruff format passed.
  • End-to-end hosted-style smoke using the intended local stack:
    OUT=/tmp/prime-rl-v1-raw-mm-offload-clean-api-smoke-20260620-072559 PYTHONPATH=/home/ubuntu/prime-rl-v1-raw-mm-offload/src:/home/ubuntu/renderers:/home/ubuntu/verifiers scripts/smoke/mini_browse_platform_v1_raw_offload.sh
  • Smoke result: inference started, Prime env sandboxes ran, orchestrator collected 4/4 trainable mini-browse rollouts, trainer completed step 0, and decoded trainer-bound artifacts had 4 TrainingBatch examples and 4 packed MicroBatches with mm_refs, 0 eager mm_kwargs, 12 strict prime_raw_mm_item descriptors, 12 local file:// image URIs, 0 data URIs, 0 missing files, and image token type ids present.
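The artifact counts reported in the smoke result could be reproduced with a small audit along the following lines. This is a sketch under assumptions, not the smoke script itself: the decoded batch layout (a dict with optional `mm_kwargs` and a list of `mm_refs`, each carrying a `uri`) and the function name `audit_batches` are hypothetical.

```python
import os
from collections import Counter
from typing import Any, Dict, Iterable


def audit_batches(batches: Iterable[Dict[str, Any]]) -> Counter:
    """Count raw-ref invariants over decoded trainer-bound batches."""
    counts: Counter = Counter()
    for batch in batches:
        counts["batches"] += 1
        # Legacy eager payloads should not appear at all.
        if batch.get("mm_kwargs"):
            counts["eager_mm_kwargs"] += 1
        for ref in batch.get("mm_refs", []):
            counts["descriptors"] += 1
            uri = str(ref.get("uri", ""))
            if uri.startswith("data:"):
                counts["data_uris"] += 1
            elif uri.startswith("file://"):
                counts["file_uris"] += 1
                # A file:// URI that does not resolve locally is a missing file.
                if not os.path.exists(uri[len("file://"):]):
                    counts["missing_files"] += 1
    return counts
```

A `Counter` returns 0 for keys that were never incremented, so a check such as `counts["data_uris"] == 0` holds without pre-seeding the keys.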
