[train][multimodal][1/3] Add vision support to generate() in new inference stack by nithinvc · Pull Request #1494 · NovaSky-AI/SkyRL

nithinvc · 2026-04-10T20:07:27Z

Summary

PR 1 of 3 for #1493 (multi-turn VLM generator).

Adds multimodal generation support to the inference client: RemoteInferenceClient.generate() now forwards multimodal features (image hashes, placeholder ranges, kwargs) from vLLM's render endpoint to the generation endpoint.

Thread mm_features through RemoteInferenceClient.generate() and _generate_single(), attaching them as "features" in the HTTP payload to the vLLM server only when they are provided.
Add mm_processor_cache_gb=0 to the vLLM CLI args to disable the multimodal processor cache. This is required because, with the cache enabled, vLLM does not return multimodal features when the same image is rendered again (/render is not idempotent).
Add a unit test verifying that mm_features are forwarded in the HTTP payload, using a mock server.
Add a GPU integration test (test_generate_with_multimodal_features_red_square) that exercises the full render -> generate round trip with a VLM.
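
For readers who have not opened the diff, the conditional attachment described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name build_generate_payload, the "token_ids" and "sampling_params" field names, and the shape of the example features dict are all assumptions. Only the "features" key comes from the description.

```python
from typing import Any, Optional


def build_generate_payload(
    prompt_token_ids: list[int],
    sampling_params: dict[str, Any],
    mm_features: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Hypothetical helper: assemble the JSON body for one generation request."""
    payload: dict[str, Any] = {
        "token_ids": list(prompt_token_ids),       # assumed field name
        "sampling_params": dict(sampling_params),  # assumed field name
    }
    # Attach the key only when features were supplied, so text-only
    # requests produce exactly the same payload as before.
    if mm_features is not None:
        payload["features"] = mm_features
    return payload


# Text-only request: no "features" key is sent.
text_only = build_generate_payload([1, 2, 3], {"max_tokens": 8})

# Multimodal request: the features dict (contents here are illustrative
# placeholders, not vLLM's actual schema) is forwarded untouched.
example_features = {"example_hash": "abc123", "example_range": [0, 4]}
with_image = build_generate_payload(
    [1, 2, 3], {"max_tokens": 8}, mm_features=example_features
)
```

Keeping the key absent rather than sending a null value means servers that do not know about "features" see no change for text-only traffic.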

Test plan

Existing test_remote_inference_client.py tests pass: uv run pytest tests/backends/skyrl_train/inference_servers/test_remote_inference_client.py -v
New TestMultiModalGeneration test passes: verifies mm_features reach the server payload
GPU integration test passes (requires local vLLM): SKYRL_LOCAL_VLLM=1 uv run --isolated --extra dev --extra fsdp pytest tests/backends/skyrl_train/gpu/gpu_ci/inference_servers/test_vlm_inference_generation.py -m vllm -v
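
The idea behind the mock-server unit test can be illustrated with a standard-library sketch like the one below. It is a stand-in, not the TestMultiModalGeneration test itself: the RecordingHandler class, the post_json helper, the /generate path, and the payload contents are hypothetical, and the real test exercises RemoteInferenceClient rather than a raw HTTP call.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

received_payloads = []  # JSON bodies captured by the mock server


class RecordingHandler(BaseHTTPRequestHandler):
    """Hypothetical mock endpoint that records every POSTed JSON body."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        received_payloads.append(json.loads(self.rfile.read(length)))
        body = json.dumps({"ok": True}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet


def post_json(url, payload):
    """Send a JSON POST, bypassing any proxy configured in the environment."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read())


def run_check():
    """Start the mock server, send one request, return what the server saw."""
    server = HTTPServer(("127.0.0.1", 0), RecordingHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/generate"  # hypothetical path
        features = {"example_key": "example_value"}  # placeholder contents
        post_json(url, {"token_ids": [1, 2, 3], "features": features})
        return received_payloads[-1]
    finally:
        server.shutdown()
        server.server_close()
```

Calling run_check() returns the body the server received, so a test can assert that its "features" entry matches what the client was given.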

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

SumanthRH · 2026-04-13T21:16:52Z

SKYRL_LOCAL_VLLM=1 uv run --isolated --extra dev --extra fsdp pytest tests/backends/skyrl_train/gpu/gpu_ci/inference_servers/test_vlm_inference_generation.py -m vllm -v

What is SKYRL_LOCAL_VLLM @nithinvc? Tests should pass with vllm 0.19.0

nithinvc · 2026-04-13T21:19:52Z

It's a temporary flag until vllm-project/vllm#38405 gets merged. The PR has been approved, but the auto-merger hasn't merged it yet (there's a test timeout unrelated to my changes). The tests will likely only pass with vllm 0.20.0, once this change is in.

SumanthRH · 2026-04-13T21:22:53Z

OK, sounds good. Let's get this in anyway.

inference client changes

bc3d433

devin-ai-integration bot reviewed Apr 10, 2026

nithinvc mentioned this pull request Apr 10, 2026

[train][multimodal][3/3] Add multi-turn VLM generator #1486

Merged

2 tasks

gemini suggestions + update test

6cee94d

nithinvc changed the title ~~[train][multimodal][1/2] Add vision support to generate() in new inference stack~~ [train][multimodal][1/3] Add vision support to generate() in new inference stack Apr 11, 2026

SumanthRH approved these changes Apr 13, 2026

SumanthRH merged commit 66d401a into NovaSky-AI:main Apr 13, 2026
5 of 7 checks passed

SumanthRH merged 2 commits into NovaSky-AI:main from nithinvc:nithinc/train-vlm-generate
