[multimodal] add language_model_only flag for models like qwen3.5 #1487

Merged
erictang000 merged 9 commits into NovaSky-AI:main from erictang000:language_model_only on Apr 13, 2026
@erictang000 erictang000 commented Apr 9, 2026

Add language_model_only flag for multimodal models (Qwen3.5)

Summary

  • Add language_model_only config flag across policy, ref, and inference engine configs to skip vision encoder initialization for multimodal models like Qwen3.5, reducing GPU memory usage
  • Fix FSDP weight sync: remap CausalLM param names (model.layers.*) to vLLM's expected namespace (language_model.model.layers.*) via new weight_prefix in FSDPWeightExtractor
  • Make FSDP wrap policy resilient to missing vision-only layer classes (warn + skip instead of crash)
  • Add flash-linear-attention and causal-conv1d dependencies; unblock causal-conv1d install override -- required for performant GDN layer execution
  • Add run_qwen3.5_0.8b.sh example with use_sample_packing=false (GDN layers are incompatible with packing)
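
The weight-name remapping and the "warn + skip" wrap policy described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `add_weight_prefix` and `resolve_wrap_classes` are hypothetical helper names, and the real `FSDPWeightExtractor` signature is not shown here. Only the `weight_prefix` idea and the `model.layers.*` to `language_model.model.layers.*` mapping come from the summary.

```python
import warnings
from typing import Dict, Iterable, Iterator, Set, Tuple


def add_weight_prefix(
    named_params: Iterable[Tuple[str, object]], weight_prefix: str = ""
) -> Iterator[Tuple[str, object]]:
    """Yield (name, tensor) pairs with weight_prefix prepended to each name.

    Hypothetical helper: with weight_prefix="language_model.", a CausalLM name
    such as "model.layers.0..." becomes "language_model.model.layers.0...".
    """
    for name, tensor in named_params:
        yield f"{weight_prefix}{name}", tensor


def resolve_wrap_classes(
    layer_names: Iterable[str], available: Dict[str, type]
) -> Set[type]:
    """Return the layer classes that exist; warn about and skip the rest.

    Hypothetical helper: `available` maps class names to classes found in the
    loaded model. Vision-only classes are absent when the vision encoder is
    not initialized, so a missing name is a warning rather than an error.
    """
    resolved: Set[type] = set()
    for name in layer_names:
        cls = available.get(name)
        if cls is None:
            warnings.warn(f"FSDP wrap policy: layer class {name!r} not found, skipping")
            continue
        resolved.add(cls)
    return resolved
```

With an empty `weight_prefix` the names pass through unchanged, which would keep models that do not need remapping on their existing path.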

Runs

FSDP and Megatron reward matching (image)

Test plan

  • Run run_qwen3.5_0.8b.sh on 4 GPUs -- verify weight sync, no GDN fallback warnings, avg_final_rewards trends up
  • Run existing non-multimodal FSDP test to confirm no regression
  • Verify config validation rejects mismatched language_model_only across policy/ref/generator
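
The last check in the test plan could look something like the sketch below. It assumes each of the three configs exposes a boolean `language_model_only` attribute; `ComponentConfig` and `validate_language_model_only` are hypothetical names used only for illustration, not the validation code in this PR.

```python
from dataclasses import dataclass


@dataclass
class ComponentConfig:
    # Hypothetical stand-in for the policy, ref, and inference engine configs.
    language_model_only: bool = False


def validate_language_model_only(
    policy: ComponentConfig, ref: ComponentConfig, generator: ComponentConfig
) -> None:
    """Raise ValueError if the flag differs between policy, ref, and generator."""
    values = {
        "policy": policy.language_model_only,
        "ref": ref.language_model_only,
        "generator": generator.language_model_only,
    }
    if len(set(values.values())) > 1:
        raise ValueError(
            f"language_model_only must match across components, got {values}"
        )
```

Failing early at config time would be cheaper than finding the mismatch during weight sync, when the parameter names on the trainer and inference sides no longer line up.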

@erictang000
Collaborator Author

cc: @nithinvc This PR adds a language_model_only flag. It shouldn't affect any of your runs since it's false by default, but just a heads-up.

@erictang000 erictang000 merged commit 5cf22c5 into NovaSky-AI:main Apr 13, 2026
5 of 6 checks passed
@erictang000 erictang000 deleted the language_model_only branch April 13, 2026 21:54