Add Qwen3-VL training adaptation for DFlash by gq112 · Pull Request #475 · sgl-project/SpecForge

gq112 · 2026-02-28T03:59:25Z

Motivation

This PR adds Qwen3-VL training support to the DFlash framework (#461).

The goal is to enable end-to-end DFlash draft model training and integration with Qwen3-VL target models under the SpecForge training pipeline.

Modifications

Added Qwen3-VL online DFlash training support.
Added multimodal position id handling for DFlash, including Qwen3-VL mRoPE alignment.
Enabled HF target-side hidden state and position id extraction for VLM inputs.
Updated target embedding / lm head loading for Qwen3-VL weight layouts.
Added qwen3-vl chat template, draft config, and example training script.
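
To illustrate what "multimodal position id handling" involves, here is a minimal sketch of mRoPE-style position ids, assuming the scheme commonly described for Qwen2-VL-style models: each token carries three position components (temporal, height, width), text tokens use the same value in all three, and image patches share one temporal index while height and width follow the patch grid. The function name `mrope_position_ids` and the segment format are hypothetical and are not taken from this PR. Qwen3-VL's actual implementation may differ, for example in how it handles video.

```python
from typing import List, Tuple, Union

# Hypothetical segment format: ("text", n_tokens) or ("image", (grid_h, grid_w)).
Segment = Tuple[str, Union[int, Tuple[int, int]]]


def mrope_position_ids(segments: List[Segment]) -> Tuple[List[int], List[int], List[int]]:
    """Return (temporal, height, width) position ids for a mixed text/image sequence.

    Simplified illustration only; not the PR's implementation.
    """
    t_ids: List[int] = []
    h_ids: List[int] = []
    w_ids: List[int] = []
    next_pos = 0  # first position id available to the next segment

    for kind, spec in segments:
        if kind == "text":
            # Text tokens: all three components advance together.
            for i in range(spec):
                pos = next_pos + i
                t_ids.append(pos)
                h_ids.append(pos)
                w_ids.append(pos)
            next_pos += spec
        elif kind == "image":
            grid_h, grid_w = spec
            # Image patches: shared temporal index, row/column offsets for h/w.
            for row in range(grid_h):
                for col in range(grid_w):
                    t_ids.append(next_pos)
                    h_ids.append(next_pos + row)
                    w_ids.append(next_pos + col)
            # Following tokens resume after the largest id used by the image.
            next_pos += max(grid_h, grid_w)
        else:
            raise ValueError(f"unknown segment kind: {kind!r}")

    return t_ids, h_ids, w_ids


t, h, w = mrope_position_ids([("text", 2), ("image", (2, 2)), ("text", 1)])
print(t)  # [0, 1, 2, 2, 2, 2, 4]
print(h)  # [0, 1, 2, 2, 3, 3, 4]
print(w)  # [0, 1, 2, 3, 2, 3, 4]
```

Under this scheme the draft model and the target model must see the same three-component ids for every token, which is presumably why the PR extracts position ids on the target side rather than recomputing a plain 1-D range.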

Accuracy Test

Coming soon.

Benchmark & Profiling

Checklist

Format your code according to the Code Formatting with Pre-Commit.
Add unit tests as outlined in the Running Unit Tests.
Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling and Accuracy Results.
For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.
Please feel free to join our Slack channel at https://sgl-fru7574.slack.com/archives/C09784E3EN6 to discuss your PR.

gemini-code-assist · 2026-02-28T03:59:29Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

ggg-s added 3 commits February 27, 2026 14:34

Add Qwen3-VL-8B DFlash training support

f80be86

Update train_dflash.py

c19b963

fix run_qwen3_vl_8b_dflash_online.sh

aff0de1

gq112 requested review from FlamingoPg, FrankLeeeee, shuaills, sleepcoo and zyksir as code owners February 28, 2026 03:59

gq112 wants to merge 3 commits into sgl-project:main from gq112:dflash-qwen3-vl
