Investigate: Qwen3.5-9B GRPO — how does Unsloth actually support it? #15

@akashgit

Background

Unsloth advertises Qwen3.5 support, but when we tried `unsloth/Qwen3.5-9B`:

  • It loads as `Qwen3_5ForConditionalGeneration` (a vision-language model)
  • Crashes during GRPO generation with:
    ```
    RuntimeError: The size of tensor a (16) must match the size of tensor b (0)
    at non-singleton dimension 1
    ```
    in `compute_3d_position_ids` — a multimodal position encoding function
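
One quick way to tell which kind of checkpoint we are dealing with, before loading any weights, is to look at the checkpoint's `config.json`. The sketch below is a heuristic, not Unsloth's or Transformers' own detection logic: `describe_checkpoint` is a hypothetical helper, and the `vision_config` / `text_config` keys are an assumption based on how multimodal configs are commonly laid out (I have not confirmed them for `unsloth/Qwen3.5-9B`). Note that a `ForConditionalGeneration` suffix is also used by some text-only seq2seq models, so treat the result as a hint only.

```python
def describe_checkpoint(config: dict) -> dict:
    """Guess from a parsed config.json whether a checkpoint is text-only or multimodal.

    Heuristic only: relies on the `architectures` list and on the presence of
    `vision_config` / `text_config` keys, which are assumptions about the layout.
    """
    architectures = config.get("architectures") or []
    has_vision = "vision_config" in config
    # Wrapper classes such as Qwen3_5ForConditionalGeneration end with this suffix.
    wrapper = any(name.endswith("ForConditionalGeneration") for name in architectures)
    # Plain decoder-only language models usually end with ForCausalLM.
    causal = any(name.endswith("ForCausalLM") for name in architectures)
    return {
        "architectures": architectures,
        "looks_multimodal": wrapper or has_vision,
        "looks_text_only": causal and not has_vision,
        "has_nested_text_config": "text_config" in config,
    }


# Hypothetical config fragments for illustration; not copied from real checkpoints.
vlm_like = {
    "architectures": ["Qwen3_5ForConditionalGeneration"],
    "vision_config": {},
    "text_config": {},
}
text_like = {"architectures": ["Qwen3ForCausalLM"]}

for name, cfg in [("vlm_like", vlm_like), ("text_like", text_like)]:
    print(name, describe_checkpoint(cfg))
```

To run this against a real checkpoint, parse its downloaded `config.json` with `json.load` and pass the resulting dict in. If `looks_multimodal` comes back true, that would be consistent with the crash happening in a multimodal position-encoding path.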

We switched to Qwen3-8B (a text-only CausalLM), which works fine. Still, Qwen3.5's hybrid Gated DeltaNet architecture (Mamba-style linear layers combined with Transformer attention) is interesting and may be more efficient, so it is worth finding out how to make it work.

Questions

  1. Does Unsloth's Qwen3.5 support require specific model variants (e.g., text-only vs multimodal)?
  2. Is there a `Qwen3.5-9B-Base` or similar text-only variant?
  3. Does Qwen3.5 GRPO need special generation config (disable 3D position IDs for text-only)?
  4. Are there known workarounds in the Unsloth Discord or GitHub issues?

References
