[perf] Default Wan VAE decode to bf16 (lossless, faster) #1472
Mister-Raggs wants to merge 1 commit into
Wan's pipeline configs left vae_precision at the inherited PipelineConfig
default of fp32, while the DiT already runs bf16 and the *same* AutoencoderKLWan
runs bf16 in Cosmos-Predict2.5 and fp16 in Cosmos. Measured on a real Wan latent
(decoded fp32 vs bf16 in one process):
- MS-SSIM(bf16, fp32) = 0.9999 on the identical latent (no quality cost)
- ~1.2-1.3x faster VAE decode (~5-10% end-to-end on decode-bound few-step
models; free quality-wise on full-step)
fp32 here was the inherited default, not a Wan-specific requirement; this aligns
Wan with its sibling configs. Also updates the two legacy wan_*.json configs.
Code Review
This pull request updates the VAE precision configuration for Wan models from 'fp32' to 'bf16' in 'fastvideo/configs/pipelines/wan.py', 'fastvideo/configs/wan_1.3B_t2v_pipeline.json', and 'fastvideo/configs/wan_14B_i2v_480p_pipeline.json' to improve performance while remaining effectively lossless. There are no review comments, so I have no feedback to provide.
Pull request overview
This PR updates Wan pipeline configuration defaults so the Wan VAE runs in bf16 instead of inheriting the base PipelineConfig default of fp32, reducing VAE decode cost while maintaining effectively identical output quality.
Changes:
- Set `WanT2V480PConfig.vae_precision` default to `"bf16"` (covers all Wan variants inheriting from it).
- Update legacy Wan JSON pipeline configs to use `"vae_precision": "bf16"` for consistency.
- Add an explanatory comment in `wan.py` documenting the rationale for bf16.
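Why one override covers every variant can be sketched with plain dataclass inheritance. This is a simplified illustration, not the actual FastVideo classes: only `PipelineConfig`, `WanT2V480PConfig`, and `vae_precision` come from the PR; `ExampleWanVariantConfig` and its `extra_option` field are hypothetical stand-ins for a variant that does not override the precision.

```python
from dataclasses import dataclass


@dataclass
class PipelineConfig:
    # Base default named in the PR description.
    vae_precision: str = "fp32"


@dataclass
class WanT2V480PConfig(PipelineConfig):
    # The change in this PR: override the inherited fp32 default.
    vae_precision: str = "bf16"


@dataclass
class ExampleWanVariantConfig(WanT2V480PConfig):
    # Hypothetical variant: adds its own field, leaves vae_precision alone.
    extra_option: bool = True


if __name__ == "__main__":
    for cls in (PipelineConfig, WanT2V480PConfig, ExampleWanVariantConfig):
        print(cls.__name__, cls().vae_precision)
```

A variant that sets `vae_precision` itself would keep its own value, which is why the "no override" condition in the PR description matters.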
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| fastvideo/configs/pipelines/wan.py | Changes Wan base pipeline config default vae_precision to bf16 and documents the rationale. |
| fastvideo/configs/wan_14B_i2v_480p_pipeline.json | Updates legacy I2V 480p JSON config to set vae_precision to bf16. |
| fastvideo/configs/wan_1.3B_t2v_pipeline.json | Updates legacy T2V JSON config to set vae_precision to bf16. |
# bf16 VAE decode is effectively lossless (MS-SSIM 0.9999 vs fp32 on an
# identical latent) and faster; the same AutoencoderKLWan already runs bf16
# in Cosmos-Predict2.5 and fp16 in Cosmos. fp32 here was just the inherited
# PipelineConfig default, not a Wan-specific requirement.
Summary
Wan's pipeline configs left `vae_precision="fp32"` (the inherited `PipelineConfig` default) even though the DiT already runs `bf16` and the same `AutoencoderKLWan` runs `bf16` in Cosmos-Predict2.5 and `fp16` in Cosmos. fp32 VAE decode is the majority cost on bandwidth-limited devices and gives no quality benefit here; it was the inherited default, not a Wan-specific requirement.
This sets `WanT2V480PConfig.vae_precision` to `bf16`. Every Wan variant derives from it (T2V/I2V, 720P, Wan2.2, LucyEdit, SelfForcing) with no override, so the single change covers them all. The two legacy `wan_*.json` configs are updated for consistency.
Evidence
Measured by decoding one real Wan latent fp32 vs bf16 in the same process (no denoise non-determinism, no codec noise):
The random-latent control gave the same 0.9999, and the same VAE already ships in bf16 under Cosmos-Predict2.5, so the change is well-precedented.
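The shape of that same-input comparison can be sketched in pure Python. This is a toy illustration under loud assumptions, not the PR's measurement: `toy_decode` is a hypothetical stand-in for a VAE decode, bf16 is emulated by rounding float32 bit patterns, and PSNR is swapped in for MS-SSIM because it needs no extra libraries. The numbers it produces say nothing about the real Wan VAE.

```python
import math
import struct


def to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16 value (ties to even).

    Emulation only: finite inputs are assumed, NaN/inf are not handled.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add the rounding bias, then drop the low 16 bits of the float32 pattern.
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def toy_decode(latent, cast=lambda v: v):
    """Hypothetical stand-in for a decoder: a scaled tanh per element.

    `cast` is applied after every operation to mimic reduced precision.
    """
    return [cast(math.tanh(cast(0.5 * cast(v)))) for v in latent]


def psnr(ref, test, peak=2.0):
    """PSNR in dB for outputs in [-1, 1] (value range 2.0)."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)


# One fixed input decoded twice in the same process, so the only
# difference between the two outputs is the arithmetic precision.
latent = [2.0 * math.sin(0.37 * i) for i in range(1024)]
reference = toy_decode(latent)            # full-precision path
reduced = toy_decode(latent, to_bf16)     # emulated bf16 path

if __name__ == "__main__":
    print(f"PSNR(reduced, reference) = {psnr(reference, reduced):.1f} dB")
```

Decoding a fixed input twice, rather than running the full pipeline twice, is what removes the denoise and codec noise mentioned above from the comparison.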
Test plan