[docs] Add GenRL asset preparation recipe by Abecid · Pull Request #1456 · hao-ai-lab/FastVideo

Abecid · 2026-06-12T07:46:28Z

Purpose

Adds a reproducible GenRL HPSv3 + VideoAlign training recipe for Wan 2.1 T2V 1.3B on the modular fastvideo/train stack.

This PR makes the GenRL run reproducible without relying on a local Modal launcher or git submodules. Runtime assets are prepared through a public helper script, reward dependencies are pinned, and the vendored HPSv3/VideoAlign runtime code is aligned with the reward implementations used by the previous successful GenRL run.

Fixes: N/A

Changes

Added examples/train/prepare_genrl_assets.py to prepare:
- GenRL filtered prompt JSONL files
- KwaiVGI/VideoReward checkpoint under .cache/VideoReward
- optional reward-model preflight via --check-rewards
Added examples/train/requirements-genrl.txt with the GenRL reward-stack dependency pins.
Moved the GenRL Wan config to examples/train/configs/rl/wan/genrl_hpsv3_videoalign.yaml.
Updated examples/train/README.md with a non-Modal reproduction path.
Vendored only the HPSv3 and VideoAlign runtime files needed by the GenRL reward wrappers.
Aligned vendored reward runtimes with the exact successful-run upstream revisions:
- HPSv3: a2eb2ef2c7b5d91a566347a5825cf6d872122149
- VideoAlign: aba26b658fec7d9fd30c295187b548ea673c8769
Added reward-head load checks so HPSv3/VideoAlign fail fast instead of scoring with randomly initialized reward heads.
Added prompt dataset validation for missing files, Git LFS pointer files, malformed JSONL, and non-object JSON entries.
Kept modal_train_genrl.py local-only and ignored; it is not part of this PR.
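
As an illustration of the fail-fast reward-head check described above, here is a minimal sketch. It assumes the reward model is a PyTorch module loaded with `load_state_dict(strict=False)`, whose return value exposes `missing_keys`. The function name and the `rm_head.` prefix are hypothetical, and the checks in this PR may be implemented differently.

```python
from typing import Iterable, Sequence


def assert_reward_head_loaded(
    missing_keys: Iterable[str],
    head_prefixes: Sequence[str] = ("rm_head.",),  # hypothetical parameter prefix
) -> None:
    """Raise if any reward-head parameter was absent from the checkpoint.

    `missing_keys` is expected to come from the result of
    `module.load_state_dict(state_dict, strict=False)`.
    """
    missing_head = sorted(
        key
        for key in missing_keys
        if any(key.startswith(prefix) for prefix in head_prefixes)
    )
    if missing_head:
        raise RuntimeError(
            "Reward head weights were not loaded from the checkpoint "
            f"(missing: {missing_head}); refusing to score with randomly "
            "initialized parameters."
        )
```

Missing keys outside the head prefixes are ignored here on purpose, so the check only blocks the case where scores would come from an untrained head.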

Reproduction / Training Command

Install the GenRL reward-stack pins after the editable FastVideo install:

pip install -r examples/train/requirements-genrl.txt

Prepare prompts and reward checkpoints:

python examples/train/prepare_genrl_assets.py \
  --prompt-dir .cache/genrl_filtered_prompts \
  --genrl-cache-dir .cache/GenRL \
  --videoalign-dir .cache/VideoReward \
  --check-rewards

Launch the 4xGPU GenRL HPSv3 + VideoAlign training run:

WANDB_MODE=online \
WANDB_ENTITY=<your-wandb-entity> \
NUM_GPUS=4 \
VIDEOALIGN_CHECKPOINT_PATH=.cache/VideoReward \
FORCE_QWENVL_VIDEO_READER=opencv \
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True \
bash examples/train/run.sh \
  examples/train/configs/rl/wan/genrl_hpsv3_videoalign.yaml \
  --training.checkpoint.output_dir outputs/genrl_longcat

For the 41-step reproduction probe:

WANDB_MODE=online \
WANDB_ENTITY=<your-wandb-entity> \
NUM_GPUS=4 \
VIDEOALIGN_CHECKPOINT_PATH=.cache/VideoReward \
FORCE_QWENVL_VIDEO_READER=opencv \
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True \
bash examples/train/run.sh \
  examples/train/configs/rl/wan/genrl_hpsv3_videoalign.yaml \
  --training.loop.max_train_steps 41 \
  --training.checkpoint.training_state_checkpointing_steps 20 \
  --training.checkpoint.output_dir outputs/genrl_longcat_repro_rewardfix_41

Verification

Reward parity was checked against the previous successful GenRL run source state:

FastVideo commit: 17aecbe2dd07245333a1c0ea85f89b2b7a4a1f88
HPSv3 submodule: a2eb2ef2c7b5d91a566347a5825cf6d872122149
VideoAlign submodule: aba26b658fec7d9fd30c295187b548ea673c8769
VideoReward checkpoint: checkpoint-11352

Fixed-video reward parity after syncing the vendored runtime:

hpsv3_general delta:    ~1e-6
hpsv3_percentile delta: ~1e-6
videoalign_mq delta:    small residual runtime drift, about -0.03 to +0.04
videoalign_ta delta:    small residual runtime drift, about -0.04 to +0.02
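
A delta range like the ones above could be computed along these lines. This is only a sketch under stated assumptions: it supposes both runs scored the same fixed videos in the same order and that scores are available as plain lists keyed by reward name. `reward_deltas` is a hypothetical helper, not part of this PR.

```python
from typing import Dict, List, Tuple


def reward_deltas(
    reference: Dict[str, List[float]],
    candidate: Dict[str, List[float]],
) -> Dict[str, Tuple[float, float]]:
    """Return (min, max) of candidate - reference for each reward name."""
    deltas: Dict[str, Tuple[float, float]] = {}
    for name, ref_scores in reference.items():
        cand_scores = candidate[name]
        if not ref_scores or len(cand_scores) != len(ref_scores):
            raise ValueError(f"{name}: empty or mismatched score lists")
        # Per-video difference, paired by position.
        diffs = [c - r for c, r in zip(cand_scores, ref_scores)]
        deltas[name] = (min(diffs), max(diffs))
    return deltas
```

Reporting the signed minimum and maximum, rather than a single mean, keeps a bias in one direction visible.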

The training reproduction run launched after the reward-runtime fix showed reward curves recovering relative to the earlier bad run.

Test Plan

python -m py_compile \
  examples/train/prepare_genrl_assets.py \
  fastvideo/train/methods/rl/utils/data.py \
  fastvideo/train/methods/rl/reward/hpsv3.py \
  fastvideo/train/methods/rl/reward/videoalign.py

python examples/train/prepare_genrl_assets.py --help

python -c "import yaml; from pathlib import Path; p=Path('examples/train/configs/rl/wan/genrl_hpsv3_videoalign.yaml'); yaml.safe_load(p.read_text()); print(f'parsed {p}')"

Test Results

python -m py_compile ...  # passed
python examples/train/prepare_genrl_assets.py --help  # passed
YAML parse check  # passed

pre-commit was not available in the local shell or the fastvideo conda env, so I could not run the full hook set locally.

Checklist

I ran pre-commit run --all-files and fixed all issues
I added or updated tests / validation for my changes
I updated documentation if needed
I considered GPU memory impact of my changes

gemini-code-assist

Code Review

This pull request introduces a helper script prepare_genrl_assets.py to download and validate GenRL prompts and VideoReward checkpoints, updates the corresponding training configuration with setup instructions, and adds robust file existence and Git LFS pointer checks to the dataset loader. The review feedback suggests enhancing the asset preparation script by handling Git LFS command failures, improving prompt validation to prevent crashes on malformed JSON, allowing checkpoint detection in the root directory, and resolving a repository ID inconsistency for the VideoReward model.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-12T07:48:07Z

+    prompt_count = 0
+    with path.open(encoding="utf-8") as f:
+        for line_no, raw_line in enumerate(f, start=1):
+            line = raw_line.strip()
+            if not line:
+                continue
+            if (
+                prompt_count == 0
+                and line_no == 1
+                and line.startswith("version https://git-lfs.github.com")
+            ):
+                raise RuntimeError(
+                    f"{path} is a Git LFS pointer, not real prompt JSON. "
+                    "Install git-lfs and rerun this script."
+                )
+            item = json.loads(line)
+            if item.get("prompt"):
+                prompt_count += 1


The prompt validation logic is prone to crashing or bypassing the Git LFS check if there are leading empty lines in the JSONL file. Additionally, if a line contains malformed JSON or a non-dictionary JSON value, json.loads or item.get will raise unhandled exceptions (JSONDecodeError or AttributeError) and crash the script with a cryptic traceback.

Using a saw_content flag (similar to the dataset loader) and wrapping the JSON parsing in a try-except block with explicit type checks makes the validation much more robust and user-friendly.

    prompt_count = 0
    saw_content = False
    with path.open(encoding="utf-8") as f:
        for line_no, raw_line in enumerate(f, start=1):
            line = raw_line.strip()
            if not line:
                continue
            if not saw_content and line.startswith("version https://git-lfs.github.com"):
                raise RuntimeError(
                    f"{path} is a Git LFS pointer, not real prompt JSON. "
                    "Install git-lfs and rerun this script."
                )
            saw_content = True
            try:
                item = json.loads(line)
            except json.JSONDecodeError as e:
                raise RuntimeError(
                    f"Malformed JSON on line {line_no} in {path}: {e}"
                ) from e
            if not isinstance(item, dict):
                raise RuntimeError(
                    f"Expected a JSON object (dict) on line {line_no} in {path}, got {type(item).__name__}."
                )
            if item.get("prompt"):
                prompt_count += 1

gemini-code-assist · 2026-06-12T07:48:08Z

+    if (genrl_dir / ".git").exists():
+        _run(["git", "lfs", "install"], cwd=genrl_dir)
+        _run(
+            [
+                "git",
+                "lfs",
+                "pull",
+                "-I",
+                "datasets/filtered_prompts/*",
+            ],
+            cwd=genrl_dir,
+        )


If git-lfs is not installed or configured on the system, running git lfs install or git lfs pull will fail with a generic subprocess.CalledProcessError and a cryptic traceback.

Wrapping these calls in a try-except block allows us to catch the error and provide a clear, actionable message to the user.

Suggested change

-    if (genrl_dir / ".git").exists():
-        _run(["git", "lfs", "install"], cwd=genrl_dir)
-        _run(
-            [
-                "git",
-                "lfs",
-                "pull",
-                "-I",
-                "datasets/filtered_prompts/*",
-            ],
-            cwd=genrl_dir,
-        )
+    if (genrl_dir / ".git").exists():
+        try:
+            _run(["git", "lfs", "install"], cwd=genrl_dir)
+            _run(
+                [
+                    "git",
+                    "lfs",
+                    "pull",
+                    "-I",
+                    "datasets/filtered_prompts/*",
+                ],
+                cwd=genrl_dir,
+            )
+        except subprocess.CalledProcessError as exc:
+            raise RuntimeError(
+                "Failed to pull Git LFS assets. Ensure that `git-lfs` is installed "
+                "on your system and available in your PATH."
+            ) from exc
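
A complementary option is to check for the `git-lfs` executable before invoking git at all. The sketch below assumes a standard Git LFS install that places a `git-lfs` binary on PATH. `require_git_lfs` is a hypothetical helper, and the injectable `which` parameter exists only to make the check easy to test.

```python
import shutil
from typing import Callable, Optional


def require_git_lfs(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Return the path to the git-lfs executable or raise a clear error."""
    path = which("git-lfs")
    if path is None:
        raise RuntimeError(
            "git-lfs was not found on PATH. Install git-lfs and rerun this script."
        )
    return path
```

A preflight like this does not replace the try/except above, since `git lfs pull` can still fail for other reasons such as network errors.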

gemini-code-assist · 2026-06-12T07:48:08Z

+def has_video_reward_checkpoint(root: Path) -> bool:
+    model_config = root / "model_config.json"
+    if not model_config.exists():
+        return False
+    for checkpoint in root.glob("checkpoint-*"):
+        if (checkpoint / "model.pth").exists():
+            return True
+        if (
+            (checkpoint / "adapter_model.safetensors").exists()
+            and (checkpoint / "non_lora_state_dict.pth").exists()
+        ):
+            return True
+    return False


The has_video_reward_checkpoint function only checks for checkpoints nested inside a checkpoint-* subdirectory. If a user manually downloads or extracts the checkpoint directly into the root directory (so that model.pth or adapter_model.safetensors is in the root), it will not be detected.

Adding a fallback check for the root directory directly makes the checkpoint detection much more robust.

def has_video_reward_checkpoint(root: Path) -> bool:
    model_config = root / "model_config.json"
    if not model_config.exists():
        return False
    if (root / "model.pth").exists():
        return True
    if (
        (root / "adapter_model.safetensors").exists()
        and (root / "non_lora_state_dict.pth").exists()
    ):
        return True
    for checkpoint in root.glob("checkpoint-*"):
        if (checkpoint / "model.pth").exists():
            return True
        if (
            (checkpoint / "adapter_model.safetensors").exists()
            and (checkpoint / "non_lora_state_dict.pth").exists()
        ):
            return True
    return False
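
Since the root-level fallback repeats the same file checks as the loop, one possible variant factors them into a helper. This is a sketch of that idea, not the code merged in this PR; `_has_reward_weights` is a hypothetical name, while the file names are the ones used in the snippets above.

```python
from pathlib import Path


def _has_reward_weights(directory: Path) -> bool:
    """True if `directory` holds full weights or a LoRA adapter pair."""
    if (directory / "model.pth").exists():
        return True
    return (
        (directory / "adapter_model.safetensors").exists()
        and (directory / "non_lora_state_dict.pth").exists()
    )


def has_video_reward_checkpoint(root: Path) -> bool:
    """Accept weights either directly in `root` or in a checkpoint-* subdirectory."""
    if not (root / "model_config.json").exists():
        return False
    candidates = [root, *sorted(root.glob("checkpoint-*"))]
    return any(_has_reward_weights(candidate) for candidate in candidates)
```

Both layouts are then handled by one code path, so a future change to the expected file names only needs to be made once.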

[docs] Add GenRL asset preparation recipe

34c973a

mergify Bot added the "type: docs" (Documentation only) and "scope: training" (Training pipeline, methods, configs) labels on Jun 12, 2026

gemini-code-assist Bot reviewed Jun 12, 2026


Abecid added 7 commits June 15, 2026 22:31

[docs] make GenRL training reproducible

2174353

Fix train runtime

f79c1bf

Fix videoalign to use KawiVGI, add preflights, fix library versions

52849ad

Fix structure

b50b465

Add HPSv3 meta device fallback, chekcpoint placement check

9e4d39d

Use same reward implementation as previous scuccessful run

d4cbc0f

[docs]: keep GenRL Modal runner local

03f3e89

Abecid wants to merge 8 commits into hao-ai-lab:py/add_rl from Abecid:abecid/genrl-repro-assets
