[docs] Add GenRL asset preparation recipe #1456
Code Review
This pull request introduces a helper script prepare_genrl_assets.py to download and validate GenRL prompts and VideoReward checkpoints, updates the corresponding training configuration with setup instructions, and adds robust file existence and Git LFS pointer checks to the dataset loader. The review feedback suggests enhancing the asset preparation script by handling Git LFS command failures, improving prompt validation to prevent crashes on malformed JSON, allowing checkpoint detection in the root directory, and resolving a repository ID inconsistency for the VideoReward model.
prompt_count = 0
with path.open(encoding="utf-8") as f:
    for line_no, raw_line in enumerate(f, start=1):
        line = raw_line.strip()
        if not line:
            continue
        if (
            prompt_count == 0
            and line_no == 1
            and line.startswith("version https://git-lfs.github.com")
        ):
            raise RuntimeError(
                f"{path} is a Git LFS pointer, not real prompt JSON. "
                "Install git-lfs and rerun this script."
            )
        item = json.loads(line)
        if item.get("prompt"):
            prompt_count += 1
The prompt validation logic is prone to crashing or bypassing the Git LFS check if there are leading empty lines in the JSONL file. Additionally, if a line contains malformed JSON or a non-dictionary JSON value, json.loads or item.get will raise unhandled exceptions (JSONDecodeError or AttributeError) and crash the script with a cryptic traceback.
Using a saw_content flag (similar to the dataset loader) and wrapping the JSON parsing in a try-except block with explicit type checks makes the validation much more robust and user-friendly.
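The bypass described here can be reproduced in isolation. The sketch below mirrors the check under review in a standalone function; the names `count_prompts_original`, `failure_kind`, and `POINTER` are hypothetical and exist only for this illustration, and the function takes an iterable of lines instead of a file path so that it runs without any assets on disk.

```python
import json


def count_prompts_original(lines):
    """Mirror of the check under review, applied to an iterable of text lines."""
    prompt_count = 0
    for line_no, raw_line in enumerate(lines, start=1):
        line = raw_line.strip()
        if not line:
            continue
        if (
            prompt_count == 0
            and line_no == 1
            and line.startswith("version https://git-lfs.github.com")
        ):
            raise RuntimeError("Git LFS pointer, not real prompt JSON.")
        item = json.loads(line)
        if item.get("prompt"):
            prompt_count += 1
    return prompt_count


# First line of a Git LFS pointer file.
POINTER = "version https://git-lfs.github.com/spec/v1\n"


def failure_kind(lines):
    """Return the name of the exception raised for `lines`, or None on success."""
    try:
        count_prompts_original(lines)
    except Exception as exc:
        return type(exc).__name__
    return None


# A pointer on line 1 is reported clearly as a RuntimeError.
print(failure_kind([POINTER]))
# One leading blank line moves the pointer to line 2, so the check is skipped
# and json.loads fails on the pointer text instead.
print(failure_kind(["\n", POINTER]))
# A valid JSON value that is not an object has no .get method.
print(failure_kind(["[1, 2]\n"]))
```

Only the first case produces the intended message; the other two surface as `JSONDecodeError` and `AttributeError`, which are the two unhandled exceptions named in the comment.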
prompt_count = 0
saw_content = False
with path.open(encoding="utf-8") as f:
    for line_no, raw_line in enumerate(f, start=1):
        line = raw_line.strip()
        if not line:
            continue
        if not saw_content and line.startswith("version https://git-lfs.github.com"):
            raise RuntimeError(
                f"{path} is a Git LFS pointer, not real prompt JSON. "
                "Install git-lfs and rerun this script."
            )
        saw_content = True
        try:
            item = json.loads(line)
        except json.JSONDecodeError as e:
            raise RuntimeError(
                f"Malformed JSON on line {line_no} in {path}: {e}"
            ) from e
        if not isinstance(item, dict):
            raise RuntimeError(
                f"Expected a JSON object (dict) on line {line_no} in {path}, got {type(item).__name__}."
            )
        if item.get("prompt"):
            prompt_count += 1

if (genrl_dir / ".git").exists():
    _run(["git", "lfs", "install"], cwd=genrl_dir)
    _run(
        [
            "git",
            "lfs",
            "pull",
            "-I",
            "datasets/filtered_prompts/*",
        ],
        cwd=genrl_dir,
    )
If git-lfs is not installed or configured on the system, running git lfs install or git lfs pull will fail with a generic subprocess.CalledProcessError and a cryptic traceback.
Wrapping these calls in a try-except block allows us to catch the error and provide a clear, actionable message to the user.
if (genrl_dir / ".git").exists():
    try:
        _run(["git", "lfs", "install"], cwd=genrl_dir)
        _run(
            [
                "git",
                "lfs",
                "pull",
                "-I",
                "datasets/filtered_prompts/*",
            ],
            cwd=genrl_dir,
        )
    except subprocess.CalledProcessError as exc:
        raise RuntimeError(
            "Failed to pull Git LFS assets. Ensure that `git-lfs` is installed "
            "on your system and available in your PATH."
        ) from exc
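For reference, a self-contained sketch of the same idea follows. It assumes that the script's `_run` helper behaves like `subprocess.run(..., check=True)`, which is not shown in this PR excerpt; the function name `pull_lfs_prompts` and the injectable `run` parameter are hypothetical and only there so the error path can be exercised without Git installed. It also catches `FileNotFoundError`, which `subprocess.run` raises when the `git` executable itself is missing.

```python
import subprocess
from pathlib import Path


def pull_lfs_prompts(repo_dir: Path, run=subprocess.run) -> bool:
    """Pull the filtered prompt files via Git LFS.

    Returns False when repo_dir is not a Git checkout and True after a
    successful pull. `run` defaults to subprocess.run and is injectable.
    """
    if not (repo_dir / ".git").exists():
        return False
    commands = [
        ["git", "lfs", "install"],
        ["git", "lfs", "pull", "-I", "datasets/filtered_prompts/*"],
    ]
    for cmd in commands:
        try:
            run(cmd, cwd=repo_dir, check=True)
        except FileNotFoundError as exc:
            # The `git` executable itself could not be found.
            raise RuntimeError("`git` was not found in PATH.") from exc
        except subprocess.CalledProcessError as exc:
            # Non-zero exit, e.g. when the `lfs` subcommand is unavailable.
            raise RuntimeError(
                f"`{' '.join(cmd)}` failed with exit code {exc.returncode}. "
                "Ensure that `git-lfs` is installed and available in your PATH."
            ) from exc
    return True
```

Including the failing command and its exit code in the message is a design choice of this sketch, not something the suggestion requires; it helps distinguish a missing `git-lfs` from a failed download.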
def has_video_reward_checkpoint(root: Path) -> bool:
    model_config = root / "model_config.json"
    if not model_config.exists():
        return False
    for checkpoint in root.glob("checkpoint-*"):
        if (checkpoint / "model.pth").exists():
            return True
        if (
            (checkpoint / "adapter_model.safetensors").exists()
            and (checkpoint / "non_lora_state_dict.pth").exists()
        ):
            return True
    return False
The has_video_reward_checkpoint function only checks for checkpoints nested inside a checkpoint-* subdirectory. If a user manually downloads or extracts the checkpoint directly into the root directory (so that model.pth or adapter_model.safetensors is in the root), it will not be detected.
Adding a fallback check for the root directory directly makes the checkpoint detection much more robust.
def has_video_reward_checkpoint(root: Path) -> bool:
    model_config = root / "model_config.json"
    if not model_config.exists():
        return False
    if (root / "model.pth").exists():
        return True
    if (
        (root / "adapter_model.safetensors").exists()
        and (root / "non_lora_state_dict.pth").exists()
    ):
        return True
    for checkpoint in root.glob("checkpoint-*"):
        if (checkpoint / "model.pth").exists():
            return True
        if (
            (checkpoint / "adapter_model.safetensors").exists()
            and (checkpoint / "non_lora_state_dict.pth").exists()
        ):
            return True
    return False
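The suggestion repeats the same weight-file test for the root and for each `checkpoint-*` directory. One possible way to avoid the duplication, shown here only as a sketch, is to factor the test into a helper; `_has_weights` is a hypothetical name that does not appear in the PR, and the file names are taken from the snippet above.

```python
from pathlib import Path


def _has_weights(directory: Path) -> bool:
    """True if `directory` holds full weights or a LoRA adapter plus non-LoRA state."""
    if (directory / "model.pth").exists():
        return True
    return (
        (directory / "adapter_model.safetensors").exists()
        and (directory / "non_lora_state_dict.pth").exists()
    )


def has_video_reward_checkpoint(root: Path) -> bool:
    """Check the root first, then every checkpoint-* entry beneath it."""
    if not (root / "model_config.json").exists():
        return False
    candidates = [root, *sorted(root.glob("checkpoint-*"))]
    return any(_has_weights(candidate) for candidate in candidates)
```

The behavior is intended to match the suggested version: `model_config.json` is still required in the root, and a LoRA checkpoint still needs both files in the same directory.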
Purpose
Adds a reproducible GenRL HPSv3 + VideoAlign training recipe for Wan 2.1 T2V 1.3B on the modular fastvideo/train stack.
This PR makes the GenRL run reproducible without relying on a local Modal launcher or git submodules. Runtime assets are prepared through a public helper script, reward dependencies are pinned, and the vendored HPSv3/VideoAlign runtime code is aligned with the reward implementations used by the previous successful GenRL run.
Fixes: N/A
Changes
- examples/train/prepare_genrl_assets.py to prepare:
  - KwaiVGI/VideoReward checkpoint under .cache/VideoReward
  - --check-rewards
- examples/train/requirements-genrl.txt with the GenRL reward-stack dependency pins.
- examples/train/configs/rl/wan/genrl_hpsv3_videoalign.yaml.
- examples/train/README.md with a non-Modal reproduction path.
- a2eb2ef2c7b5d91a566347a5825cf6d872122149
- aba26b658fec7d9fd30c295187b548ea673c8769
- modal_train_genrl.py local-only and ignored; it is not part of this PR.
Reproduction / Training Command
Install the GenRL reward-stack pins after the editable FastVideo install:
Prepare prompts and reward checkpoints:
Launch the 4xGPU GenRL HPSv3 + VideoAlign training run:
For the 41-step reproduction probe:
Verification
Reward parity was checked against the previous successful GenRL run source state:
- 17aecbe2dd07245333a1c0ea85f89b2b7a4a1f88
- a2eb2ef2c7b5d91a566347a5825cf6d872122149
- aba26b658fec7d9fd30c295187b548ea673c8769
- checkpoint-11352
Fixed-video reward parity after syncing the vendored runtime:
Training reproduction run after the reward-runtime fix showed reward curves recovering compared with the earlier bad run.
Test Plan
Test Results
pre-commit was not available in the local shell or the fastvideo conda env, so I could not run the full hook set locally.
Checklist
pre-commit run --all-files and fixed all issues