OpenMOSS/FRoM-W1

FRoM-W1: Towards General Humanoid Whole-Body Control with Language Instructions

Core Contributors: Peng Li, Zihan Zhuang, Yangfan Gao, Yi Dong, Sixian Li, Changhao Jiang, Tao Gui, Xipeng Qiu

The Humanoid Intelligence Team from FudanNLP and OpenMOSS

Project Webpage | Paper on arXiv | GitHub Code | Hugging Face Data | Hugging Face Model | License

📌 Status: Research release. The initial codebase, model checkpoints, datasets, and deployment framework are fully open-source. More powerful models and improved training recipes are under development. Contributions, issues, and PRs are welcome!

🔥 Introduction

For more information, refer to our project page and technical report.

Humanoid robots can perform diverse actions (greeting, dancing, backflipping), but these motions are typically hard-coded or task-specific. FRoM-W1 is an open-source framework for general humanoid whole-body motion control using natural language, operating in two stages:

  1. H-GPT: A language-driven whole-body motion generation model trained on large-scale human motion data. Uses Chain-of-Thought (CoT) prompting to improve instruction understanding and generalization.

  2. H-ACT: Retargets generated human motions into robot-specific actions, trains motion tracking policies via RL in simulation, and deploys them on real robots through a modular sim-to-real framework.

We evaluate FRoM-W1 on Unitree H1 and G1 robots. Results show strong performance on the HumanML3D-X benchmark for whole-body motion generation, and RL fine-tuning consistently improves both tracking accuracy and task success rates.

📑 Roadmap

  • 🎉 H-GPT and H-ACT module codebases (H-GPT, H-ACT)
  • 🎉 Sim-to-real deployment framework RoboJuDo
  • CoT datasets (HumanML3D-X, Motion-X) and δHumanML3D-X benchmark
  • SMPL-X baselines and eval model checkpoints (T2M, MotionDiffuse, MLD, T2M-GPT)
  • 🎉 Technical Report and Project Page
  • More powerful models (in progress)

💾 Datasets

Due to license restrictions, we cannot publicly share all data. Below are download and processing references.

H-GPT Module

| Dataset | Download Guide |
| --- | --- |
| HumanML3D | Original HumanML3D repo (backup link) |
| KIT-ML | Original KIT-ML repo (backup link) |
| Motion-X | Original Motion-X repo; processing guide HERE |
| HumanML3D-X | Process via the Motion-X repo + this guide. Uses original HumanML3D split with re-calculated mean/std. CoT data on HuggingFace. |
| δHumanML3D-X | Same as HumanML3D-X, with perturbed instruction variants on HuggingFace. |
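
The "re-calculated mean/std" step for HumanML3D-X can be pictured as per-feature statistics taken over every frame of the training clips. The sketch below is a minimal NumPy illustration, not the repository's processing script: the helper names `compute_mean_std` and `normalize`, the `(frames, features)` array layout, and the `eps` floor for constant features are all assumptions.

```python
import numpy as np

def compute_mean_std(clips, eps=1e-8):
    """Per-feature mean and std over all frames of all clips (hypothetical helper)."""
    frames = np.concatenate(clips, axis=0)      # (total_frames, features)
    mean = frames.mean(axis=0)
    std = np.maximum(frames.std(axis=0), eps)   # floor avoids division by zero
    return mean, std

def normalize(clip, mean, std):
    """Z-score one clip with dataset-level statistics."""
    return (clip - mean) / std

# Toy data: two clips with two features each; the second feature is constant.
clips = [np.array([[0.0, 2.0], [2.0, 2.0]]), np.array([[4.0, 2.0]])]
mean, std = compute_mean_std(clips)
normalized = [normalize(c, mean, std) for c in clips]
```

Statistics of this kind would typically be written with `np.save` so that training and evaluation share the same normalization; whether `Mean.npy` and `Std.npy` are produced exactly this way is not stated here.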

Expected structure for each dataset:

H-GPT/datasets/{dataset_name}/data/
├── new_joint_vecs/
├── new_joints/
├── texts/
├── cots/
├── Mean.npy
├── Std.npy
├── all.txt
├── train.txt
├── train_val.txt
├── val.txt
└── test.txt
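
Before training, it can help to confirm that a dataset folder matches the layout above. The following is a minimal sketch using only the standard library; the function name `missing_entries` is hypothetical and not part of the repository, and the entry names are copied from the tree above.

```python
from pathlib import Path

# Entry names taken from the expected layout listed above.
EXPECTED_DIRS = ["new_joint_vecs", "new_joints", "texts", "cots"]
EXPECTED_FILES = [
    "Mean.npy", "Std.npy", "all.txt",
    "train.txt", "train_val.txt", "val.txt", "test.txt",
]

def missing_entries(data_dir):
    """Return the expected entries that are absent from data_dir (hypothetical helper)."""
    root = Path(data_dir)
    missing = [d + "/" for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

Calling `missing_entries` on a `H-GPT/datasets/{dataset_name}/data/` path returns an empty list when the folder is complete, and otherwise the names that still need to be prepared.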

H-ACT Module

| Dataset | Download Guide |
| --- | --- |
| AMASS | Download and processing procedures from human2humanoid |
| AMASS-H1 | Retargeted for Unitree H1; box link (from human2humanoid) |
| AMASS-G1 | Retargeted for Unitree G1; link coming soon |

πŸ“ Baselines

We retrained these SMPL-X baseline models and fully open-sourced them:

SMPL-X Baseline Codebases (forked repos):

Checkpoints (HuggingFace):

  • Eval model · T2M · MotionDiffuse · MLD · T2M-GPT (all SMPL-X format)

🧠 Models

H-GPT

| Model | Download |
| --- | --- |
| H-GPT w.o. CoT | LoRA weights; merge with Llama-3.1 via this script |
| H-GPT | LoRA weights; merge with Llama-3.1 |
| H-GPT++ w.o. CoT | LoRA weights; merge with Llama-3.1 |
| H-GPT++ | LoRA weights; merge with Llama-3.1 |
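
For readers unfamiliar with what "merging" LoRA weights means, the sketch below shows the underlying arithmetic for a single linear layer in NumPy. It is an illustration of the standard LoRA formulation only, not the repository's merge script; the function name `merge_lora`, the toy shapes, and the `alpha` and `rank` values are assumptions.

```python
import numpy as np

def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Fold a low-rank update into a base weight: W + (alpha / rank) * B @ A."""
    return weight + (alpha / rank) * (lora_b @ lora_a)

rng = np.random.default_rng(0)
d_out, d_in, rank, alpha = 6, 4, 2, 4.0          # toy sizes, chosen for illustration
W = rng.normal(size=(d_out, d_in))               # frozen base weight
A = rng.normal(size=(rank, d_in))                # LoRA down-projection
B = rng.normal(size=(d_out, rank))               # LoRA up-projection
x = rng.normal(size=(d_in,))

merged = merge_lora(W, A, B, alpha, rank)
# Output of the unmerged layer: base path plus scaled adapter path.
adapter_out = W @ x + (alpha / rank) * (B @ (A @ x))
assert np.allclose(merged @ x, adapter_out)
```

After merging, the adapter matrices are no longer needed at inference time, because the merged weight reproduces the adapter's output exactly.
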

H-ACT

| Policy | Download |
| --- | --- |
| H1-Full | Teacher (TBD), Student |
| H1-Clean | Teacher (TBD), Student |
| G1-Full | Teacher (TBD), Student |
| G1-Clean | Teacher (TBD), Student |

πŸ—οΈ Repository Structure

FRoM-W1/
β”œβ”€β”€ H-GPT/                         # Motion generation module
β”‚   β”œβ”€β”€ hGPT/                      #  Core package (models, data, metrics, losses)
β”‚   β”œβ”€β”€ configs/                   #  OmegaConf YAML configs (exp + arch)
β”‚   β”œβ”€β”€ scripts/                   #  Inference entry points
β”‚   └── motionx_processing.md      #  Dataset preparation guide
β”œβ”€β”€ H-ACT/                         # Action execution module
β”‚   β”œβ”€β”€ retarget/                  #  SMPL-X β†’ robot joint retargeting (submodule)
β”‚   β”œβ”€β”€ human2humanoid/            #  RL policy training framework (submodule)
β”‚   └── RoboJuDo/                  #  Sim-to-real deployment (submodule)
β”œβ”€β”€ assets/                        #  Images and media
β”œβ”€β”€ QUICKSTART.md                  #  Step-by-step setup guide
β”œβ”€β”€ requirements.txt
β”œβ”€β”€ LICENSE                        #  Apache 2.0
└── README.md

🚀 Quick Start

The QUICKSTART.md guide walks through the full pipeline:

Text Instruction → H-GPT (motion generation) → Retarget (SMPL-X → robot joints)
 → Policy (RL training) → RoboJuDo (sim-to-real deployment) → Real Robot

Minimal inference

# 1. Setup
conda create -n fromw1 python=3.10
conda activate fromw1
pip install -r requirements.txt

# 2. Generate whole-body motion from text (H-GPT)
cd H-GPT
CUDA_VISIBLE_DEVICES=0 python -m scripts.demo \
  --cfg_assets ./configs/assets.yaml \
  --cfg configs/exp/1217_config_motionx_stage2_body_hands_llama_vqvae2kx1k_cotv3_t2mx.yaml \
  --task t2m \
  --example ./scripts/instructions.txt

# 3. Visualize
python -m hGPT.data.motionx.visualization.plot_3d_global \
  --path ./results/<result_folder>

# 4. Retarget to robot joints (H-ACT)
cd ../H-ACT/retarget
python main.py

For dataset preparation, model downloads, deps folder setup, and full deployment, follow QUICKSTART.md.

πŸ› οΈ Model Training and Evaluation

H-GPT

Three training stages controlled by the TRAIN.STAGE config field:

| Stage | TRAIN.STAGE | Description |
| --- | --- | --- |
| VQ-VAE | "vae" | Train whole-body motion tokenizer (convolutional encoder/decoder + vector quantization) |
| LM Pretrain | "lm_pretrain" | Finetune Llama-3.1-8B via LoRA to generate motion tokens (VQ-VAE frozen) |
| LM Instruct | "lm_instruct" | Instruction-tune with Chain-of-Thought data |

See the H-GPT README for detailed training commands and evaluation protocols.
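
To make the "vector quantization" part of the VQ-VAE stage concrete, the sketch below shows a nearest-codebook lookup in NumPy: each encoder latent is replaced by its closest codebook entry, and the entry's index serves as a discrete motion token. This is a generic illustration under assumed shapes, not the H-GPT tokenizer; the function name `quantize` and the toy codebook are hypothetical.

```python
import numpy as np

def quantize(latents, codebook):
    """Map each latent (T, D) to its nearest codebook row (K, D).

    Returns the token indices (T,) and the quantized latents (T, D).
    """
    # Squared Euclidean distance between every latent and every code: (T, K).
    dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)
    return indices, codebook[indices]

codebook = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]])   # K=3 codes, D=2
latents = np.array([[0.1, -0.2], [0.9, 1.2], [3.5, 0.4]])   # T=3 frames
tokens, quantized = quantize(latents, codebook)
print(tokens.tolist())  # [0, 1, 2]
```

In a setup like the one in the table above, index sequences of this kind are what a language model would be trained to generate while the tokenizer stays frozen.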

H-ACT

  • human2humanoid: RL-based motion tracking (primary framework)
  • Beyondmimic: CSV-formatted motion data required; convert with retarget/scripts/pkl_2_csv.py
  • TWIST: Alternative tracking strategy
  • RoboJuDo: Unified sim-to-real deployment with pretrained policies

πŸ™ Acknowledgements

We thank Biao Jiang for discussions on motion generation models, and Tairan He and Ziwen Zhuang for their help in motion tracking. We are grateful to all the open-source datasets and projects that made this work possible.

📄 Citation

If you find this work useful, please star ⭐ the repo and cite:

@article{DBLP:journals/corr/abs-2601-12799,
  author       = {Peng Li and
                  Zihan Zhuang and
                  Yangfan Gao and
                  Yi Dong and
                  Sixian Li and
                  Changhao Jiang and
                  Shihan Dou and
                  Zhiheng Xi and
                  Enyu Zhou and
                  Jixuan Huang and
                  Hui Li and
                  Jingjing Gong and
                  Xingjun Ma and
                  Tao Gui and
                  Zuxuan Wu and
                  Qi Zhang and
                  Xuanjing Huang and
                  Yu{-}Gang Jiang and
                  Xipeng Qiu},
  title        = {FRoM-W1: Towards General Humanoid Whole-Body Control with Language
                  Instructions},
  journal      = {CoRR},
  volume       = {abs/2601.12799},
  year         = {2026},
  url          = {https://doi.org/10.48550/arXiv.2601.12799},
  doi          = {10.48550/ARXIV.2601.12799},
  eprinttype   = {arXiv},
  eprint       = {2601.12799},
  timestamp    = {Tue, 24 Mar 2026 08:45:06 +0100},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2601-12799.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}
