Vime


Vime is an LLM post-training framework for RL scaling, built on slime. It keeps slime's training stack and data-generation design while using vLLM (with vllm-router) as the default rollout backend. Vime provides two core capabilities:

  1. High-performance training: Efficient training in various modes by connecting Megatron with vLLM;
  2. Flexible data generation: Arbitrary training data generation workflows through custom data generation interfaces and server-based engines.

Vime inherits broad model support from slime, including:

  • Qwen series (Qwen3.6, Qwen3.5, Qwen3Next, Qwen3MoE, Qwen3, Qwen2.5);
  • DeepSeek V3 series (DeepSeek V3, V3.1, DeepSeek R1);
  • Llama 3.

Positioning

The vLLM community supports many LLM post-training frameworks, including (in alphabetical order) NeMo RL, OpenRLHF, prime-rl, SkyRL, and verl. We built Vime to bring slime's proven training paradigm into the vLLM ecosystem, offering a production-ready bridge that tracks both projects' rapid release cycles. We hope that users with different needs can find the right vLLM-ecosystem choice for their workflows. The vLLM community will continue to support vLLM integration in these post-training frameworks.

Architecture Overview

[Figure: architecture diagram]

Module Descriptions:

  • training (Megatron): Runs the main training process, reads data from the Data Buffer, and synchronizes parameters to the rollout module after training.
  • rollout (vLLM + router): Launches vLLM inference engines and routes generation requests; produces new data (including rewards/verifier outputs) and stores it in the Data Buffer.
  • data buffer: A bridge module that manages prompt initialization, custom data, and rollout generation methods.
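
The interaction between the three modules can be illustrated with a toy loop. This is a minimal sketch, not Vime's implementation: the classes `DataBuffer`, `Rollout`, and `Trainer` and all of their methods are hypothetical stand-ins, and the "model" is reduced to a version counter so that the parameter synchronization step is visible in the output.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Sample:
    prompt: str
    response: str
    reward: float


class DataBuffer:
    """Hypothetical stand-in for the data buffer: holds prompts and rollout samples."""

    def __init__(self, prompts):
        self.prompts = deque(prompts)
        self.samples = []

    def next_prompts(self, n):
        # Hand out up to n prompts, cycling them so the toy loop never runs dry.
        batch = []
        for _ in range(min(n, len(self.prompts))):
            prompt = self.prompts.popleft()
            batch.append(prompt)
            self.prompts.append(prompt)
        return batch

    def add(self, samples):
        self.samples.extend(samples)

    def drain(self):
        out, self.samples = self.samples, []
        return out


class Rollout:
    """Hypothetical stand-in for the rollout module (vLLM + router in Vime)."""

    def __init__(self):
        self.weights_version = 0

    def generate(self, prompts):
        # A real engine would sample from the model. Here the response is tagged
        # with the weights version, and the reward is a dummy value.
        return [Sample(p, f"{p}-v{self.weights_version}", float(len(p))) for p in prompts]

    def update_weights(self, version):
        self.weights_version = version


class Trainer:
    """Hypothetical stand-in for the training module (Megatron in Vime)."""

    def __init__(self):
        self.version = 0

    def train_step(self, samples):
        # Pretend to take one optimizer step whenever the batch is non-empty.
        if samples:
            self.version += 1
        return self.version


def run(num_iterations, batch_size=2):
    buffer = DataBuffer(["a", "bb", "ccc"])
    rollout, trainer = Rollout(), Trainer()
    history = []
    for _ in range(num_iterations):
        # rollout -> data buffer: generate new samples and store them
        buffer.add(rollout.generate(buffer.next_prompts(batch_size)))
        # data buffer -> training: read the stored samples
        batch = buffer.drain()
        version = trainer.train_step(batch)
        # training -> rollout: synchronize parameters after training
        rollout.update_weights(version)
        history.append([s.response for s in batch])
    return history
```

Calling `run(2)` shows the second iteration's responses carrying the updated version tag, which mirrors the "synchronize parameters to the rollout module after training" step described above.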

Quick Start

For a comprehensive quick start guide covering environment setup, data preparation, training startup, and key code analysis, please refer to:

We also provide examples for some use cases not covered in the quick start guide; please check examples.

Arguments Walkthrough

Arguments in Vime are divided into three categories:

  1. Megatron arguments: Vime accepts all Megatron arguments. You can configure Megatron by passing arguments such as --tensor-model-parallel-size 2.
  2. vLLM arguments: vLLM server and engine options are exposed with a --vllm- prefix (for example, --vllm-gpu-memory-utilization). Router options live under two prefixes: vllm-router's native options are passed with --router- (for example, --router-policy round_robin, --router-request-timeout-secs), while Vime-side orchestration knobs that tell Vime where the router lives use --vllm-router- (--vllm-router-ip, --vllm-router-port). See vime/backends/vllm_utils/arguments.py for the full surface.
  3. Framework-specific arguments: Shared Vime orchestration flags (rollout GPUs, data paths, RL algorithms, etc.). Please refer to vime/utils/arguments.py.
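
The prefix scheme above has one subtlety: `--vllm-router-` also starts with `--vllm-`, so any prefix-based reading of a flag has to test the longer prefix first. The helper below is only an illustrative sketch for readers. `classify_flag` and its return labels are hypothetical and are not part of Vime, whose actual parsing lives in the argument modules named above. Megatron and framework-specific flags share no distinguishing prefix, so the sketch groups them together.

```python
def classify_flag(flag: str) -> str:
    """Classify a CLI flag by the prefix conventions described in this README.

    Hypothetical helper for illustration only; not Vime's parser.
    """
    # Test the longer prefix first: "--vllm-router-" also matches "--vllm-".
    if flag.startswith("--vllm-router-"):
        return "vime-side router orchestration"
    if flag.startswith("--vllm-"):
        return "vllm server/engine"
    if flag.startswith("--router-"):
        return "vllm-router native"
    # No prefix separates Megatron flags from framework-specific ones.
    return "megatron or framework"


if __name__ == "__main__":
    for flag in (
        "--tensor-model-parallel-size",
        "--vllm-gpu-memory-utilization",
        "--router-policy",
        "--vllm-router-ip",
    ):
        print(f"{flag}: {classify_flag(flag)}")
```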

--rollout-num-gpus-per-engine sets the tensor parallel size of each vLLM engine. The default rollout entry is vime.rollout.vllm_rollout.generate_rollout.
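
The default rollout entry is written as a dotted path to a function. A common way to turn such a path into a callable is to split off the last component and import the rest as a module. The sketch below shows that generic technique with `importlib`; `load_function` is a hypothetical name, and Vime's own loader may work differently.

```python
import importlib


def load_function(dotted_path: str):
    """Resolve 'package.module.attr' to the attribute it names.

    Generic sketch of dotted-path resolution; not taken from Vime's source.
    """
    module_name, _, attr_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.attr', got {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


if __name__ == "__main__":
    # Demonstrated with a standard-library function so the example runs anywhere.
    sqrt = load_function("math.sqrt")
    print(sqrt(9.0))
```

With Vime installed, the same helper could in principle be pointed at `vime.rollout.vllm_rollout.generate_rollout`, or at a custom function that replaces it.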

For complete usage instructions, please refer to the Usage Documentation.

Developer Guide

  • Contributions are welcome! If you have suggestions for new features, performance tuning, or feedback on user experience, feel free to submit an Issue or PR.

  • Use pre-commit to ensure code style consistency for your commits:

apt install pre-commit -y
pre-commit install

# run pre-commit to ensure code style consistency
pre-commit run --all-files --show-diff-on-failure --color=always

slime doc

Vime is derived from slime. The following upstream resources and in-repo guides still use the slime naming and remain the reference for shared concepts (Megatron integration, customization, advanced topics):

Documentation Ask DeepWiki

FAQ

For frequently asked questions, please see the Q&A.

Acknowledgements

Vime builds on ideas and infrastructure from the open-source RL ecosystem. We especially thank the slime community, whose great work Vime is directly built on. We also thank SkyRL and verl, whose excellent work we referenced. Vime is maintained by the vLLM community.

Citation

@misc{vime,
  author       = {Vime Contributors},
  title        = {Vime: An LLM post-training framework with vLLM for RL Scaling},
  year         = {2026},
  howpublished = {\url{https://github.com/vllm-project/vime}},
  urldate      = {2026-06}
}
