verl-agent-training
Community
One command trains a domain PPO agent end-to-end.
Software Engineering
#agent training #qwen #huggingface #verl #dataset conversion #ppo training #gpu automation
Author: Codekiing
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
It eliminates the manual, multi-step setup required to train a domain-specific PPO agent by automating environment preparation, dataset acquisition/conversion, and the VERL PPO training run with guardrails.
Core Features & Use Cases
- End-to-end VERL PPO training pipeline: Clones VERL, installs a fixed compatible dependency set, prepares dataset parquet files, then launches python3 -m verl.trainer.main_ppo.
- Interactive dataset selection by agent type: Uses your provided AGENT_TYPE to search and choose an appropriate Hugging Face dataset, then converts it into VERL-ready train.parquet and test.parquet.
- Crash recovery via "immediate repair" rules: Provides deterministic fallback steps for common failures, including OOM mitigation by reducing response length and micro-batch sizes.
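The OOM mitigation rule in the last bullet can be pictured as a small retry loop. The sketch below is only an illustration of that idea, not the Skill's actual repair logic: the MemoryKnobs class, the halving policy, the floor values, and the use of MemoryError as the OOM signal are all assumptions made for the example.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class MemoryKnobs:
    """Hypothetical holder for the two settings the OOM rule reduces."""
    max_response_length: int
    micro_batch_size: int


def _halve(value: int, floor: int) -> int:
    # Halve the value, never going below the floor and never increasing it.
    return min(value, max(value // 2, floor))


def shrink_after_oom(knobs: MemoryKnobs,
                     min_response_length: int = 256,
                     min_micro_batch: int = 1) -> Optional[MemoryKnobs]:
    """Return a smaller configuration, or None when nothing can shrink further."""
    shrunk = replace(
        knobs,
        max_response_length=_halve(knobs.max_response_length, min_response_length),
        micro_batch_size=_halve(knobs.micro_batch_size, min_micro_batch),
    )
    return None if shrunk == knobs else shrunk


def run_with_oom_fallback(launch: Callable[[MemoryKnobs], None],
                          knobs: MemoryKnobs,
                          max_attempts: int = 4) -> MemoryKnobs:
    """Call launch(knobs); on MemoryError, shrink the knobs and try again."""
    for _ in range(max_attempts):
        try:
            launch(knobs)
            return knobs  # the configuration that finally fit in memory
        except MemoryError:
            smaller = shrink_after_oom(knobs)
            if smaller is None:
                raise  # already at the floors, so give up deterministically
            knobs = smaller
    raise RuntimeError("training still out of memory after all retry attempts")
```

Here launch stands in for whatever starts the training run. Because the reductions follow a fixed schedule, the same failure always leads to the same next configuration, which is what makes the fallback deterministic.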
Quick Start
Run this Skill in a blank directory: save the provided script as run_verl_skill.sh, set AGENT_TYPE to your desired domain (e.g., math), then execute chmod +x run_verl_skill.sh && bash run_verl_skill.sh.
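Before launching, it can help to confirm the two preconditions the Quick Start names: a blank working directory and a set AGENT_TYPE. The following is a minimal sketch of such a check. The preflight function is hypothetical and not part of the Skill; only the script name and the AGENT_TYPE variable come from the text above.

```python
import os
from pathlib import Path
from typing import List, Mapping


def preflight(workdir, env: Mapping[str, str],
              script_name: str = "run_verl_skill.sh") -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    workdir = Path(workdir)
    problems = []

    # AGENT_TYPE must be set to a non-empty domain such as "math".
    if not env.get("AGENT_TYPE", "").strip():
        problems.append("AGENT_TYPE is not set")

    # The directory should contain nothing except the saved script.
    extras = sorted(p.name for p in workdir.iterdir() if p.name != script_name)
    if extras:
        problems.append("directory is not blank: " + ", ".join(extras))

    # The script itself has to be saved before it can be executed.
    if not (workdir / script_name).is_file():
        problems.append(f"{script_name} not found")

    return problems


if __name__ == "__main__":
    for problem in preflight(Path.cwd(), os.environ):
        print("preflight:", problem)
```

Running the file from the intended directory prints one line per unmet precondition and nothing when the directory is ready.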
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: verl-agent-training
Download link: https://github.com/Codekiing/Auto_VeRL_PPO_Skill/archive/main.zip#verl-agent-training
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.