huggingface-accelerate

Community

Simplify distributed PyTorch training.

Author: kwasi-cpu
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill adds distributed training capabilities (multi-GPU, multi-node, DeepSpeed, FSDP) to an existing PyTorch script with minimal code changes.

Core Features & Use Cases

  • Unified API: Supports DeepSpeed, FSDP, DDP, and Megatron with a single interface.
  • Automatic Configuration: Handles device placement, mixed precision (FP16/BF16/FP8), and sharding for you.
  • Quick Prototyping: Speeds up iteration by cutting the boilerplate needed for distributed setups.
  • Use Case: You have a PyTorch script that trains a large language model on a single GPU. With a few changed lines and one launch command, the same script can run across multiple GPUs or multiple machines.

Quick Start

Use the huggingface-accelerate skill to launch your training script 'train.py' on a multi-GPU setup.
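The listing does not say what the Skill runs internally for this request. As a rough sketch, a multi-GPU launch with the standard Accelerate CLI could be assembled as below. The build_launch_command helper is hypothetical. The --multi_gpu, --num_processes, and --mixed_precision flags are existing options of accelerate launch.

```python
import shlex
from typing import List


def build_launch_command(
    script: str, num_processes: int, mixed_precision: str = "no"
) -> List[str]:
    """Build an `accelerate launch` argument list (hypothetical helper)."""
    command = ["accelerate", "launch"]
    if num_processes > 1:
        # One process per GPU; only request multi-GPU mode when needed.
        command.append("--multi_gpu")
    command += [
        "--num_processes", str(num_processes),
        "--mixed_precision", mixed_precision,
        script,
    ]
    return command


command = build_launch_command("train.py", num_processes=4, mixed_precision="bf16")
print(shlex.join(command))
# prints: accelerate launch --multi_gpu --num_processes 4 --mixed_precision bf16 train.py
```

The resulting list can be passed to subprocess.run(command, check=True) on a machine where accelerate is installed. The process count of 4 is an assumed example value and should match the number of available GPUs.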

Dependency Matrix

Required Modules

accelerate, torch, transformers

Components

scripts, references, assets

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: huggingface-accelerate
Download link: https://github.com/kwasi-cpu/hermes-agent/archive/main.zip#huggingface-accelerate

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
