perf-sequence-packing
Official
Optimize sequence training for large language models.
Software Engineering
#sequence #training optimization #memory efficiency #packing #long-context #model fine-tuning
Author: NVIDIA-NeMo
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill helps you validate and apply sequence packing to improve training efficiency and long-context handling in large language models.
Core Features & Use Cases
- Enables offline packed SFT for LLM fine-tuning by configuring packed sequence specifications for stable training with extended context lengths.
- Supports in-batch packing for vision-language model fine-tuning, facilitating improved training throughput.
- Use Case: Adjust sequence lengths and packing parameters to train with longer contexts or to reduce memory usage during model fine-tuning for production environments.
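To make the idea behind packing concrete, here is a minimal sketch of how variable-length samples can be grouped into fixed-size packs with a first-fit-decreasing heuristic. This is an illustration only, not the algorithm this Skill or Megatron-Bridge uses; the names `pack_first_fit_decreasing`, `padding_fraction`, and `pack_size` are hypothetical.

```python
from typing import List


def pack_first_fit_decreasing(lengths: List[int], pack_size: int) -> List[List[int]]:
    """Group sample indices into packs whose total token count is <= pack_size.

    Hypothetical helper for illustration; not part of any NeMo API.
    """
    if pack_size <= 0:
        raise ValueError("pack_size must be positive")
    if any(n > pack_size for n in lengths):
        raise ValueError("a sample is longer than pack_size; truncate or raise pack_size")

    # Visit the longest samples first so short ones can fill leftover space.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)

    packs: List[List[int]] = []   # sample indices stored in each pack
    remaining: List[int] = []     # free token slots left in each pack

    for idx in order:
        n = lengths[idx]
        for p, free in enumerate(remaining):
            if n <= free:
                # First existing pack with enough room takes the sample.
                packs[p].append(idx)
                remaining[p] -= n
                break
        else:
            # No existing pack fits, so open a new one.
            packs.append([idx])
            remaining.append(pack_size - n)
    return packs


def padding_fraction(lengths: List[int], packs: List[List[int]], pack_size: int) -> float:
    """Fraction of token slots across all packs that would be padding."""
    if not packs:
        return 0.0
    return 1.0 - sum(lengths) / (len(packs) * pack_size)


lengths = [5, 3, 2, 7, 1]
packs = pack_first_fit_decreasing(lengths, pack_size=8)
print(packs, padding_fraction(lengths, packs, pack_size=8))
```

With the sample lengths above and a pack size of 8, the five samples fit into three packs and a quarter of the token slots remain padding.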
Quick Start
Configure your training setup to use PackedSequenceSpecs for the sequence length and packing specifications, then run training with these parameters to improve efficiency and handle long-context sequences.
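A packed sample usually needs to carry the boundaries of the sequences it contains, so that attention and position encodings can treat each sequence separately. The sketch below shows one common way to derive that metadata. It is an assumption about what a packed sample looks like, not the layout produced by PackedSequenceSpecs; `packed_metadata`, `cu_seqlens`, and `position_ids` are illustrative names.

```python
from itertools import accumulate
from typing import List, Tuple


def packed_metadata(seq_lens: List[int]) -> Tuple[List[int], List[int]]:
    """Return cumulative boundaries and per-token positions for one pack.

    Hypothetical helper for illustration; not part of any NeMo API.
    """
    # Boundaries: sequence k occupies tokens [cu_seqlens[k], cu_seqlens[k + 1]).
    cu_seqlens = [0] + list(accumulate(seq_lens))
    # Positions restart at 0 for every sequence inside the pack.
    position_ids = [pos for n in seq_lens for pos in range(n)]
    return cu_seqlens, position_ids


cu_seqlens, position_ids = packed_metadata([3, 2])
print(cu_seqlens)    # [0, 3, 5]
print(position_ids)  # [0, 1, 2, 0, 1]
```

Restarting the positions at each boundary keeps a packed sequence equivalent, from each sample's point of view, to training on that sample alone.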
Dependency Matrix
Required Modules
None required
Components
scripts, references, assets
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: perf-sequence-packing
Download link: https://github.com/NVIDIA-NeMo/Megatron-Bridge/archive/main.zip#perf-sequence-packing
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.