bump-base-image

Community

Bump Megatron CI’s NGC PyTorch image safely

Author: yo-steven
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Prevents CI breakage when upgrading Megatron-LM’s nvcr.io/nvidia/pytorch:<YY.MM>-py3 container. It ensures the GitHub and GitLab Docker pins are updated together and guides remediation after the bump.

Core Features & Use Cases

  • Dual pin bump (GitHub + GitLab): Updates docker/.ngc_version.dev for GitHub CI and the BASE_IMAGE matrix rows for GitLab CI so both pipelines use the same container tag.
  • Post-bump functional test handling: Recommends re-running functional tests, refreshing golden values when expected drift occurs, and isolating genuine regressions without blocking the bump.
  • Operational gotchas and guardrails: Addresses common failure modes like mismatched pins, hangs/timeouts/OOM after the bump, and per-commit /ok to test <sha> requirements for fork PRs.
  • Use case: You want to upgrade CI from 25.xx-py3 to 26.04-py3, but GitLab fails on main because the GitHub-only pin was updated first.
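The mismatched-pin failure mode described above can be caught with a simple text check before pushing. The following is a minimal sketch, not the skill's own implementation. It assumes that both the GitHub pin file and the GitLab BASE_IMAGE rows contain the full image reference in the form nvcr.io/nvidia/pytorch:<YY.MM>-py3, and that the caller passes in only the GitLab rows that are meant to track the same pin. The function names find_tags and pins_agree are hypothetical.

```python
import re

# Matches the image reference pattern named in this document.
# Assumption: both pin locations spell out the full reference.
IMAGE_RE = re.compile(r"nvcr\.io/nvidia/pytorch:(\d{2}\.\d{2}-py3)")


def find_tags(text: str) -> set[str]:
    """Return every distinct NGC PyTorch tag referenced in a block of text."""
    return set(IMAGE_RE.findall(text))


def pins_agree(github_pin_text: str, gitlab_ci_text: str) -> bool:
    """True only if the GitHub text names exactly one tag and the
    GitLab text names that same tag and no other."""
    github_tags = find_tags(github_pin_text)
    gitlab_tags = find_tags(gitlab_ci_text)
    return len(github_tags) == 1 and github_tags == gitlab_tags


if __name__ == "__main__":
    # Sample strings for illustration only; 24.01-py3 is an arbitrary older tag.
    github = "nvcr.io/nvidia/pytorch:26.04-py3\n"
    gitlab_stale = 'BASE_IMAGE: "nvcr.io/nvidia/pytorch:24.01-py3"\n'
    gitlab_fresh = 'BASE_IMAGE: "nvcr.io/nvidia/pytorch:26.04-py3"\n'
    print(pins_agree(github, gitlab_stale))
    print(pins_agree(github, gitlab_fresh))
```

Comparing sets rather than single strings means several GitLab matrix rows that all carry the same tag still count as one consistent pin, while a single stale row is enough to fail the check.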

Quick Start

Tell the assistant the target NGC tag and whether you’re bumping only dev (e.g., “bump base image to 26.04-py3 for dev”). Then follow its guidance to update both CI pins and run the functional test workflow, using golden-value refresh or broken-scope marking as needed.
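Before handing a tag to the assistant, it can help to confirm that it has the <YY.MM>-py3 shape the container name expects. The sketch below is illustrative only: the helper name image_reference is hypothetical, and the month range check (01 to 12) is an assumption drawn from the YY.MM pattern rather than a rule stated by the skill.

```python
import re

# Two-digit year, two-digit month, then the fixed "-py3" suffix.
TAG_RE = re.compile(r"(\d{2})\.(\d{2})-py3")


def image_reference(tag: str) -> str:
    """Validate a <YY.MM>-py3 tag and return the full image reference."""
    match = TAG_RE.fullmatch(tag)
    if match is None:
        raise ValueError(f"tag {tag!r} does not look like <YY.MM>-py3")
    month = int(match.group(2))
    # Assumption: MM is a calendar month.
    if not 1 <= month <= 12:
        raise ValueError(f"tag {tag!r} has an out-of-range month")
    return f"nvcr.io/nvidia/pytorch:{tag}"


if __name__ == "__main__":
    print(image_reference("26.04-py3"))
```

A malformed tag such as 26.4-py3 raises ValueError instead of producing a reference that would only fail later at image pull time.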

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: bump-base-image
Download link: https://github.com/yo-steven/skills-exploration-20260522/archive/main.zip#bump-base-image

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.