nvidia-datacenter-bringup

Official

Bring up NVIDIA datacenter GPUs on Ubuntu 24.04.

Author: air-gapped
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Covers bring-up and validation of NVIDIA datacenter GPU hosts (HGX/DGX) on Ubuntu 24.04 LTS, for both air-gapped and connected installs. It spans the open kernel modules driver path, DOCA-OFED, NVLSM, Fabric Manager, MOK enrollment, and Dell iDRAC firmware prerequisites.

Core Features & Use Cases

  • End-to-end GPU datacenter bring-up: from OS prep through firmware, DOCA-OFED, driver, fabric stack, and GPU-driver validation.
  • Air-gap readiness: three-tier DOCA + CUDA mirroring guidance and sneakernet file:// repos with GPG keys.
  • Fleet health validation: a health-check script confirms that GPUs are visible, that Fabric Manager and NVLSM are active, and reports NVLink fabric status.
  • Dell/B300-specific prerequisites: Dell firmware v1.4.30 baseline, iDRAC ExtendedReset for firmware activation, and Secure Boot MOK enrollment for signed DKMS modules.
  • Interoperation with gpu-operator: support for pre-installed driver mode and CUDA validator triage.
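
To make the sneakernet file:// repo idea concrete, here is a minimal sketch that builds a one-line apt source entry pointing at a local mirror directory and a GPG keyring. It is an illustration, not the skill's own tooling: the function name `file_repo_entry`, the mirror path, and the keyring path are all hypothetical, and the flat-repo suite `./` assumes the mirror directory holds its own `Packages` index.

```python
from pathlib import PurePosixPath


def file_repo_entry(mirror_dir: str, keyring: str, suite: str = "./") -> str:
    """Build a one-line apt source entry for a local file:// mirror.

    mirror_dir and keyring are hypothetical example paths supplied by the caller.
    """
    mirror = PurePosixPath(mirror_dir)
    key = PurePosixPath(keyring)
    # This sketch insists on absolute paths so the file:// URI and the
    # signed-by option are unambiguous.
    if not mirror.is_absolute() or not key.is_absolute():
        raise ValueError("mirror_dir and keyring must be absolute paths")
    # An absolute path starts with "/", so "file://" + path gives "file:///...".
    return f"deb [signed-by={key}] file://{mirror} {suite}"


if __name__ == "__main__":
    # Example paths are assumptions, not locations used by the skill.
    print(file_repo_entry("/srv/mirror/doca", "/usr/share/keyrings/doca.gpg"))
```

The resulting line would go into a file under /etc/apt/sources.list.d/ on the air-gapped host, followed by `apt-get update`.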

Quick Start

Follow the 10-step recipe to bring up a Dell HGX B300/B200 datacenter host and verify with the included health-check script.
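
The included health-check script is not reproduced on this page. As a rough idea of what such a verification step might look like, the sketch below checks GPU visibility with `nvidia-smi -L` and service state with `systemctl is-active`. The helper names (`count_gpus`, `run_command`, `check_host`) are hypothetical, the systemd unit names `nvidia-fabricmanager` and `nvlsm` are assumptions that may differ on a given host, and the NVLink fabric status check is left out.

```python
import shutil
import subprocess


def count_gpus(listing: str) -> int:
    """Count 'GPU <n>: ...' lines as printed by `nvidia-smi -L`."""
    return sum(1 for line in listing.splitlines() if line.startswith("GPU "))


def run_command(cmd):
    """Run a command and return (exit_code, stdout); 127 if the binary is missing."""
    if shutil.which(cmd[0]) is None:
        return 127, ""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout


def check_host(expected_gpus, runner=run_command,
               services=("nvidia-fabricmanager", "nvlsm")):
    """Return a dict of check name -> bool. Service names are assumptions."""
    results = {}
    code, out = runner(["nvidia-smi", "-L"])
    results["gpus_visible"] = code == 0 and count_gpus(out) == expected_gpus
    for svc in services:
        code, out = runner(["systemctl", "is-active", svc])
        results[f"{svc}_active"] = code == 0 and out.strip() == "active"
    return results


if __name__ == "__main__":
    # 8 GPUs is an example value for an HGX baseboard; adjust per host.
    for name, ok in check_host(expected_gpus=8).items():
        print(f"{'PASS' if ok else 'FAIL'}  {name}")
```

Passing the command runner as a parameter keeps the checks testable on a machine without GPUs.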

Dependency Matrix

Required Modules

None required

Components

scripts, references

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: nvidia-datacenter-bringup
Download link: https://github.com/air-gapped/skills/archive/main.zip#nvidia-datacenter-bringup

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
