Name: launching-evals
Author: NVIDIA

System Documentation

What problem does it solve?

Complex LLM evaluation workflows run with NeMo Evaluator Launcher can be error-prone and hard to track across clusters. This skill provides a structured approach to launching, monitoring, debugging, and analyzing evaluations, including artifact export and log inspection.

Core Features & Use Cases

Launch evaluations with a config, monitor status and live progress, and collect results.
Debug failed runs by inspecting client and server logs, and export artifacts for analysis.
Analyze results and metrics across runs with benchmark-specific guidance stored under references.

Quick Start

Submit an evaluation config to start a run and monitor its progress.

Please help me install this Skill:
Name: launching-evals
Download link: https://github.com/NVIDIA/skills/archive/main.zip#launching-evals
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
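
The install request above can also be carried out by hand. The following is a minimal Python sketch of the download, extract, and install steps, not an official installer. The helper names `download_archive` and `install_skill` are hypothetical, and the sketch assumes the skill sits in a folder named `launching-evals` somewhere inside the archive; the listing does not state the archive layout, so check the extracted contents before relying on it.

```python
import shutil
import tempfile
import urllib.request
import zipfile
from pathlib import Path, PurePosixPath


def download_archive(url, dest):
    """Download the archive at `url` to the local path `dest` (hypothetical helper)."""
    urllib.request.urlretrieve(url, dest)
    return Path(dest)


def install_skill(zip_path, skill_name, skills_dir):
    """Copy the folder named `skill_name` from the archive into `skills_dir`."""
    skills_dir = Path(skills_dir)
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        # Assumption: the skill lives in a folder named after the skill.
        # Collect every path prefix that ends in that folder name.
        prefixes = set()
        for name in names:
            dir_parts = PurePosixPath(name).parts[:-1]
            if skill_name in dir_parts:
                idx = dir_parts.index(skill_name)
                prefixes.add(dir_parts[: idx + 1])
        if not prefixes:
            raise FileNotFoundError(f"no folder named {skill_name!r} in {zip_path}")
        # Prefer the shallowest match if the name appears more than once.
        prefix = "/".join(min(prefixes, key=lambda p: (len(p), p))) + "/"
        members = [n for n in names if n.startswith(prefix)]
        with tempfile.TemporaryDirectory() as tmp:
            archive.extractall(tmp, members=members)
            target = skills_dir / skill_name
            if target.exists():
                shutil.rmtree(target)  # replace a previous install
            skills_dir.mkdir(parents=True, exist_ok=True)
            shutil.copytree(Path(tmp) / prefix, target)
    return target


# Example usage (performs a network download, so it is left commented out):
# archive = download_archive(
#     "https://github.com/NVIDIA/skills/archive/main.zip", "skills-main.zip"
# )
# install_skill(archive, "launching-evals", Path(".claude") / "skills")
```

The sketch copies only the skill's own folder rather than the whole repository, so other skills in the archive are not installed as a side effect.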
