byob
Official
Build and evaluate BYOB benchmarks for LLMs.
Author: NVIDIA
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
BYOB lets researchers and developers build, customize, and evaluate large language model (LLM) benchmarks with the BYOB decorator framework, giving them reproducible evaluation workflows.
Core Features & Use Cases
- A five-step workflow that guides users through constructing and assessing bespoke benchmarks.
- BYOB API integration with datasets, prompts, and scoring methods, enabling repeatable experiments and reporting.
- LLM-as-Judge support and built-in scorers for objective and subjective evaluation.
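The decorator-based pattern described above (a dataset, a prompt template, and a scoring function bound together into a named benchmark) can be illustrated with a small, self-contained sketch. This is not the BYOB API: `REGISTRY`, `Benchmark`, `benchmark`, `exact_match`, and `stub_model` are all hypothetical names invented for illustration, and the stub model stands in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical registry for illustration only; not part of the real BYOB package.
REGISTRY: Dict[str, "Benchmark"] = {}


@dataclass
class Benchmark:
    name: str
    dataset: List[dict]
    prompt: str  # template filled from each dataset row
    scorer: Callable[[str, dict], float]

    def evaluate(self, model: Callable[[str], str]) -> float:
        """Return the mean score of `model` over the dataset."""
        scores = [
            self.scorer(model(self.prompt.format(**row)), row)
            for row in self.dataset
        ]
        return sum(scores) / len(scores)


def benchmark(name: str, dataset: List[dict], prompt: str):
    """Hypothetical decorator: registers the wrapped scorer as a named benchmark."""
    def wrap(scorer: Callable[[str, dict], float]):
        REGISTRY[name] = Benchmark(name, dataset, prompt, scorer)
        return scorer
    return wrap


# Toy dataset; a real benchmark would load rows from a file.
ROWS = [
    {"question": "2 + 2", "answer": "4"},
    {"question": "3 * 3", "answer": "9"},
]


@benchmark(name="toy-arithmetic", dataset=ROWS, prompt="Compute: {question}")
def exact_match(response: str, row: dict) -> float:
    # Objective scorer: 1.0 when the response equals the reference answer.
    return 1.0 if response.strip() == row["answer"] else 0.0


def stub_model(prompt: str) -> str:
    # Stand-in for an LLM call; always answers "4".
    return "4"


result = REGISTRY["toy-arithmetic"].evaluate(stub_model)
print(result)
```

Keeping the scorer as a plain function means it can be unit-tested without a model, and swapping `stub_model` for a real client leaves the benchmark definition unchanged, which is what makes runs repeatable.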
Quick Start
Guide the user through 5 steps to build and evaluate a BYOB benchmark from a dataset.
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: byob Download link: https://github.com/NVIDIA/skills/archive/main.zip#byob Please download this .zip file, extract it, and install it in the .claude/skills/ directory.