Name: llmeval-tracking
Author: C-Ross

System Documentation

What problem does it solve?

llmeval-tracking tracks and diagnoses the stability of LLM evaluation tests across CI runs. It surfaces flaky tests and trends so that engineers can debug failures faster and judge how reliable the evaluation suite is.

Core Features & Use Cases

Result aggregation: collects per-run evaluation outcomes and stores them as JSONL for easy analysis.
Flake detection: identifies flaky tests and provides trend insights over time.
CI integration: works with daily CI workflows to fetch and report results from multiple runs.
Guided analysis: helps engineers pinpoint failing tests and compare against historical baselines.
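
The aggregation and flake-detection features above can be sketched roughly as follows. This is a minimal illustration, not the Skill's actual implementation: the JSONL field names (run_id, test, passed), the helper names load_results and summarize, and the definition of "flaky" (a test that both passed and failed across the recorded runs) are all assumptions made for the example.

```python
import json
from collections import defaultdict

# Hypothetical JSONL records: one evaluation outcome per line.
# The field names are assumed for illustration only.
SAMPLE = "\n".join([
    '{"run_id": 1, "test": "eval_a", "passed": true}',
    '{"run_id": 1, "test": "eval_b", "passed": true}',
    '{"run_id": 1, "test": "eval_c", "passed": false}',
    '{"run_id": 2, "test": "eval_a", "passed": true}',
    '{"run_id": 2, "test": "eval_b", "passed": false}',
    '{"run_id": 2, "test": "eval_c", "passed": false}',
    '{"run_id": 3, "test": "eval_a", "passed": true}',
    '{"run_id": 3, "test": "eval_b", "passed": true}',
    '{"run_id": 3, "test": "eval_c", "passed": false}',
])


def load_results(jsonl_text):
    """Parse JSONL text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def summarize(records):
    """Group outcomes by test and report run count, pass rate, and flakiness."""
    outcomes = defaultdict(list)
    for rec in records:
        outcomes[rec["test"]].append(bool(rec["passed"]))

    summary = {}
    for test, results in outcomes.items():
        passes = sum(results)
        summary[test] = {
            "runs": len(results),
            "pass_rate": passes / len(results),
            # Assumed definition: flaky means mixed outcomes across runs.
            "flaky": 0 < passes < len(results),
        }
    return summary


if __name__ == "__main__":
    for test, stats in sorted(summarize(load_results(SAMPLE)).items()):
        print(test, stats)
```

Under this definition a test that fails on every run (eval_c in the sample) is reported as consistently failing rather than flaky, which keeps genuine regressions separate from intermittent failures.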

Quick Start

Run the llmeval-tracker CLI to fetch the latest results and generate a stability report.

To install this Skill in Claude Code, download https://github.com/C-Ross/LlamaOfFate/archive/main.zip#llmeval-tracking, extract the .zip file, and install it in the .claude/skills/ directory.
