quality-eval

Community

Durable qualitative evaluation for results that are slow to validate.

Author: Honigbart
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Qualitative validation can be slow to organize and reproduce across fresh sessions. This skill provides a durable workflow for creating and running eval briefs, and scaffolds the artifacts under .ai/evals/<slug>/ to preserve context, runs, and conclusions.

Core Features & Use Cases

  • Durable eval briefs with a structured brief.md
  • Two modes: Prepare configures an eval, and Run/Resume executes its validations
  • Separate run logs and results under .ai/evals/<slug>/runs/ and .ai/evals/<slug>/results.md
  • Easy slug resolution from the feature or prompt being evaluated
  • Flexible enough to compare multiple variants, prompts, or UX flows
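
The artifact layout described above can be sketched in a few lines of Python. This is only an illustration of the directory structure, not the skill's own implementation: the function names slugify and scaffold_eval, the slug rule (lowercase, hyphen-separated), and the placeholder file contents are all assumptions made for the example.

```python
import re
from pathlib import Path


def slugify(name: str) -> str:
    """Turn a feature or prompt name into a slug (assumed rule: lowercase, hyphens)."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    if not slug:
        raise ValueError("name produces an empty slug")
    return slug


def scaffold_eval(root: Path, name: str) -> Path:
    """Create .ai/evals/<slug>/ with brief.md, runs/ and results.md if they are missing."""
    eval_dir = root / ".ai" / "evals" / slugify(name)
    (eval_dir / "runs").mkdir(parents=True, exist_ok=True)
    for filename in ("brief.md", "results.md"):
        path = eval_dir / filename
        if not path.exists():  # never overwrite an existing brief or results file
            # Hypothetical placeholder content; the real templates are not shown in this listing.
            path.write_text(f"# {name}: {filename}\n", encoding="utf-8")
    return eval_dir
```

Calling scaffold_eval(Path("."), "Onboarding Flow v2") would create .ai/evals/onboarding-flow-v2/ in the current project. Because existing files are left untouched, a later session can call it again and resume from the same brief.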

Quick Start

Create an eval slug and run Prepare mode to set up a durable eval brief in a fresh session.

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: quality-eval
Download link: https://github.com/Honigbart/honeyflow/archive/main.zip#quality-eval

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
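
For a manual install, the extraction step in the prompt above could look roughly like the following sketch. It assumes the archive has already been downloaded to a local file and that it contains a directory named after the skill somewhere in its tree. The actual repository layout is not shown in this listing, and install_skill_from_zip is a hypothetical helper, not part of the skill.

```python
import zipfile
from pathlib import Path, PurePosixPath


def install_skill_from_zip(zip_path: Path, skill_name: str, project_root: Path) -> Path:
    """Copy the files under '<skill_name>/' in the archive to .claude/skills/<skill_name>/."""
    dest = project_root / ".claude" / "skills" / skill_name
    copied = 0
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            parts = PurePosixPath(info.filename).parts
            if info.is_dir() or skill_name not in parts:
                continue
            # Keep only the path below the first '<skill_name>/' directory (assumed layout).
            relative = parts[parts.index(skill_name) + 1:]
            if not relative or ".." in relative:
                continue  # skip odd entries rather than write outside dest
            target = dest.joinpath(*relative)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(archive.read(info))
            copied += 1
    if copied == 0:
        raise FileNotFoundError(f"no '{skill_name}/' directory found in {zip_path}")
    return dest
```

For example, install_skill_from_zip(Path("main.zip"), "quality-eval", Path(".")) would place the skill's files under .claude/skills/quality-eval/ in the current project, or raise FileNotFoundError if the archive has no such directory.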
