Name: codex-promptfoo-agentic-eval
Author: scouzi1966

System Documentation

What problem does it solve?

Run and interpret the Promptfoo-based AFM agentic evaluation suite to enable end-to-end functional QA and model-quality assessment for agentic workflows.

Core Features & Use Cases

Run, expand, and interpret the Promptfoo agentic evaluation suite for AFM validation and quality measurement.
Classify each failure by type (afm_bug, model_quality, harness_bug) and label each test case with an explicit provenance tag (afm_internal, primary_source, public_benchmark_inspired, synthetic).
Manage multiple harness configurations (structured, structured-stress, toolcall, agentic, frameworks, opencode) and review results from prepared matrices and datasets.
Provide ready-to-use prompts, configs, and datasets, with references to test reports and failure classifications for analysis.
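
As an illustration of the failure-type and provenance vocabulary listed above, the following is a minimal sketch of how a report could keep the two dimensions separate and tally failures by type. The tag values come from this page; the `FailureRecord` class, its `test_id` field, and the `summarize` helper are hypothetical names introduced only for this example and are not part of the skill.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class FailureType(Enum):
    """Failure classes named in the skill description."""
    AFM_BUG = "afm_bug"              # defect in AFM itself
    MODEL_QUALITY = "model_quality"  # the model answered poorly
    HARNESS_BUG = "harness_bug"      # the test or harness is wrong


class Provenance(Enum):
    """Provenance tags named in the skill description."""
    AFM_INTERNAL = "afm_internal"
    PRIMARY_SOURCE = "primary_source"
    PUBLIC_BENCHMARK_INSPIRED = "public_benchmark_inspired"
    SYNTHETIC = "synthetic"


@dataclass(frozen=True)
class FailureRecord:
    """Hypothetical record shape: one failed test case."""
    test_id: str
    failure_type: FailureType
    provenance: Provenance


def summarize(records):
    """Count failures per failure type, keyed by the tag string."""
    return dict(Counter(r.failure_type.value for r in records))


if __name__ == "__main__":
    sample = [
        FailureRecord("t1", FailureType.AFM_BUG, Provenance.AFM_INTERNAL),
        FailureRecord("t2", FailureType.MODEL_QUALITY, Provenance.SYNTHETIC),
        FailureRecord("t3", FailureType.AFM_BUG, Provenance.PRIMARY_SOURCE),
    ]
    print(summarize(sample))
```

Keeping failure type and provenance as separate fields means a harness_bug on a synthetic case and an afm_bug on a primary_source case stay distinguishable when results are aggregated.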

Quick Start

Start by running the harness with the agentic profile to begin evaluating AFM.
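
This page does not show the actual launch command, so the sketch below only illustrates one plausible shape for it. It assumes the harness is driven through the Promptfoo CLI (`promptfoo eval` with its `-c` config and `-o` output options) and that each profile maps to a config file named after it. The `configs/` and `results/` directories, the file names, and the `build_eval_command` helper are all hypothetical; check the skill's own files for the real entry point.

```python
import shlex

# Harness configurations named in the skill description.
PROFILES = (
    "structured",
    "structured-stress",
    "toolcall",
    "agentic",
    "frameworks",
    "opencode",
)


def build_eval_command(profile, config_dir="configs", output_dir="results"):
    """Build a Promptfoo CLI invocation for one profile.

    The directory layout and file naming are assumptions made for this
    example, not the skill's documented layout.
    """
    if profile not in PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    config_path = f"{config_dir}/{profile}.yaml"   # hypothetical path
    output_path = f"{output_dir}/{profile}.json"   # hypothetical path
    return ["npx", "promptfoo", "eval", "-c", config_path, "-o", output_path]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_eval_command("agentic")))
```

Returning the command as an argument list keeps it safe to pass to `subprocess.run` without shell quoting issues once the real paths are known.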

Please help me install this Skill:
Name: codex-promptfoo-agentic-eval
Download link: https://github.com/scouzi1966/maclocal-api/archive/main.zip#codex-promptfoo-agentic-eval
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
