Name: evaluation-benchmark
Author: Alex1980Alex

System Documentation

What problem does it solve?

It makes evaluation of search quality and RAG effectiveness efficient and repeatable across datasets, so that retrieval strategies can be compared objectively and regressions detected.

Core Features & Use Cases

Comprehensive metrics suite (precision, recall, NDCG, MRR) for retrieval quality.
RAG-focused evaluation with context relevance, grounding, and answer relevance scoring.
Automated pipelines to run benchmarks, collect results, and generate reports for multiple strategies (vector, BM25, hybrid).
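
The retrieval metrics named above can be sketched in a few lines of Python. This is a minimal illustration that assumes binary relevance judgments; the function names and signatures are hypothetical and are not taken from the skill itself.

```python
import math
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k positions filled by a relevant document."""
    if k <= 0:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant document, or 0.0 if none is found.
    MRR is the mean of this value over all queries."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """NDCG with binary gains: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


def mean_reciprocal_rank(runs: Sequence[tuple]) -> float:
    """Average reciprocal rank over (ranked, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)
```

Each function takes one query's ranked document IDs and its set of relevant IDs; averaging the per-query values over a dataset gives the benchmark-level score for a strategy.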

Quick Start

Run an initial benchmark on your dataset to establish a baseline and compare strategies.
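
Once a baseline exists, comparing a later run against it amounts to a per-strategy, per-metric diff. The sketch below shows one way to flag regressions. The `find_regressions` helper, the score layout, and the `tolerance` parameter are all assumptions made for illustration, not part of the skill's documented interface.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: strategy name -> metric name -> score
Scores = Dict[str, Dict[str, float]]


def find_regressions(
    baseline: Scores, current: Scores, tolerance: float = 0.01
) -> List[Tuple[str, str, float, float]]:
    """Return (strategy, metric, baseline_score, current_score) for every
    metric that dropped by more than `tolerance` relative to the baseline.
    Metrics missing from the current run are skipped, not flagged."""
    regressions = []
    for strategy, metrics in baseline.items():
        for metric, old_score in metrics.items():
            new_score = current.get(strategy, {}).get(metric)
            if new_score is not None and new_score < old_score - tolerance:
                regressions.append((strategy, metric, old_score, new_score))
    return regressions
```

The tolerance keeps small run-to-run fluctuations from being reported as regressions; a suitable value depends on the dataset size and the metric.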

Please help me install this Skill:
Name: evaluation-benchmark
Download link: https://github.com/Alex1980Alex/1C-Enterprise_Framework/archive/main.zip#evaluation-benchmark
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
