evaluate-rag

Evaluate RAG pipeline quality

Author: marchatton
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill helps you systematically evaluate and improve Retrieval-Augmented Generation (RAG) systems by assessing retrieval quality and generation quality separately, so that a failure can be traced to the component that caused it.

Core Features & Use Cases

  • Component-wise Evaluation: Separates the assessment of retrieval accuracy from generation faithfulness and relevance.
  • Dataset Curation: Provides methods for building manually written and synthetic QA pairs for robust retrieval testing.
  • Metric Implementation: Details the application of key metrics like Recall@k, Precision@k, MRR, and NDCG@k for different query types.
  • Chunking Optimization: Guides the process of tuning chunking strategies for improved retrieval performance.
  • Use Case: When a RAG system is underperforming, use this Skill to pinpoint whether the issue lies in retrieving the correct documents or in the LLM's ability to synthesize a faithful and relevant answer from the provided context.
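The retrieval metrics named in the list above can be sketched in a few lines. This is a minimal illustration that assumes binary relevance (a document is either relevant or not) and string document IDs; the function names are hypothetical and are not taken from the Skill itself.

```python
import math
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / len(relevant)


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top k results that are relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Sequence[Sequence[str]],
                         relevants: Sequence[Set[str]]) -> float:
    """MRR: the average reciprocal rank over a set of queries."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in zip(runs, relevants)) / len(runs)


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """NDCG@k with binary gains: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, doc_id in enumerate(ranked[:k], start=1)
              if doc_id in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Example query: two relevant documents, one of them retrieved at rank 2.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
print(recall_at_k(ranked, relevant, 3), precision_at_k(ranked, relevant, 3))
print(reciprocal_rank(ranked, relevant), ndcg_at_k(ranked, relevant, 3))
```

Recall@k and Precision@k ignore the order of results within the top k, while MRR and NDCG@k reward placing relevant documents earlier, which is why the choice of metric depends on the query type.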

Quick Start

Use the evaluate-rag skill to analyze the retrieval quality of the RAG pipeline by generating synthetic QA pairs for a given document.
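To make the analysis requested by this prompt concrete, here is a rough sketch of a retrieval check driven by synthetic QA pairs. It assumes each pair records the ID of the chunk the question was generated from, and that the retriever is a callable returning ranked chunk IDs. The `QAPair` type, the `keyword_retriever` stand-in, and the sample data are all hypothetical; a real pipeline would plug in its own retriever and generated pairs.

```python
from typing import Callable, List, NamedTuple


class QAPair(NamedTuple):
    question: str
    source_chunk_id: str  # chunk the question was generated from


# A retriever takes (question, k) and returns up to k chunk IDs, best first.
Retriever = Callable[[str, int], List[str]]


def hit_rate_at_k(pairs: List[QAPair], retrieve: Retriever, k: int) -> float:
    """Share of questions whose source chunk appears in the top k results."""
    if not pairs:
        return 0.0
    hits = sum(1 for p in pairs if p.source_chunk_id in retrieve(p.question, k))
    return hits / len(pairs)


# Toy corpus and keyword-overlap retriever, used only to make the sketch runnable.
CHUNKS = {
    "c1": "recall at k measures how many relevant chunks are retrieved",
    "c2": "chunk size and overlap affect retrieval performance",
    "c3": "faithfulness checks whether the answer is grounded in context",
}


def keyword_retriever(question: str, k: int) -> List[str]:
    """Rank chunks by the number of words they share with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        CHUNKS,
        key=lambda cid: len(q_words & set(CHUNKS[cid].split())),
        reverse=True,
    )
    return ranked[:k]


PAIRS = [
    QAPair("what does recall at k measure", "c1"),
    QAPair("how does chunk size affect retrieval", "c2"),
    # Paraphrased question with little word overlap with its source chunk.
    QAPair("how many tokens per segment", "c2"),
]

for k in (1, 2):
    print(f"hit rate@{k}: {hit_rate_at_k(PAIRS, keyword_retriever, k):.2f}")
```

With a single source chunk per question, hit rate@k is the same as Recall@k averaged over queries. The paraphrased third question shows why synthetic pairs that merely copy wording from the chunk can overstate retrieval quality.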

Dependency Matrix

Required Modules

None required

Components

references

💻 Claude Code Installation

Recommended: Let Claude install the Skill automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: evaluate-rag
Download link: https://github.com/marchatton/agent-skills/archive/main.zip#evaluate-rag

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.