zep-eval-harness

Name: zep-eval-harness
Author: getzep

Official

Manage end-to-end Zep eval harness workflows.

Category: Software Engineering
Tags: #automation #testing #QA #harness #eval #Zep #graph-inspection

Version: 1.0.0

Installs: 0

System Documentation

What problem does it solve?

The Zep eval harness is an end-to-end framework for evaluating Zep's memory retrieval and question-answering quality. It ingests conversations, telemetry, and documents into graphs, then scores the retrieved context and the generated answers against gold answers.

Core Features & Use Cases

Chunk documents: split documents into chunks and generate contextualized summaries for LLM consumption.
Ingest users and documents: create user graphs, ingest conversations and telemetry, and push document chunks into a standalone graph for evaluation.
Evaluate: run test cases, search graphs for context, generate LLM responses, and grade the answers against gold answers.
Inspect graphs: use zep_graph_inspect.py to view graph contents and diagnose retrieval or context gaps.
Compare runs: analyze aggregate and per-category metrics across different runs/configs.
Metrics and diagnostics: treat Context Complete as the primary metric (retrieval quality) and Answer Accuracy as the secondary metric (response correctness).
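
To make the "Compare runs" and "Metrics and diagnostics" items concrete, here is a minimal Python sketch of per-category aggregation and run-to-run comparison. It is not the harness's own code: the `CaseResult` record, the `summarize` and `compare` helpers, and the category labels `conversation` and `document` are all hypothetical, and it assumes each graded test case reduces to two booleans (context complete, answer correct).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CaseResult:
    """One graded test case (hypothetical record shape, not the harness's own)."""
    category: str
    context_complete: bool  # retrieved context contained what the gold answer needs
    answer_correct: bool    # generated answer matched the gold answer


def summarize(results):
    """Return {category: (context_complete_rate, answer_accuracy)} plus an 'ALL' row."""
    buckets = defaultdict(list)
    for r in results:
        buckets[r.category].append(r)
        buckets["ALL"].append(r)
    summary = {}
    for name, rows in buckets.items():
        n = len(rows)
        summary[name] = (
            sum(r.context_complete for r in rows) / n,
            sum(r.answer_correct for r in rows) / n,
        )
    return summary


def compare(run_a, run_b):
    """Per-category change (run_b minus run_a) for categories present in both runs."""
    a, b = summarize(run_a), summarize(run_b)
    return {
        cat: (b[cat][0] - a[cat][0], b[cat][1] - a[cat][1])
        for cat in sorted(a.keys() & b.keys())
    }


# Made-up example data for two runs over the same four test cases.
baseline = [
    CaseResult("conversation", True, True),
    CaseResult("conversation", False, False),
    CaseResult("document", True, False),
    CaseResult("document", True, True),
]
candidate = [
    CaseResult("conversation", True, True),
    CaseResult("conversation", True, True),
    CaseResult("document", True, False),
    CaseResult("document", True, True),
]

for category, (d_context, d_answer) in compare(baseline, candidate).items():
    print(f"{category:>12}  context {d_context:+.2f}  answer {d_answer:+.2f}")
```

Reporting the Context Complete delta first mirrors its role as the primary metric: if an answer is wrong while the context was complete, the gap is more likely in generation than in retrieval.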

Quick Start

Run the Zep eval harness end-to-end to chunk documents, ingest users and documents, run evaluations, inspect graphs, and analyze results.
