lg-eval

Community

Rigorous agent evals across modes.

Author: markhazlett
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Automates the setup and execution of robust evaluations for LangGraph/LangChain agents, enabling deterministic tests, experiment tracking, and dashboard insights across local and LangSmith-backed workflows.

Core Features & Use Cases

  • Supports Local-only, LangSmith-backed, and Hybrid evaluation modes with dataset scaffolding, evaluators, and test harness integration.
  • Generates reusable evaluators for trajectory, final-answer correctness, smoke checks, and optional structured-output checks; can upload datasets to LangSmith for dashboards.
  • Use Case: A team adds evals to verify a new agent's behavior across tool usage and responses, then runs recordings in CI and reviews results in LangSmith dashboards.
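To make the evaluator types above concrete, here is a minimal sketch of a trajectory check and a final-answer check in plain Python. It is not the code lg-eval generates; the function names `trajectory_match` and `final_answer_correct`, the `strict` flag, and the normalization rule are all assumptions chosen for illustration.

```python
from typing import Sequence


def trajectory_match(
    expected: Sequence[str], actual: Sequence[str], strict: bool = True
) -> bool:
    """Compare the tool names an agent called against the expected ones.

    strict=True requires the same tools in the same order.
    strict=False only requires the expected tools to appear in order,
    allowing extra calls in between (a subsequence match).
    """
    if strict:
        return list(expected) == list(actual)
    remaining = iter(actual)
    # `name in remaining` consumes the iterator, which enforces ordering.
    return all(name in remaining for name in expected)


def final_answer_correct(expected: str, actual: str) -> bool:
    """Exact match after lowercasing and collapsing whitespace (hypothetical rule)."""
    def normalize(text: str) -> str:
        return " ".join(text.lower().split())

    return normalize(expected) == normalize(actual)
```

A deterministic test could then assert `trajectory_match(["search", "calculator"], recorded_tools)` on a recorded run, where `recorded_tools` is the list of tool names extracted from the agent's messages.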

Quick Start

Configure your agent and run the evaluator using the chosen mode (local-only, LangSmith-backed, or hybrid).
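The listing does not show how a mode is selected, so the following is only a sketch of one plausible approach. The `EvalMode` enum and `resolve_mode` helper are hypothetical names, and the rule that every mode except local-only needs a `LANGSMITH_API_KEY` environment variable is an assumption, not documented lg-eval behavior.

```python
import os
from enum import Enum
from typing import Mapping, Optional


class EvalMode(Enum):
    # The three modes named in the listing; the string values are assumptions.
    LOCAL = "local-only"
    LANGSMITH = "langsmith-backed"
    HYBRID = "hybrid"


def resolve_mode(
    requested: str, env: Optional[Mapping[str, str]] = None
) -> EvalMode:
    """Validate the requested mode and check for credentials it would need.

    Raises ValueError for an unknown mode name and RuntimeError when a
    LangSmith-dependent mode is requested without an API key (assumed rule).
    """
    env = os.environ if env is None else env
    mode = EvalMode(requested)
    if mode is not EvalMode.LOCAL and not env.get("LANGSMITH_API_KEY"):
        raise RuntimeError(
            f"Mode '{mode.value}' needs LANGSMITH_API_KEY to be set"
        )
    return mode
```

Failing fast on a missing key keeps a CI job from silently skipping the LangSmith upload; a local-only run passes the check without any credentials.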

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: lg-eval
Download link: https://github.com/markhazlett/agent-harness/archive/main.zip#lg-eval

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
