lg-eval
Community
Rigorous agent evals across modes.
Author: markhazlett
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Automates the setup and execution of robust evaluations for LangGraph/LangChain agents, enabling deterministic tests, experiment tracking, and dashboard insights across local and LangSmith-backed workflows.
Core Features & Use Cases
- Supports Local-only, LangSmith-backed, and Hybrid evaluation modes with dataset scaffolding, evaluators, and test harness integration.
- Generates reusable evaluators for trajectory, final-answer correctness, smoke checks, and optional structured-output checks; can upload datasets to LangSmith for dashboards.
- Use Case: A team adds evals to verify a new agent's behavior across tool usage and responses, then runs recordings in CI and reviews results in LangSmith dashboards.
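The evaluator types listed above can be illustrated with a minimal, local-only sketch. This is not the skill's generated code: `AgentRun`, `trajectory_match`, `final_answer_correct`, `smoke_check`, and `evaluate` are hypothetical names chosen for illustration, and the exact-match rules are assumptions about what "trajectory" and "final-answer correctness" checks might look like in their simplest form.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRun:
    """Hypothetical record of one agent run: ordered tool calls plus the final answer."""
    tool_calls: list[str] = field(default_factory=list)
    final_answer: str = ""


def trajectory_match(run: AgentRun, expected_tools: list[str]) -> bool:
    """Pass when the agent called exactly the expected tools, in the expected order."""
    return run.tool_calls == expected_tools


def final_answer_correct(run: AgentRun, expected: str) -> bool:
    """Pass when the final answer equals the expected one, ignoring case and outer whitespace."""
    return run.final_answer.strip().casefold() == expected.strip().casefold()


def smoke_check(run: AgentRun) -> bool:
    """Pass when the run produced a non-empty final answer."""
    return bool(run.final_answer.strip())


def evaluate(run: AgentRun, expected_tools: list[str], expected_answer: str) -> dict[str, bool]:
    """Run all three checks and return one boolean per evaluator."""
    return {
        "trajectory": trajectory_match(run, expected_tools),
        "final_answer": final_answer_correct(run, expected_answer),
        "smoke": smoke_check(run),
    }


# Usage: one dataset example compared against a recorded run.
run = AgentRun(tool_calls=["search", "calculator"], final_answer=" Paris ")
results = evaluate(run, ["search", "calculator"], "paris")
```

Because each check is a plain deterministic function, the same evaluators could in principle be called from a test harness in CI or wrapped for upload to a tracking backend.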
Quick Start
Configure your agent and run the evaluator using the chosen mode (local-only, LangSmith-backed, or hybrid).
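One way to picture the mode choice is a small resolver that maps the requested mode to what can actually run. This is a sketch under stated assumptions: `EvalMode` and `resolve_mode` are hypothetical names, the fallback to local-only when no key is present is an illustrative design choice rather than the skill's documented behavior, and `LANGSMITH_API_KEY` is assumed to be the environment variable holding the LangSmith credential.

```python
import os
from enum import Enum
from typing import Mapping, Optional


class EvalMode(Enum):
    """The three modes named in this listing (hypothetical enum)."""
    LOCAL = "local-only"
    LANGSMITH = "langsmith-backed"
    HYBRID = "hybrid"


def resolve_mode(requested: str, env: Optional[Mapping[str, str]] = None) -> EvalMode:
    """Return the requested mode, falling back to LOCAL when no LangSmith key is set.

    Raises ValueError if `requested` is not one of the three mode names.
    """
    env = os.environ if env is None else env
    mode = EvalMode(requested)
    # Assumption: any mode that talks to LangSmith needs an API key in the environment.
    if mode is not EvalMode.LOCAL and not env.get("LANGSMITH_API_KEY"):
        return EvalMode.LOCAL
    return mode
```

Passing the environment as a parameter keeps the resolver deterministic in tests; a stricter variant might raise instead of silently falling back.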
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: lg-eval
Download link: https://github.com/markhazlett/agent-harness/archive/main.zip#lg-eval
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.