eval-tool-use

Community

Audit LLM tool-use quality and sequencing

Authormajidraza1228
Version1.0.0
Installs0

System Documentation

What problem does it solve?

Evaluate tool-use quality in LLM agents by assessing tool selection and argument construction. This guidance helps separate deterministic tool interactions from higher-level reasoning and promotes robust evaluation workflows.

Core Features & Use Cases

  • Tool selection audits: verify the agent chooses the correct tool for a given task.
  • Argument validation: ensure inputs to tools conform to expected schemas and constraints.
  • Sequencing and error handling: verify the order of tool calls and recovery after failures.
  • Use Case: Test a Claude agent with multiple tools to confirm it calls the right tool with appropriate arguments and handles tool errors gracefully.

Quick Start

Provide a trace of a tool-use session and run the evaluation to identify tool selection, argument, and sequencing issues.

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: eval-tool-use
Download link: https://github.com/majidraza1228/eval-framework/archive/main.zip#eval-tool-use

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
View Source Repository

Agent Skills Search Helper

Install a tiny helper to your Agent, search and equip skill from 471,000+ vetted skills library on demand.