ArXiv Agentic Verifier
Community
Find code bugs with targeted edge-case tests.
System Documentation
What problem does it solve?
Verifying competitive-coding solutions is hard because edge cases and logic flaws often slip past simple sample tests. This Skill automatically generates discriminative tests and checks whether candidate code is correct.
Core Features & Use Cases
- Analyze Code Logic: Uses an LLM to reason about the problem statement and candidate code to identify likely failure modes.
- Generate Targeted Test Cases: Produces specific inputs plus expected outputs aimed at breaking incorrect logic (not random sampling).
- Execute and Verify: Runs the candidate code with the generated input and reports pass/fail based on output equality.
Use case examples: verifying a Python/JavaScript solution in a coding interview harness, diagnosing a wrong-answer submission by generating a counterexample, or stress-testing a small algorithm implementation against tricky boundary conditions.
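To make "discriminative" concrete, here is a minimal sketch in Python of why a targeted edge case catches a bug that sample inputs miss. The maximum-subarray problem, both implementations, and the helper `first_failing` are hypothetical illustrations chosen for this example; they are not part of the Skill, and the test cases are hand-written rather than LLM-generated.

```python
from typing import Callable, List, Optional, Tuple

TestCase = Tuple[List[int], int]  # (input, expected output)


def max_subarray_reference(nums: List[int]) -> int:
    """Kadane's algorithm; handles all-negative input correctly."""
    best = cur = nums[0]
    for x in nums[1:]:
        cur = max(x, cur + x)
        best = max(best, cur)
    return best


def max_subarray_buggy(nums: List[int]) -> int:
    """Illustrative bug: the running sum resets to 0, so an
    all-negative array yields 0 instead of its largest element."""
    best = cur = 0
    for x in nums:
        cur = max(0, cur + x)
        best = max(best, cur)
    return best


def first_failing(candidate: Callable[[List[int]], int],
                  tests: List[TestCase]) -> Optional[TestCase]:
    """Return the first test the candidate gets wrong, or None."""
    for test_input, expected in tests:
        if candidate(test_input) != expected:
            return test_input, expected
    return None


# Typical sample tests: both implementations agree on these.
sample_tests: List[TestCase] = [([1, -2, 3, 4], 7), ([2, 2], 4)]

# A targeted test aimed at the suspected failure mode (all negatives).
targeted_tests: List[TestCase] = [([-3, -1, -2], -1)]

print(first_failing(max_subarray_buggy, sample_tests))
print(first_failing(max_subarray_buggy, targeted_tests))
```

The buggy candidate passes every sample test, and only the all-negative input separates it from the reference. That separation is what a discriminative test is meant to provide.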
Quick Start
Create an AgenticVerifier instance and call verify(problem, code, language) to generate a discriminative test case, execute the candidate program, and return whether it passed.
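The call shape described above might look roughly like the following Python sketch. Only the `verify(problem, code, language)` signature comes from this page. The class `AgenticVerifierSketch`, its constructor arguments, the `VerificationResult` type, and the injected `generate_test` callable (standing in for the LLM step) are all assumptions made so the example runs on its own; the real Skill's import path and return value may differ.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class VerificationResult:  # hypothetical result type
    passed: bool
    test_input: str
    expected: str
    actual: str


class AgenticVerifierSketch:
    """Hypothetical stand-in with the same verify() call shape.

    generate_test replaces the LLM step: given (problem, code) it
    returns (stdin_text, expected_stdout).
    """

    def __init__(self,
                 generate_test: Callable[[str, str], Tuple[str, str]],
                 timeout: float = 10.0) -> None:
        self.generate_test = generate_test
        self.timeout = timeout

    def verify(self, problem: str, code: str,
               language: str = "python") -> VerificationResult:
        if language != "python":
            raise ValueError("this sketch only runs Python candidates")
        test_input, expected = self.generate_test(problem, code)
        # Run the candidate in a separate interpreter, feeding stdin.
        proc = subprocess.run(
            [sys.executable, "-c", code],
            input=test_input, capture_output=True,
            text=True, timeout=self.timeout,
        )
        actual = proc.stdout.strip()
        # Pass means a clean exit and output equal to the expectation.
        passed = proc.returncode == 0 and actual == expected.strip()
        return VerificationResult(passed, test_input, expected, actual)


problem = "Read two integers from one line and print their sum."
good_code = "a, b = map(int, input().split())\nprint(a + b)"
bad_code = "a, b = map(int, input().split())\nprint(abs(a) + abs(b))"

# Hand-written test generator: a negative operand exposes abs() misuse.
verifier = AgenticVerifierSketch(lambda problem, code: ("-5 5\n", "0"))

good = verifier.verify(problem, good_code, "python")
bad = verifier.verify(problem, bad_code, "python")
print(good.passed, bad.passed, bad.actual)
```

Injecting the test generator keeps the sketch free of any LLM client; in the actual Skill that step is where the model reasons about the problem and the candidate code.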
Dependency Matrix
Required Modules
Components
Standard package
💻 Claude Code Installation
Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: ArXiv Agentic Verifier
Download link: https://github.com/Wanli-Lee/CUA-Claw-Harness/archive/main.zip#arxiv-agentic-verifier
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.