ArXiv Agentic Verifier

Community

Find code bugs with targeted edge-case tests.

Author: Wanli-Lee
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Verifying competitive-coding solutions is hard because edge cases and logic flaws often slip past the simple sample tests. This Skill automatically creates discriminative tests and checks whether the candidate code produces the expected output.

Core Features & Use Cases

  • Analyze Code Logic: Uses an LLM to reason about the problem statement and candidate code to identify likely failure modes.
  • Generate Targeted Test Cases: Produces specific inputs plus expected outputs aimed at breaking incorrect logic (not random sampling).
  • Execute and Verify: Runs the candidate code with the generated input and reports pass/fail based on output equality.

Use case examples: verifying a Python/JavaScript solution in a coding interview harness, diagnosing a wrong-answer submission by generating a counterexample, or stress-testing a small algorithm implementation against tricky boundary conditions.
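The execute-and-verify step described above can be sketched roughly as follows. This is a minimal illustration, not the Skill's actual implementation: the function names `run_and_compare` and `normalize`, the result dictionary, the Python-only execution, and the whitespace normalization before comparison are all assumptions made for the example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def normalize(text: str) -> str:
    """Drop trailing whitespace on each line and surrounding blank lines."""
    return "\n".join(line.rstrip() for line in text.strip().splitlines())


def run_and_compare(code: str, test_input: str, expected: str, timeout: float = 5.0) -> dict:
    """Run a Python candidate on one input and compare its stdout to `expected`.

    Hypothetical helper: the real Skill may execute and compare differently.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "candidate.py"
        path.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(path)],
                input=test_input,
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return {"passed": False, "actual": None, "reason": "timeout"}
    if proc.returncode != 0:
        return {"passed": False, "actual": proc.stdout, "reason": "runtime error"}
    actual = normalize(proc.stdout)
    passed = actual == normalize(expected)
    return {"passed": passed, "actual": actual, "reason": "ok" if passed else "wrong answer"}


# Candidate with a boundary bug: the running maximum starts at 0,
# so it is wrong whenever every input number is negative.
BUGGY_MAX = """
nums = list(map(int, input().split()))
best = 0
for n in nums:
    best = max(best, n)
print(best)
"""

result = run_and_compare(BUGGY_MAX, "-5 -2 -9\n", "-2")
print(result["passed"], result["reason"])  # False wrong answer
```

The all-negative input is the kind of targeted case the feature list has in mind: a sample such as `3 1 2` would pass and hide the bug, while this one exposes it.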

Quick Start

Create an AgenticVerifier instance and call verify(problem, code, language) to generate a discriminative test case, execute the candidate program, and return whether it passed.
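The call pattern might look something like the sketch below. Only the class name `AgenticVerifier` and the `verify(problem, code, language)` signature come from this page. The constructor argument, the shape of the returned report, and the hard-coded `fixed_generator` (standing in for the LLM step so the example runs offline) are hypothetical, and the stand-in class only executes Python candidates.

```python
import subprocess
import sys
from typing import Callable, Tuple

# Hypothetical type: given (problem, code), return (test_input, expected_output).
TestGenerator = Callable[[str, str], Tuple[str, str]]


class AgenticVerifier:
    """Stand-in that mirrors the documented name and verify() signature.

    NOT the Skill's real class: the constructor argument and the returned
    dictionary are assumptions made for illustration.
    """

    def __init__(self, generate_test: TestGenerator):
        # The real Skill would ask an LLM for the test case at this point.
        self.generate_test = generate_test

    def verify(self, problem: str, code: str, language: str) -> dict:
        if language != "python":
            raise ValueError("this sketch only executes Python candidates")
        test_input, expected = self.generate_test(problem, code)
        proc = subprocess.run(
            [sys.executable, "-c", code],
            input=test_input,
            capture_output=True,
            text=True,
            timeout=5,
        )
        actual = proc.stdout.strip()
        return {
            "input": test_input,
            "expected": expected,
            "actual": actual,
            "passed": proc.returncode == 0 and actual == expected.strip(),
        }


def fixed_generator(problem: str, code: str) -> Tuple[str, str]:
    """Hard-coded stand-in for the LLM step: an all-negative edge case."""
    return "-5 -2 -9\n", "-2"


problem = "Read integers on one line and print the largest."
code = "nums = list(map(int, input().split()))\nprint(max(nums))\n"

verifier = AgenticVerifier(generate_test=fixed_generator)
report = verifier.verify(problem, code, "python")
print(report["passed"])  # True
```

Injecting the test generator keeps the sketch runnable without an API key. In the actual Skill that step is presumably backed by the `openai` dependency listed below.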

Dependency Matrix

Required Modules

openai, axios

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: ArXiv Agentic Verifier
Download link: https://github.com/Wanli-Lee/CUA-Claw-Harness/archive/main.zip#arxiv-agentic-verifier

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.