report_evaluation

Community

Automated evaluation of weekly reports.

Author: xueqingpeng
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Evaluates one report_generation run for a single ticker/model combination. It reads the generated WEEKLY Markdown reports and offline DuckDB market data, scores the run across five dimensions, and aggregates run-level backtest metrics. It writes one JSON result and one structured Markdown summary to results/report_evaluation/.

Core Features & Use Cases

  • Recomputes ground-truth metrics for weekly reports using the same MCP logic as report generation, so the evaluation stays aligned with the produced outputs.
  • Performs per-report scoring and run-level backtests across ticker/model combinations using offline data and the MCP tooling.
  • Produces a machine-readable JSON artifact plus a human-readable Markdown summary for auditing and comparison.
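
As a rough illustration of how per-report scores could roll up into run-level figures, here is a minimal sketch. It is not the skill's actual code: this listing does not name the five dimensions, so DIMENSIONS holds placeholders, and the record layout, the weekly_return field, and the aggregate_run function are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical placeholders: the five scoring dimensions are not named in this listing.
DIMENSIONS = ["dim_1", "dim_2", "dim_3", "dim_4", "dim_5"]


def aggregate_run(reports):
    """Average each dimension across reports and compound the weekly returns.

    Each report is assumed to look like:
        {"scores": {dimension: float, ...}, "weekly_return": float}
    """
    if not reports:
        raise ValueError("no reports to aggregate")

    # Mean score per dimension across all weekly reports in the run.
    dimension_means = {
        d: mean(r["scores"][d] for r in reports) for d in DIMENSIONS
    }

    # Compound weekly returns into one run-level cumulative return.
    growth = 1.0
    for r in reports:
        growth *= 1.0 + r["weekly_return"]

    return {
        "n_reports": len(reports),
        "dimension_means": dimension_means,
        "overall_score": mean(list(dimension_means.values())),
        "cumulative_return": growth - 1.0,
    }
```

An unweighted mean and a simple compounded return are the plainest choices here; the real skill may weight dimensions or report different backtest metrics.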

Quick Start

Run an evaluation against a set of generated reports, then persist the results with upsert_evaluation to produce the JSON and Markdown artifacts.
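
The signature of upsert_evaluation is not shown in this listing, so the sketch below uses a stand-in named upsert_evaluation_sketch to illustrate the general idea of an idempotent write: running it again for the same ticker/model pair overwrites the earlier files. The file naming scheme and the keys of the result dict (ticker, model, metrics) are assumptions.

```python
import json
from pathlib import Path


def upsert_evaluation_sketch(result, root="results/report_evaluation"):
    """Write (or overwrite) one JSON result and one Markdown summary for a run.

    `result` is assumed to carry "ticker", "model", and an optional "metrics" dict.
    """
    out_dir = Path(root)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical naming scheme: one file pair per ticker/model combination.
    stem = f"{result['ticker']}_{result['model']}"
    json_path = out_dir / f"{stem}.json"
    md_path = out_dir / f"{stem}.md"

    # Machine-readable artifact.
    json_path.write_text(
        json.dumps(result, indent=2, sort_keys=True), encoding="utf-8"
    )

    # Human-readable summary with one line per metric.
    lines = [f"# Evaluation: {result['ticker']} / {result['model']}", ""]
    lines += [f"- {k}: {v}" for k, v in sorted(result.get("metrics", {}).items())]
    md_path.write_text("\n".join(lines) + "\n", encoding="utf-8")

    return json_path, md_path
```

Keying the file names on ticker and model is what makes the write an upsert in this sketch: a repeated run replaces its own earlier output and leaves other combinations untouched.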

Dependency Matrix

Required Modules

duckdb, pandas, numpy, pandas_ta, fastmcp, pydantic

Components

scripts

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: report_evaluation
Download link: https://github.com/xueqingpeng/trading-analysis/archive/main.zip#report-evaluation

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
