model-comparison

Official

Run fair multi-model evaluations and produce Pareto fronts.

Author: aliyun
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill removes guesswork from “which algorithm is better” by running multiple models under the same data split and producing an objective, multi-metric comparison report for financial ML use.

Core Features & Use Cases

  • Fair multi-model evaluation: runs XGBoost, LR (WoE), and DNN side-by-side on the same train/val/OOT split.
  • Multi-dimensional, non-subjective reporting: computes OOT AUC/KS/BCR/Brier, plus KS gap and PSI drift, and performs DeLong significance tests (AUC).
  • Pareto frontier “candidate set”: identifies the algorithms on the Pareto front (no subjective weighting, no gatekeeping), then leaves the final selection to LLM/business reasoning.
  • Supports common financial scoring scenarios: general, scorecard (LR-friendly), fraud (capture-focused), stability-first (drift-focused).
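The drift and Pareto-front ideas above can be sketched in a few lines. This is a minimal illustration, not the Skill's own code: the function names `psi` and `pareto_front`, the equal-width binning, the metric keys, and the toy metric values are all assumptions made up for the example. A model is kept on the front when no other model is at least as good on every metric and strictly better on one.

```python
import math


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between two score samples.

    Assumption: equal-width bins taken from the range of `expected`;
    quantile bins are another common choice.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant sample

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1  # clamp out-of-range scores
        # Floor each share at eps so the logarithm stays finite.
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def pareto_front(models, higher_is_better):
    """Return the names of models that no other model dominates."""

    def oriented(metrics):
        # Flip "lower is better" metrics so that larger always means better.
        return [metrics[k] if up else -metrics[k] for k, up in higher_is_better.items()]

    def dominates(a, b):
        return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

    scored = {name: oriented(m) for name, m in models.items()}
    return [
        name
        for name, s in scored.items()
        if not any(dominates(o, s) for other, o in scored.items() if other != name)
    ]


# Toy numbers for illustration only; they are not results from the Skill.
models = {
    "xgboost": {"oot_auc": 0.78, "oot_ks": 0.42, "psi": 0.12},
    "lr_woe": {"oot_auc": 0.74, "oot_ks": 0.38, "psi": 0.03},
    "dnn": {"oot_auc": 0.73, "oot_ks": 0.37, "psi": 0.15},
}
directions = {"oot_auc": True, "oot_ks": True, "psi": False}
front = pareto_front(models, directions)
```

With these toy values, `dnn` is dominated by both other models, while `xgboost` (better discrimination) and `lr_woe` (lower drift) trade off against each other, so both stay in the candidate set. No weights are applied at any point, which matches the "no subjective weighting" intent of the feature list.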

Quick Start

Ask the AI to run the Skill on your dataset, providing the data path, target column, OOT time column rules, and output directory. It then generates the multi-algorithm comparison report and artifacts.
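To make "OOT time column rules" concrete, here is a rough sketch of one plausible rule: rows on or after a cutoff date form the out-of-time (OOT) set, and the earlier rows are split once into train and validation. The function `split_train_val_oot`, its parameters, and the `apply_date` column are hypothetical; the Skill's actual rule format is not documented in this listing.

```python
import random
from datetime import date


def split_train_val_oot(rows, time_col, oot_start, val_fraction=0.2, seed=42):
    """Split rows into (train, val, oot) using a time cutoff.

    Assumption: rows with time_col >= oot_start are held out as OOT;
    the remaining rows are shuffled with a fixed seed and split once.
    """
    in_time = [r for r in rows if r[time_col] < oot_start]
    oot = [r for r in rows if r[time_col] >= oot_start]

    rng = random.Random(seed)  # fixed seed so every model sees the same split
    rng.shuffle(in_time)
    n_val = int(len(in_time) * val_fraction)
    return in_time[n_val:], in_time[:n_val], oot


# Toy rows: one application per month of 2024 (illustrative only).
rows = [{"apply_date": date(2024, m, 1), "label": m % 2} for m in range(1, 13)]
train, val, oot = split_train_val_oot(rows, "apply_date", date(2024, 10, 1))
```

Computing the split once and passing the same three sets to every algorithm is what keeps the comparison fair: differences in the OOT metrics then come from the models rather than from the data each one happened to see.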

Dependency Matrix

Required Modules

json, logging, numpy, pandas, scipy, sklearn, torch, xgboost, optbinning, joblib

Components

scripts

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: model-comparison
Download link: https://github.com/aliyun/qwen-dianjin/archive/main.zip#model-comparison

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
