model-comparison
Official
Run fair multi-model evaluations and produce Pareto facts.
Data & Analytics
#model comparison #PSI drift #binary scoring #Pareto frontier #DeLong test #data split #financial ML
Author: aliyun
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill removes guesswork from “which algorithm is better” by running multiple models under the same data split and producing an objective, multi-metric comparison report for financial ML use.
Core Features & Use Cases
- Fair multi-model evaluation: runs XGBoost, LR (WoE), and DNN side-by-side on the same train/val/OOT split.
- Multi-dimensional, non-subjective reporting: computes OOT AUC/KS/BCR/Brier, plus KS gap and PSI drift, and performs DeLong significance tests (AUC).
- Pareto frontier “candidate set”: identifies the algorithms on the Pareto front (no subjective weighting, no gatekeeping), then leaves final selection to LLM/business reasoning.
- Supports common financial scoring scenarios: general, scorecard (LR-friendly), fraud (capture-focused), stability-first (drift-focused).
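To make the drift and Pareto ideas above concrete, here is a minimal sketch in plain Python. It is not the Skill's own implementation: the function names `psi` and `pareto_front`, the equal-width binning on [0, 1], and the metric values in the example are all illustrative assumptions.

```python
import math

def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between two samples of scores in [0, 1].

    Assumption: equal-width bins; real pipelines often bin by quantiles instead.
    """
    def bin_shares(scores):
        counts = [0] * n_bins
        for s in scores:
            idx = min(int(s * n_bins), n_bins - 1)  # put s == 1.0 in the last bin
            counts[idx] += 1
        total = len(scores)
        # Floor empty bins at eps so the logarithm below stays finite.
        return [max(c / total, eps) for c in counts]

    e, a = bin_shares(expected), bin_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

def pareto_front(results, maximize, minimize):
    """Return the names of models that no other model dominates."""
    def oriented(metrics):
        # Flip the sign of "lower is better" metrics so larger is always better.
        return [metrics[k] for k in maximize] + [-metrics[k] for k in minimize]

    def dominates(a, b):
        # a dominates b: at least as good everywhere, strictly better somewhere.
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))

    vecs = {name: oriented(m) for name, m in results.items()}
    return [name for name, v in vecs.items()
            if not any(dominates(w, v) for other, w in vecs.items() if other != name)]

# Made-up OOT metrics, for illustration only (not real results).
results = {
    "xgboost": {"auc": 0.80, "ks": 0.45, "brier": 0.10, "psi": 0.12},
    "lr_woe":  {"auc": 0.76, "ks": 0.40, "brier": 0.11, "psi": 0.03},
    "dnn":     {"auc": 0.75, "ks": 0.39, "brier": 0.12, "psi": 0.15},
}
front = pareto_front(results, maximize=["auc", "ks"], minimize=["brier", "psi"])
print(front)
```

In this toy example `dnn` drops out because `xgboost` is at least as good on every metric, while `xgboost` and `lr_woe` both stay on the front: one wins on discrimination, the other on stability. No metric is weighted against another, which is why the front is a candidate set rather than a single winner.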
Quick Start
Ask the AI to run the Skill on your dataset. Provide the data path, target column, OOT time column rules, and output directory; the Skill then generates the multi-algorithm comparison report and artifacts.
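As a rough illustration of what an "OOT time column rule" can mean, the sketch below holds out every row on or after a cutoff date as the out-of-time (OOT) set and splits the remaining rows into train and validation once, so that every model sees the same partition. The function name `split_train_val_oot`, the `apply_date` column, the cutoff, and the 20% validation share are assumptions for this example, not the Skill's actual interface.

```python
import random
from datetime import date

def split_train_val_oot(rows, time_key, oot_start, val_fraction=0.2, seed=42):
    """Split rows into (train, val, oot) using a time cutoff for the OOT set."""
    # Everything on or after the cutoff is held out as out-of-time data.
    oot = [r for r in rows if r[time_key] >= oot_start]
    in_time = [r for r in rows if r[time_key] < oot_start]

    # A fixed seed makes the train/val partition reproducible across models.
    rng = random.Random(seed)
    rng.shuffle(in_time)
    n_val = int(len(in_time) * val_fraction)
    return in_time[n_val:], in_time[:n_val], oot

# Hypothetical data: one row per month of 2023.
rows = [{"apply_date": date(2023, m, 1), "label": m % 2} for m in range(1, 13)]
train, val, oot = split_train_val_oot(rows, "apply_date", date(2023, 10, 1))
print(len(train), len(val), len(oot))
```

Computing the split once and reusing it for every algorithm is what keeps the comparison fair: differences in the OOT metrics then reflect the models rather than the sampling.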
Dependency Matrix
Required Modules
json, logging, numpy, pandas, scipy, sklearn, torch, xgboost, optbinning, joblib
Components
scripts
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: model-comparison
Download link: https://github.com/aliyun/qwen-dianjin/archive/main.zip#model-comparison
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.