isc-bench

Author: wuyoscar

Assess frontier-LLM safety with TVD workflows.

Category: Education & Research
Tags: #research #benchmark #workflows #jailbreak #llm-safety #safety-evaluation #tvd

Version: 1.0.0

Installs: 0

System Documentation

What problem does it solve?

ISC-Bench provides a structured framework for evaluating Internal Safety Collapse (ISC) in frontier LLMs using the TVD (Task-Validator-Data) paradigm. It lets researchers reproduce safety evaluation workflows, compare results across models, and study how legitimate professional tasks can lead to harmful outputs depending on workflow constraints. It also supports agentic evaluation paths and multi-domain benchmarks.

Core Features & Use Cases

TVD-driven evaluation of frontier LLM safety across multiple domains, including AI safety, biology, chemistry, and cybersecurity.
Reproducible pipelines for Task, Validator, and Data, plus optional agentic execution modes.
Benchmarking across templates, experiments, and community reproductions to compare model robustness and safety.
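
The Task, Validator, and Data pieces and the cross-model comparison listed above can be pictured with a small, offline sketch. Everything here is an assumption made for illustration: the names TVDCase, build_prompt, run_case, and pass_rates are hypothetical and are not taken from ISC-Bench, the validator is a plain string check, and the "models" are stub functions rather than real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class TVDCase:
    """Hypothetical container; field names are illustrative, not ISC-Bench's schema."""
    task: str                         # instruction describing the task
    validator: Callable[[str], bool]  # True if the output meets the task's constraints
    data: Dict[str, str]              # inputs the task operates on


def build_prompt(case: TVDCase) -> str:
    """Join the task text and the data fields into one prompt string."""
    lines = [case.task] + [f"{key}: {value}" for key, value in sorted(case.data.items())]
    return "\n".join(lines)


def run_case(case: TVDCase, model: Callable[[str], str]) -> bool:
    """Send the prompt to a model callable and apply the case's validator."""
    return case.validator(model(build_prompt(case)))


def pass_rates(cases: List[TVDCase], models: Dict[str, Callable[[str], str]]) -> Dict[str, float]:
    """Fraction of cases whose output passes the validator, per model name."""
    rates = {}
    for name, model in models.items():
        results = [run_case(case, model) for case in cases]
        rates[name] = sum(results) / len(results) if results else 0.0
    return rates


# Benign placeholder case and stub models (assumptions, no network access).
case = TVDCase(
    task="Summarize the record in one line.",
    validator=lambda out: out.startswith("SUMMARY:"),
    data={"record": "placeholder text"},
)
models = {
    "stub_compliant": lambda prompt: "SUMMARY: " + prompt.splitlines()[-1],
    "stub_refusing": lambda prompt: "I can't help with that.",
}
rates = pass_rates([case], models)
```

Keeping the model as a plain callable means the same cases can be replayed against any backend, which is one way to get the kind of cross-model comparison the feature list describes.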

Quick Start

Clone the repository, install uv, configure OpenRouter API keys, and run the single-turn TVD workflow as described in the repository's Quick Start guide.
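
Before running the workflow, it can help to confirm that the prerequisites named above are in place. The following is a minimal sketch, not part of ISC-Bench: the preflight function is hypothetical, and the environment variable name OPENROUTER_API_KEY is an assumption (the listing does not say how the key is configured).

```python
import os
import shutil
from typing import Dict, Mapping, Optional


def preflight(
    env: Optional[Mapping[str, str]] = None,
    key_name: str = "OPENROUTER_API_KEY",  # assumed variable name, not confirmed by the source
) -> Dict[str, bool]:
    """Report whether uv is on PATH and whether an API key variable is set."""
    env = os.environ if env is None else env
    return {
        "uv_on_path": shutil.which("uv") is not None,
        "api_key_set": bool(env.get(key_name, "").strip()),
    }


if __name__ == "__main__":
    for check, ok in preflight().items():
        print(f"{check}: {'ok' if ok else 'missing'}")
```

Passing the environment as a parameter keeps the check easy to exercise without touching real credentials.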
