Name: redteam-autoresearch
Author: superagent-ai

System Documentation

What problem does it solve?

This skill orchestrates a bounded red-team autoresearch loop that generates labeled guardrail training data for LLM safety. Attacker and judge roles simulate real-world adversarial probing under explicit authorization, and the resulting datasets are produced locally so they can be studied.

Core Features & Use Cases

End-to-end autoresearch workflow: target profiling, seed research, batch generation, deterministic mutators, model querying, judging with StrongREJECT, recording, and archive-based novelty tracking.
Data export for guardrails: prepares labeled prompts, responses, and metadata for training guardrail classifiers and detectors.
Benchmark-ready: supports holdout seeds, difficulty strata, and macro/micro attack success rate (ASR) reporting for model comparisons.
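
The macro/micro ASR reporting mentioned above can be illustrated with a minimal sketch. It assumes that each judged attempt has already been reduced to a boolean success flag and that the macro average is taken over difficulty strata; the skill's actual grouping, thresholds, and record format are not documented here, so the function name `asr_report` and the `(stratum, success)` record shape are hypothetical.

```python
from collections import defaultdict


def asr_report(records):
    """Compute per-stratum, micro, and macro attack success rate (ASR).

    `records` is an iterable of (stratum, success) pairs. This shape is an
    assumption made for illustration, not the skill's real export format.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for stratum, success in records:
        totals[stratum] += 1
        hits[stratum] += int(bool(success))

    if not totals:
        raise ValueError("no records to report on")

    # Per-stratum ASR: successes divided by attempts within each stratum.
    per_stratum = {s: hits[s] / totals[s] for s in totals}
    # Micro ASR: pool every attempt, so large strata dominate.
    micro = sum(hits.values()) / sum(totals.values())
    # Macro ASR: unweighted mean of per-stratum rates, so each stratum counts equally.
    macro = sum(per_stratum.values()) / len(per_stratum)
    return {"per_stratum": per_stratum, "micro": micro, "macro": macro}


if __name__ == "__main__":
    sample = [
        ("easy", True), ("easy", True), ("easy", False), ("easy", True),
        ("hard", False), ("hard", True),
    ]
    print(asr_report(sample))
```

Reporting both numbers is useful when strata differ in size: the micro figure reflects the overall attempt pool, while the macro figure keeps a small but hard stratum from being drowned out.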

Quick Start

Set up a run workspace and start the bounded red-team autoresearch workflow to generate guardrail training data.

Please help me install this Skill:
Name: redteam-autoresearch
Download link: https://github.com/superagent-ai/skills/archive/main.zip#redteam-autoresearch
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.