identity-and-guardrails

Community

Enforce runtime safety and agent identity

Author: tylerjrbuell
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Prevents unsafe or uncontrolled agent behavior so that agents can run safely in production. It does this by detecting prompt injection, masking PII, enforcing behavioral contracts, and providing a runtime kill switch and identity controls.

Core Features & Use Cases

  • Prompt injection detection: Identify and block instructions intended to override system or developer-imposed rules.
  • PII masking and toxicity checks: Detect and mask sensitive personal data and toxic content before processing or returning results.
  • Behavioral contracts and tool restrictions: Define denied/allowed tools, max iterations, max tool calls, and output length limits for strict runtime enforcement.
  • Kill-switch and auditability: Support pause/resume/stop/terminate controls and record tool calls and guardrail decisions for compliance and debugging.
  • Use Case: Deploy a public-facing customer support agent that must never call destructive tools, must redact customer data, must disclose its AI identity, and must provide an audit trail for each session.
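
As a rough illustration of the behavioral contract and audit trail described above, here is a minimal TypeScript sketch. It does not use the skill's actual API: the names BehavioralContract, AuditEntry, ContractEnforcer, and the tool names are all hypothetical, chosen only to show how denied tools, a tool-call cap, and an output length limit might be checked and recorded.

```typescript
// Hypothetical types for illustration only; not the skill's real API.
interface BehavioralContract {
  deniedTools: string[];   // tools the agent must never call
  maxToolCalls: number;    // cap on allowed tool calls per session
  maxOutputLength: number; // cap on characters returned to the user
}

interface AuditEntry {
  tool: string;
  allowed: boolean;
  reason: string;
}

class ContractEnforcer {
  private contract: BehavioralContract;
  private toolCalls = 0;
  readonly auditLog: AuditEntry[] = [];

  constructor(contract: BehavioralContract) {
    this.contract = contract;
  }

  // Decide whether a tool call may proceed and record the decision.
  checkToolCall(tool: string): boolean {
    let allowed = true;
    let reason = "ok";
    if (this.contract.deniedTools.includes(tool)) {
      allowed = false;
      reason = "tool is denied by contract";
    } else if (this.toolCalls >= this.contract.maxToolCalls) {
      allowed = false;
      reason = "max tool calls reached";
    }
    if (allowed) this.toolCalls += 1;
    this.auditLog.push({ tool, allowed, reason });
    return allowed;
  }

  // Truncate output that exceeds the contract's length limit.
  limitOutput(output: string): string {
    return output.slice(0, this.contract.maxOutputLength);
  }
}

const enforcer = new ContractEnforcer({
  deniedTools: ["delete_account"], // hypothetical destructive tool
  maxToolCalls: 2,
  maxOutputLength: 20,
});

console.log(enforcer.checkToolCall("delete_account")); // false: denied by contract
console.log(enforcer.checkToolCall("lookup_order"));   // true: first allowed call
```

Every decision, allowed or not, lands in auditLog, which is the property a compliance review would rely on. Denied calls do not count toward the tool-call cap in this sketch; a stricter policy could count them as well.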

Quick Start

Use the identity-and-guardrails skill to build an agent that enforces prompt injection detection, PII masking, behavioral contracts, runtime pause/resume/stop controls, and audit logging.
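
To give a sense of what the PII masking step in such an agent might look like, the sketch below masks a few common patterns with regular expressions. It is an assumption-laden illustration, not the skill's implementation: PII_PATTERNS and maskPii are hypothetical names, and the three patterns are deliberately simple and cover only US-style formats.

```typescript
// Hypothetical, deliberately simple patterns; real PII detection needs much broader coverage.
const PII_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: "EMAIL", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: "SSN", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
  { label: "PHONE", pattern: /\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b/g },
];

// Replace each match with a bracketed label such as [EMAIL].
function maskPii(text: string): string {
  let masked = text;
  for (const { label, pattern } of PII_PATTERNS) {
    masked = masked.replace(pattern, `[${label}]`);
  }
  return masked;
}

const sample = "Reach me at jane.doe@example.com or 555-123-4567.";
console.log(maskPii(sample));
// → "Reach me at [EMAIL] or [PHONE]."
```

Masking would typically run both on user input before the model sees it and on model output before it is returned, matching the "before processing or returning results" behavior listed under Core Features.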

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: identity-and-guardrails
Download link: https://github.com/tylerjrbuell/reactive-agents-ts/archive/main.zip#identity-and-guardrails

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
