System Documentation

What problem does it solve?

This Skill turns a single multi-document PDF (such as a batch of scanned paperwork) into individual PDFs by detecting document boundaries from OCR-extracted text, so you can organize, name, and process each document separately.

Core Features & Use Cases

  • Detect document boundaries from searchable-page content (page-number resets, letterhead/sender changes, and address-block transitions), or from blank separator sheets in opt-in mode.
  • Consent-gated refinement: when rule-based confidence is medium or ambiguous, it can ask an LLM to reconcile boundaries against the extracted OCR text, but only if you permit it.
  • Human-in-the-loop split map: proposes a split, lets you merge/split/edit boundaries, then emits one PDF per document.
  • Post-processing only: it does not scan or OCR; it requires a searchable (OCR’d) PDF.
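
The rule-based detection described above can be sketched roughly as follows. This is an illustrative sketch only, not the Skill's actual script: the function name propose_boundaries, the PAGE_NO pattern, and the blank_separators flag are hypothetical, and it covers only two of the listed signals (page-number resets and opt-in blank separator sheets). It works on a list of already-extracted page texts, which could come from pypdf's page.extract_text().

```python
import re

# Hypothetical footer pattern: matches text such as "Page 1 of 3" or "page 2".
PAGE_NO = re.compile(r"\bpage\s+(\d+)", re.IGNORECASE)


def propose_boundaries(page_texts, blank_separators=False):
    """Return 0-based indices of pages that likely start a new document.

    Assumed heuristics (not taken from the Skill's source):
      * a printed page number that does not increase signals a reset;
      * in opt-in mode, an empty page is treated as a separator sheet.
    """
    starts = []
    prev_no = None        # page number printed on the previous page, if any
    after_blank = False   # True when the previous page was a separator

    for i, text in enumerate(page_texts):
        stripped = text.strip()

        if blank_separators and not stripped:
            # Separator sheet: the next non-blank page starts a document.
            after_blank = True
            prev_no = None
            continue

        match = PAGE_NO.search(stripped)
        page_no = int(match.group(1)) if match else None

        # A reset is "Page 1" after "Page 3", or "Page 1" after "Page 1".
        is_reset = (
            page_no is not None
            and prev_no is not None
            and page_no <= prev_no
        )

        if not starts or after_blank or is_reset:
            starts.append(i)

        after_blank = False
        prev_no = page_no  # pages without a number give no signal

    return starts
```

A proposal produced this way is only a starting point for the human-in-the-loop split map; letterhead and address-block signals would need additional rules.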

Quick Start

Ask the agent to split your batch: it proposes boundaries from the OCR text in your input PDF and, once you confirm them, writes one PDF per detected document.
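
For orientation, the final "one PDF per document" step could look something like the sketch below once the boundaries are confirmed. The helper names page_ranges and write_documents and the output file naming are assumptions made for illustration; the Skill's own scripts may work differently. Only well-known pypdf calls are used (PdfReader, PdfWriter.add_page, PdfWriter.write).

```python
from pathlib import Path


def page_ranges(starts, n_pages):
    """Turn confirmed start indices into half-open (start, end) page ranges.

    Page 0 always begins the first document; out-of-range starts are ignored.
    """
    bounds = sorted({0, *(s for s in starts if 0 < s < n_pages)})
    ends = bounds[1:] + [n_pages]
    return [(s, e) for s, e in zip(bounds, ends) if s < e]


def write_documents(src_pdf, starts, out_dir):
    """Write one PDF per range and return the list of output paths."""
    # Imported here so page_ranges() can be used without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src_pdf)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    ranges = page_ranges(starts, len(reader.pages))
    for n, (start, end) in enumerate(ranges, start=1):
        writer = PdfWriter()
        for i in range(start, end):
            writer.add_page(reader.pages[i])
        # Hypothetical naming scheme: document-001.pdf, document-002.pdf, ...
        target = out_dir / f"document-{n:03d}.pdf"
        with target.open("wb") as fh:
            writer.write(fh)
        written.append(target)
    return written
```

For example, write_documents("batch.pdf", [0, 2, 4], "out") would write three files for a five-page batch, covering pages 0-1, 2-3, and 4. In this sketch a blank separator sheet stays attached to the end of the preceding document.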

Dependency Matrix

Required Modules

pypdf

Components

scripts

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: split-batch
Download link: https://github.com/xxthunder/xxthunder-agentic-skills/archive/main.zip#split-batch

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.