pymupdf
Community
Turn PDFs into ML-friendly Markdown and JSON.
Author: Blake-John
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This skill transforms PDFs into machine-friendly outputs (Markdown/JSON/Text) to streamline data extraction and content processing for downstream LLM/RAG workflows.
Core Features & Use Cases
- Convert PDFs to Markdown for readable content extraction and downstream processing.
- Generate JSON with layout information for structured indexing and retrieval.
- Extract plain text and tables from PDFs for data analytics and integration with other tools.
- Support page selection, header/footer removal, and batch processing, from either the CLI or the Python API.
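The text, table, and page-selection features above can be sketched with the underlying PyMuPDF library. This is an illustration under stated assumptions, not the skill's own script: `parse_page_spec` and `extract_text_and_tables` are hypothetical helper names, the "1-3,5" page syntax is an assumed convention, and `import pymupdf` assumes a recent PyMuPDF release (older releases import as `fitz`).

```python
def parse_page_spec(spec: str, page_count: int) -> list[int]:
    """Turn a 1-based spec such as "1-3,5" into sorted 0-based page indices.

    Hypothetical helper: the spec syntax is an assumption, not the skill's CLI format.
    Page numbers outside 1..page_count are silently dropped.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start, _, end = part.partition("-")
        first = int(start)
        last = int(end) if end else first
        for number in range(first, last + 1):
            if 1 <= number <= page_count:
                pages.add(number - 1)
    return sorted(pages)


def extract_text_and_tables(pdf_path: str, spec: str) -> list[dict]:
    """Collect plain text and table rows for the selected pages."""
    import pymupdf  # imported lazily so the helper above works without PyMuPDF

    results = []
    with pymupdf.open(pdf_path) as doc:
        for index in parse_page_spec(spec, doc.page_count):
            page = doc[index]
            # find_tables() returns a finder object; each table extracts to rows of cells.
            tables = [table.extract() for table in page.find_tables().tables]
            results.append(
                {"page": index + 1, "text": page.get_text(), "tables": tables}
            )
    return results
```

Calling `extract_text_and_tables("report.pdf", "1-3")` on a hypothetical `report.pdf` would return one dictionary per selected page, which can be serialized with `json.dumps` for indexing.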
Quick Start
Convert a PDF to Markdown using the CLI to obtain an LLM-friendly output.
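The listing does not show the CLI command itself, so the following is a rough Python equivalent rather than the skill's actual interface. It assumes `pymupdf4llm` is installed and uses its `to_markdown` function; `markdown_path_for`, `convert_pdf`, and `convert_folder` are hypothetical names introduced for illustration.

```python
from pathlib import Path


def markdown_path_for(pdf_path, out_dir) -> Path:
    """Map e.g. docs/report.pdf to <out_dir>/report.md (hypothetical naming rule)."""
    return Path(out_dir) / (Path(pdf_path).stem + ".md")


def convert_pdf(pdf_path, out_dir, pages=None) -> Path:
    """Convert one PDF to Markdown and write it next to the other outputs.

    `pages` is an optional list of 0-based page numbers; None converts every page.
    """
    import pymupdf4llm  # imported lazily; assumed to be installed

    markdown = pymupdf4llm.to_markdown(str(pdf_path), pages=pages)
    target = markdown_path_for(pdf_path, out_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(markdown, encoding="utf-8")
    return target


def convert_folder(in_dir, out_dir) -> list[Path]:
    """Batch-convert every *.pdf directly inside in_dir."""
    return [convert_pdf(pdf, out_dir) for pdf in sorted(Path(in_dir).glob("*.pdf"))]
```

For example, `convert_pdf("report.pdf", "out", pages=[0, 1])` would write the first two pages of a hypothetical `report.pdf` to `out/report.md`.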
Dependency Matrix
Required Modules
pymupdf, pymupdf4llm, pymupdf-layout
Components
scripts
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: pymupdf
Download link: https://github.com/Blake-John/agent-config/archive/main.zip#pymupdf
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.