pdf-vision
Community
Read PDFs with perfect visual & text accuracy.
Author: lidge-jun
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill enables AI agents to accurately read and analyze complex PDF documents, such as exams and reports. It combines precise text extraction with high-resolution rendering of each page, which works around OCR limitations and preserves the visual layout.
Core Features & Use Cases
- Hybrid Extraction: Extracts both exact text content and a rendered PNG image of each page.
- Visual & Textual Analysis: Combines image analysis for layout and visual elements with extracted text for content accuracy.
- Exam Solving: Ideal for solving visual exam questions where layout and precise wording are critical.
- Legal Revision Check: Integrates with the search skill to check for recent legal revisions relevant to the PDF content.
Quick Start
Extract the image and text from page 3 of the PDF located at /path/to/document.pdf.
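For readers who want to see roughly what a request like the one above involves, here is a minimal sketch of hybrid extraction using the two required modules. It is not the Skill's own script: the helper names `page_index` and `extract_page`, the output file name, and the default `scale` of 2.0 are all assumptions made for illustration. Saving the PNG through `to_pil()` also assumes Pillow is installed.

```python
from pathlib import Path


def page_index(page_number: int, page_count: int) -> int:
    """Convert a 1-based page number to a 0-based index, checking the range."""
    if not 1 <= page_number <= page_count:
        raise ValueError(f"page {page_number} is outside 1..{page_count}")
    return page_number - 1


def extract_page(pdf_path: str, page_number: int,
                 out_dir: str = ".", scale: float = 2.0):
    """Return (text, png_path) for one page.

    Hypothetical helper for illustration; the Skill's real scripts may differ.
    """
    # Imported here so the module still loads before the dependencies exist.
    from pypdf import PdfReader
    import pypdfium2 as pdfium

    # Text layer: pypdf reads the embedded text rather than running OCR.
    reader = PdfReader(pdf_path)
    index = page_index(page_number, len(reader.pages))
    text = reader.pages[index].extract_text() or ""

    # Visual layer: pypdfium2 rasterizes the page; scale=1.0 means 72 DPI.
    pdf = pdfium.PdfDocument(pdf_path)
    image = pdf[index].render(scale=scale).to_pil()  # needs Pillow
    png_path = Path(out_dir) / f"{Path(pdf_path).stem}_page{page_number}.png"
    image.save(png_path)

    return text, str(png_path)
```

Calling `extract_page("/path/to/document.pdf", 3)` would return the text of page 3 and the path of a PNG written to the current directory. An agent can then read the wording from the text and the layout from the image.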
Dependency Matrix
Required Modules
pypdf
pypdfium2
Components
scripts
💻 Claude Code Installation
Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: pdf-vision
Download link: https://github.com/lidge-jun/cli-jaw-skills/archive/main.zip#pdf-vision
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.