ocr-and-documents
Official
Extract text from any document.
Author: NousResearch
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill automates the extraction of text and structured data from various document formats, including PDFs, scanned documents, and images, eliminating manual data entry and information retrieval bottlenecks.
Core Features & Use Cases
- Multi-format Support: Handles text-based PDFs, scanned PDFs (via OCR), DOCX, PPTX, and images.
- Remote URL Extraction: Efficiently extracts content from PDFs hosted online using web_extract.
- Advanced OCR & Layout Analysis: Utilizes marker-pdf for high-accuracy OCR, table extraction, equation parsing, and layout understanding on scanned documents.
- Lightweight Option: Employs pymupdf for fast, low-dependency text extraction from text-based PDFs.
- Use Case: Automatically extract all text and tables from a scanned research paper or a complex PDF report, making the information searchable and processable.
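The division of labor among the tools listed above could be sketched roughly as follows. This is an illustration only, not the Skill's actual logic: `choose_backend`, `extract_pdf_text`, the `scanned` flag, and the set of image suffixes are hypothetical names introduced here, and the routing rules are assumptions inferred from the feature list.

```python
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical routing table (an assumption, not taken from the Skill itself):
# file extension -> tool named in the feature list.
OFFICE_BACKENDS = {".docx": "python-docx", ".pptx": "python-pptx"}
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tiff", ".bmp"}


def choose_backend(source: str, scanned: bool = False) -> str:
    """Pick an extraction tool for a local path or a URL (illustrative only)."""
    if urlparse(source).scheme in ("http", "https"):
        # Remote documents go through web_extract, per the feature list.
        return "web_extract"
    suffix = Path(source).suffix.lower()
    if suffix in OFFICE_BACKENDS:
        return OFFICE_BACKENDS[suffix]
    if suffix in IMAGE_SUFFIXES:
        # Images have no text layer, so they need OCR.
        return "marker-pdf"
    if suffix == ".pdf":
        # Scanned PDFs need OCR; text-based PDFs can use the lightweight path.
        return "marker-pdf" if scanned else "pymupdf"
    raise ValueError(f"unsupported document type: {source!r}")


def extract_pdf_text(path: str) -> str:
    """Extract text from a text-based PDF with PyMuPDF (imported lazily)."""
    import pymupdf  # older PyMuPDF releases expose the same API as `fitz`

    with pymupdf.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)
```

With this sketch, `choose_backend("report.pdf")` selects the lightweight pymupdf path, while passing `scanned=True` switches to marker-pdf. How the Skill itself detects a scanned PDF is not stated in this listing, so the flag is left to the caller.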
Quick Start
Use the ocr-and-documents skill to extract all text from the local file 'report.pdf'.
Dependency Matrix
Required Modules
- pymupdf
- pymupdf4llm
- marker-pdf
- python-docx
- python-pptx
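Before running the Skill, it may help to check which of these modules are already importable. The snippet below is a minimal sketch, not part of the Skill: `missing_packages` is a hypothetical helper, and the mapping from pip package names to import names reflects how these packages are usually imported.

```python
from importlib.util import find_spec

# pip package name -> top-level import name.
# Note: older PyMuPDF releases are imported as `fitz` rather than `pymupdf`.
IMPORT_NAMES = {
    "pymupdf": "pymupdf",
    "pymupdf4llm": "pymupdf4llm",
    "marker-pdf": "marker",
    "python-docx": "docx",
    "python-pptx": "pptx",
}


def missing_packages() -> list[str]:
    """Return the pip names of required modules that cannot be imported."""
    return [pkg for pkg, mod in IMPORT_NAMES.items() if find_spec(mod) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```

Any names it reports can then be installed with pip under the package names listed above.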
Components
- scripts
- references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: ocr-and-documents Download link: https://github.com/NousResearch/hermes-agent/archive/main.zip#ocr-and-documents Please download this .zip file, extract it, and install it in the .claude/skills/ directory.