ocr-and-documents

Official

Extract text from any document.

Author: NousResearch
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill automates the extraction of text and structured data from PDFs (text-based or scanned), DOCX and PPTX files, and images, reducing manual data entry and the time spent locating information inside documents.

Core Features & Use Cases

  • Multi-format Support: Handles text-based PDFs, scanned PDFs (via OCR), DOCX, PPTX, and images.
  • Remote URL Extraction: Efficiently extracts content from PDFs hosted online using web_extract.
  • Advanced OCR & Layout Analysis: Utilizes marker-pdf for high-accuracy OCR, table extraction, equation parsing, and layout understanding on scanned documents.
  • Lightweight Option: Employs pymupdf for fast, low-dependency text extraction from text-based PDFs.
  • Use Case: Automatically extract all text and tables from a scanned research paper or a complex PDF report, making the information searchable and processable.
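The split between the lightweight and OCR paths above can be sketched as a small router. This is only an illustration under stated assumptions: `choose_backend`, `pdf_has_text_layer`, the suffix sets, and the 20-character threshold are hypothetical names and values, not the skill's actual logic, and sending images to marker-pdf is assumed from the feature list. The `import pymupdf` form needs a recent PyMuPDF release; older releases expose the same API as `fitz`.

```python
from pathlib import Path

# Hypothetical routing table: backend names mirror the libraries listed above,
# but the mapping itself is an assumption made for illustration.
OFFICE_BACKENDS = {".docx": "python-docx", ".pptx": "python-pptx"}
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def choose_backend(path: str, has_text_layer: bool = True) -> str:
    """Pick an extraction backend from the file suffix and text-layer flag."""
    suffix = Path(path).suffix.lower()
    if suffix in OFFICE_BACKENDS:
        return OFFICE_BACKENDS[suffix]
    if suffix in IMAGE_SUFFIXES:
        return "marker-pdf"  # images carry no text layer, so OCR is needed
    if suffix == ".pdf":
        # Text-based PDFs take the fast path; scanned PDFs go to OCR.
        return "pymupdf" if has_text_layer else "marker-pdf"
    raise ValueError(f"unsupported format: {suffix or path}")


def pdf_has_text_layer(path: str, min_chars: int = 20) -> bool:
    """Return True if any page yields at least min_chars of embedded text."""
    import pymupdf  # imported lazily so the router works without it installed

    with pymupdf.open(path) as doc:
        return any(len(page.get_text().strip()) >= min_chars for page in doc)
```

For a local PDF, a caller might combine the two as `choose_backend(path, pdf_has_text_layer(path))`.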

Quick Start

Use the ocr-and-documents skill to extract all text from the local file 'report.pdf'.
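For a text-based 'report.pdf', the lightweight path might look roughly like the sketch below. It assumes pymupdf is installed and importable as `pymupdf` (older releases use `fitz`); `join_pages` and `extract_pdf_text` are hypothetical helper names, and the page-marker format is an arbitrary choice, not something the skill prescribes.

```python
def join_pages(pages):
    """Join per-page text, prefixing each page with a visible marker."""
    return "\n\n".join(
        f"--- page {number} ---\n{text.strip()}"
        for number, text in enumerate(pages, start=1)
    )


def extract_pdf_text(path):
    """Extract embedded text from every page of a text-based PDF."""
    import pymupdf  # lazy import: only needed when a PDF is actually read

    with pymupdf.open(path) as doc:
        # The generator is fully consumed inside the with-block,
        # so the document is still open while pages are read.
        return join_pages(page.get_text() for page in doc)


if __name__ == "__main__":
    print(extract_pdf_text("report.pdf"))
```

A scanned PDF would return little or no text here, which is the signal to fall back to the OCR path.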

Dependency Matrix

Required Modules

  • pymupdf
  • pymupdf4llm
  • marker-pdf
  • python-docx
  • python-pptx
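Before running an extraction, it can help to check which of these modules are importable. The sketch below uses only the standard library; the package-to-import-name mapping reflects the conventional import names as an assumption (for example, marker-pdf is assumed to import as `marker`, and older PyMuPDF releases import as `fitz` rather than `pymupdf`), and `missing_packages` is a hypothetical helper name.

```python
from importlib.util import find_spec

# PyPI package name -> top-level import name (assumed conventional names).
MODULES = {
    "pymupdf": "pymupdf",
    "pymupdf4llm": "pymupdf4llm",
    "marker-pdf": "marker",
    "python-docx": "docx",
    "python-pptx": "pptx",
}


def missing_packages(modules=MODULES):
    """Return the package names whose import module cannot be found."""
    return [pkg for pkg, mod in modules.items() if find_spec(mod) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```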

Components

  • scripts
  • references

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: ocr-and-documents
Download link: https://github.com/NousResearch/hermes-agent/archive/main.zip#ocr-and-documents

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.