document-parsers

Community

Unlock any document's content.

Author: HokageZ
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill extracts and structures information locked inside document formats such as PDF, DOCX, HTML, and Markdown, so the data can be used for analysis and retrieval-augmented generation (RAG).

Core Features & Use Cases

  • Multi-Format Parsing: Handles PDFs, DOCX, HTML, and Markdown files.
  • Advanced Extraction: Supports text, tables, and metadata extraction.
  • AI-Powered Options: Optionally integrates with LlamaParse for higher accuracy on complex documents.
  • RAG Ready: Includes tools for document chunking suitable for embedding.
  • Use Case: You need to build a RAG system using a collection of research papers (PDFs) and technical documentation (HTML, DOCX). This Skill provides the tools to parse all these documents, extract relevant text and tables, and chunk them appropriately for your vector database.
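The chunking step mentioned above can be illustrated with a minimal sketch. The Skill's own chunking tools are not shown in this listing, so the function name `chunk_text` and its `chunk_size` and `overlap` parameters are hypothetical; the sketch simply splits text into fixed-size character windows that overlap, which is one common way to prepare text for embedding.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():  # skip whitespace-only windows
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary present in both neighbouring chunks, at the cost of some duplicated text in the vector database. A production pipeline would likely split on sentence or section boundaries instead of raw character counts.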

Quick Start

Use the document-parsers skill to extract all text and tables from the file 'report.pdf'.
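For orientation, extracting text and tables from a PDF with pdfplumber (one of the required modules listed below) might look roughly like the following sketch. This is not the Skill's actual script, which is not shown in this listing; the function names `extract_text_and_tables` and `clean_table` are hypothetical.

```python
from typing import Optional


def clean_table(table: list[list[Optional[str]]]) -> list[list[str]]:
    """Replace empty (None) cells with "" and strip surrounding whitespace."""
    return [[(cell or "").strip() for cell in row] for row in table]


def extract_text_and_tables(pdf_path: str) -> dict:
    """Collect page text and tables from a PDF (hypothetical helper)."""
    # Imported lazily so this module still loads when pdfplumber is not installed.
    import pdfplumber

    pages_text = []
    tables = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            # extract_text() can return nothing for pages without a text layer.
            pages_text.append(page.extract_text() or "")
            # extract_tables() yields each table as a list of rows of cells.
            for table in page.extract_tables():
                tables.append(clean_table(table))
    return {"text": "\n\n".join(pages_text), "tables": tables}
```

Calling `extract_text_and_tables("report.pdf")` would return a dictionary with the joined page text under `"text"` and the cleaned tables under `"tables"`. Scanned pages without a text layer would need OCR (for example via pytesseract and pdf2image), which this sketch does not cover.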

Dependency Matrix

Required Modules

pypdf2, pdfplumber, python-docx, beautifulsoup4, lxml, unstructured[local-inference], pytesseract, pdf2image, llama-parse, llama-index-core

Components

scripts, references, templates, examples

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: document-parsers
Download link: https://github.com/HokageZ/JOB-HUNTER/archive/main.zip#document-parsers

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.