pdf-asset-extractor
Community
Turn PDFs into searchable assets.
Author: u9401066
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
It decomposes PDFs into queryable assets (images, tables, and text) and builds a cross-document knowledge graph, so that AI agents can retrieve precise information across documents.
Core Features & Use Cases
- Ingest PDFs with dual-engine extraction (PyMuPDF for speed and Marker for high-precision structure) to output figures, tables, and sections.
- Build a knowledge graph and Mermaid diagrams for cross-document reasoning and visualization.
- Retrieve assets on demand (images as base64, tables as Markdown, sections and full text) to support RAG-enabled workflows.
- Use Case: Automate extraction from a batch of research papers to populate a searchable knowledge base and enable cross-document queries.
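The on-demand retrieval described above (images as base64, tables as Markdown, sections as text) can be pictured with a small sketch. This is not the skill's actual API: the `Asset` fields, the `AssetStore` class, and the function names are all hypothetical, and the example uses only the Python standard library rather than PyMuPDF or Marker.

```python
import base64
from dataclasses import dataclass


@dataclass
class Asset:
    """One extracted item. Field names are hypothetical, not the skill's schema."""
    doc_id: str
    asset_id: str
    kind: str        # "image", "table", or "section"
    payload: object  # bytes for images, list of rows for tables, str for sections


def table_to_markdown(rows):
    """Render a list of rows (first row is the header) as a Markdown table."""
    header, *body = rows
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


def render_asset(asset):
    """Return the asset in a text form an agent could consume."""
    if asset.kind == "image":
        # Raw image bytes become an ASCII base64 string.
        return base64.b64encode(asset.payload).decode("ascii")
    if asset.kind == "table":
        return table_to_markdown(asset.payload)
    return str(asset.payload)


class AssetStore:
    """In-memory lookup keyed by (doc_id, asset_id); an assumption for illustration."""

    def __init__(self):
        self._assets = {}

    def add(self, asset):
        self._assets[(asset.doc_id, asset.asset_id)] = asset

    def get(self, doc_id, asset_id):
        return render_asset(self._assets[(doc_id, asset_id)])
```

Keying assets by document and asset ID lets an agent ask for one figure or table at a time instead of loading a whole PDF into context, which is the point of retrieving assets on demand.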
Quick Start
Ingest a PDF to extract figures, tables, and text and build a knowledge graph for cross-document reasoning.
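As a rough illustration of the knowledge-graph step, the sketch below links documents to the concepts they mention, finds concepts shared across documents, and renders the result as a Mermaid diagram. The function names and the input shape (a mapping from document ID to a list of concepts) are assumptions made for this example. The skill's real graph model is not documented here and may differ.

```python
from collections import defaultdict


def build_graph(doc_concepts):
    """Map each concept to the set of documents that mention it."""
    concept_docs = defaultdict(set)
    for doc_id, concepts in doc_concepts.items():
        for concept in concepts:
            concept_docs[concept].add(doc_id)
    return concept_docs


def shared_concepts(concept_docs):
    """Return concepts mentioned by more than one document, sorted by name."""
    return sorted(c for c, docs in concept_docs.items() if len(docs) > 1)


def to_mermaid(concept_docs):
    """Render document-to-concept edges as a Mermaid flowchart.

    Assumes document IDs are already valid Mermaid node IDs
    (letters, digits, underscores).
    """
    lines = ["graph LR"]
    for concept in sorted(concept_docs):
        node_id = "c_" + concept.replace(" ", "_")
        for doc_id in sorted(concept_docs[concept]):
            lines.append(f'    {doc_id} -->|mentions| {node_id}["{concept}"]')
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up input for illustration only.
    graph = build_graph({"doc_a": ["dose", "sedation"], "doc_b": ["dose"]})
    print(shared_concepts(graph))
    print(to_mermaid(graph))
```

A concept that appears under more than one document is where a cross-document query would start: the agent can follow the edges back to each source and fetch only the relevant assets.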
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: pdf-asset-extractor
Download link: https://github.com/u9401066/asset-aware-mcp/archive/main.zip#pdf-asset-extractor
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.