knowledge-paper-extractor
Official
Extract metadata and references from scientific PDFs.
Author: skaile-ai
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Extracts structured metadata (title, authors, abstract) and bibliographic references from scientific PDFs.
Core Features & Use Cases
- Metadata extraction: detects title, authors, and abstract from the first page using heuristic analysis with pdfminer.six and pulls embedded PDF metadata via pypdf.
- References extraction: collects cited references using refextract and prepares them for downstream resolution.
- Output generation: creates metadata.json, references.json, references.csl.json, and summary.md to support literature reviews and knowledge management.
- Use Case: researchers processing literature corpora, building bibliographies, or preparing systematic reviews.
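The first-page heuristic described in the metadata bullet above can be sketched roughly as follows. This is an illustration under stated assumptions, not the skill's actual implementation: the function names `read_pdf_inputs` and `guess_first_page_fields` are hypothetical, and the rules (first non-blank line is the title, second is the author line, abstract follows an "Abstract" heading) are deliberately simplistic. The only library calls used are `pdfminer.high_level.extract_text` and `pypdf.PdfReader(...).metadata`.

```python
import re


def read_pdf_inputs(pdf_path: str):
    """Return (first-page text, embedded metadata dict) for a PDF.

    Hypothetical helper; libraries are imported lazily so the heuristic
    below can be tried on plain text without them installed.
    """
    from pdfminer.high_level import extract_text
    from pypdf import PdfReader

    text = extract_text(pdf_path, maxpages=1)  # first page only
    info = PdfReader(pdf_path).metadata  # may be None
    embedded = {
        "title": info.title if info else None,
        "author": info.author if info else None,
    }
    return text, embedded


def guess_first_page_fields(text: str) -> dict:
    """Guess title, authors, and abstract from first-page text.

    Assumed rules (not taken from the skill): line 1 is the title,
    line 2 lists the authors, the abstract follows an 'Abstract' heading.
    """
    nonblank = [ln.strip() for ln in text.splitlines() if ln.strip()]
    title = nonblank[0] if nonblank else ""

    authors = []
    if len(nonblank) > 1:
        # Split the author line on commas and the word "and".
        parts = re.split(r",|\band\b", nonblank[1])
        authors = [p.strip() for p in parts if p.strip()]

    # Capture text after "Abstract" up to a blank line, an
    # "Introduction" heading, or the end of the text.
    match = re.search(
        r"(?is)\babstract\b[:.\s]*(.+?)"
        r"(?=\n\s*\n|\n\s*(?:1\.?\s+)?introduction\b|\Z)",
        text,
    )
    abstract = " ".join(match.group(1).split()) if match else ""
    return {"title": title, "authors": authors, "abstract": abstract}
```

Real papers vary widely in layout, so a production heuristic would likely also weigh font sizes and positions from pdfminer.six layout objects and fall back to the embedded metadata when the guess looks wrong.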
Quick Start
Provide a PDF file and an output directory to generate the structured outputs (metadata.json, references.json, references.csl.json, and summary.md).
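As a rough sketch of what producing those four files could look like, the example below writes them with the standard library only. Everything here is an assumption for illustration: `to_csl_item` and `write_outputs` are hypothetical names, the reference dictionaries are assumed to carry list-valued keys such as `title`, `journal_title`, `year`, and `raw_ref` (similar in shape to what refextract returns), and the `summary.md` layout is invented. The skill's real schema may differ.

```python
import json
from pathlib import Path


def to_csl_item(ref: dict, index: int) -> dict:
    """Map one parsed reference to a minimal CSL-JSON item (assumed mapping)."""

    def first(key):
        # Values may be lists of strings or plain strings.
        value = ref.get(key)
        if isinstance(value, list):
            return value[0] if value else None
        return value

    # Assumption: every reference is treated as a journal article.
    item = {"id": f"ref-{index}", "type": "article-journal"}
    if first("title"):
        item["title"] = first("title")
    if first("journal_title"):
        item["container-title"] = first("journal_title")
    year = first("year")
    if year and str(year).isdigit():
        item["issued"] = {"date-parts": [[int(year)]]}
    if first("raw_ref"):
        item["note"] = first("raw_ref")
    return item


def write_outputs(metadata: dict, references: list, out_dir: str) -> list:
    """Write the four output files and return their names, sorted."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    def dump(name, payload):
        text = json.dumps(payload, indent=2, ensure_ascii=False)
        (out / name).write_text(text, encoding="utf-8")

    dump("metadata.json", metadata)
    dump("references.json", references)
    csl = [to_csl_item(r, i) for i, r in enumerate(references, start=1)]
    dump("references.csl.json", csl)

    # Hypothetical summary layout: title, authors, abstract, reference count.
    summary = [
        f"# {metadata.get('title') or 'Untitled'}",
        "",
        f"Authors: {', '.join(metadata.get('authors', []))}",
        "",
        metadata.get("abstract", ""),
        "",
        f"References extracted: {len(references)}",
    ]
    (out / "summary.md").write_text("\n".join(summary) + "\n", encoding="utf-8")
    return sorted(p.name for p in out.iterdir())
```

Keeping `references.json` as the raw parser output and deriving `references.csl.json` from it means the CSL mapping can be refined later without re-running extraction.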
Dependency Matrix
Required Modules
- typer>=0.12.0
- refextract>=0.2.5
- setuptools<81
- pdfminer.six>=20221105
- pypdf>=4.0
- httpx>=0.27.0
Components
- scripts
- references
- assets
💻 Claude Code Installation
Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: knowledge-paper-extractor
Download link: https://github.com/skaile-ai/ai-assets/archive/main.zip#knowledge-paper-extractor
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.