knowledge-paper-extractor

Official

Extract metadata and references from scientific PDFs.

Author: skaile-ai
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Extracts structured metadata (title, authors, abstract) and bibliographic references from scientific PDFs.

Core Features & Use Cases

  • Metadata extraction: detects title, authors, and abstract from the first page using heuristic analysis with pdfminer.six and pulls embedded PDF metadata via pypdf.
  • References extraction: collects cited references using refextract and prepares them for downstream resolution.
  • Output generation: creates metadata.json, references.json, references.csl.json, and summary.md to support literature reviews and knowledge management.
  • Use Case: researchers processing literature corpora, building bibliographies, or preparing systematic reviews.
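
The first-page heuristic described above might look roughly like the sketch below. It is not the skill's actual implementation: it assumes the first page has already been converted to plain text (the skill uses pdfminer.six for that step), and the function name `guess_first_page_metadata` and its rules (first non-empty line is the title, second is the author list, the abstract follows an "Abstract" heading) are illustrative assumptions.

```python
import re


def guess_first_page_metadata(page_text: str) -> dict:
    """Guess title, authors, and abstract from first-page text.

    Hypothetical heuristic for illustration only; real papers vary widely.
    """
    lines = [ln.strip() for ln in page_text.splitlines()]
    non_empty = [ln for ln in lines if ln]

    # Assumption: the first non-empty line is the title.
    title = non_empty[0] if non_empty else ""

    # Assumption: the second non-empty line lists the authors.
    author_line = non_empty[1] if len(non_empty) > 1 else ""
    if author_line.lower().startswith("abstract"):
        author_line = ""
    authors = [a.strip() for a in re.split(r",|;|\band\b", author_line) if a.strip()]

    # Collect text after an "Abstract" heading until a blank line
    # or the start of the next section.
    abstract_parts = []
    in_abstract = False
    for ln in lines:
        if not in_abstract:
            m = re.match(r"(?i)abstract\b[\s:.\-]*(.*)", ln)
            if m:
                in_abstract = True
                if m.group(1):
                    abstract_parts.append(m.group(1))
            continue
        if not ln:
            if abstract_parts:
                break
            continue
        if re.match(r"(?i)(keywords|1\.?\s+introduction|introduction)\b", ln):
            break
        abstract_parts.append(ln)

    return {"title": title, "authors": authors, "abstract": " ".join(abstract_parts)}


sample = (
    "Deep Widgets for Science\n"
    "Ada Lovelace, Alan Turing and Grace Hopper\n"
    "\n"
    "Abstract\n"
    "We study widgets.\n"
    "They are useful.\n"
    "\n"
    "1 Introduction\n"
    "Widgets have a long history.\n"
)
result = guess_first_page_metadata(sample)
print(result)
```

A heuristic like this fails on multi-line titles or layouts where affiliations precede the author names, which is presumably why the skill also reads embedded PDF metadata via pypdf.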

Quick Start

Provide a PDF file and an output directory to generate the structured outputs (metadata.json, references.json, references.csl.json, and summary.md).
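
As a rough illustration of the output step, the sketch below writes the four files named above into an output directory using only the Python standard library. The helper names (`to_csl_item`, `write_outputs`), the flat reference keys (`raw`, `title`, `year`), the CSL `type` value, and the layout of summary.md are all assumptions made for this example; the skill's real schemas and command-line interface are not documented here.

```python
import json
from pathlib import Path


def to_csl_item(ref: dict, index: int) -> dict:
    """Map one simplified reference dict to a minimal CSL-JSON item.

    The input keys ("raw", "title", "year") are hypothetical.
    """
    item = {"id": f"ref-{index}", "type": "article"}  # type is a placeholder
    if ref.get("title"):
        item["title"] = ref["title"]
    year = str(ref.get("year", ""))
    if year.isdigit():
        item["issued"] = {"date-parts": [[int(year)]]}
    if ref.get("raw"):
        item["note"] = ref["raw"]  # keep the raw citation string for later resolution
    return item


def write_outputs(metadata: dict, references: list, out_dir: str) -> Path:
    """Write metadata.json, references.json, references.csl.json, summary.md."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    def dump(name: str, payload) -> None:
        text = json.dumps(payload, indent=2, ensure_ascii=False)
        (out / name).write_text(text, encoding="utf-8")

    dump("metadata.json", metadata)
    dump("references.json", references)
    dump("references.csl.json",
         [to_csl_item(r, i) for i, r in enumerate(references, start=1)])

    # Assumed summary layout: title, authors, reference count.
    summary_lines = [
        f"# {metadata.get('title', 'Untitled')}",
        "",
        f"Authors: {', '.join(metadata.get('authors', [])) or 'unknown'}",
        f"References found: {len(references)}",
    ]
    (out / "summary.md").write_text("\n".join(summary_lines) + "\n", encoding="utf-8")
    return out
```

Calling `write_outputs({"title": "T", "authors": ["A B"]}, [{"raw": "X. 2010.", "title": "X", "year": "2010"}], "out")` would create the four files under `out/`.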

Dependency Matrix

Required Modules

typer>=0.12.0
refextract>=0.2.5
setuptools<81
pdfminer.six>=20221105
pypdf>=4.0
httpx>=0.27.0

Components

scripts, references, assets

💻 Claude Code Installation

Recommended: let Claude install the Skill automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: knowledge-paper-extractor
Download link: https://github.com/skaile-ai/ai-assets/archive/main.zip#knowledge-paper-extractor

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
