Name: qianfanocr-document-intelligence
Author: baidubce

System Documentation

What problem does it solve?

Analyzes images, image URLs, PDFs, and PDF URLs so that agents can recognize text, extract content, and answer questions about visual inputs. It coordinates token setup, mode selection, and downstream tooling to produce structured results for agents.
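As a rough illustration of the four input types named above, the sketch below sorts a source string into image, image URL, PDF, or PDF URL. The names `InputKind` and `classify_input` are hypothetical and not part of the skill; the rule that anything not ending in `.pdf` is treated as an image is also an assumption made for brevity.

```python
from enum import Enum
from urllib.parse import urlparse


class InputKind(Enum):
    """The four input types the skill description lists (names are hypothetical)."""
    IMAGE = "image"
    IMAGE_URL = "image_url"
    PDF = "pdf"
    PDF_URL = "pdf_url"


def classify_input(source: str) -> InputKind:
    """Classify a local path or http(s) URL by scheme and file extension."""
    parsed = urlparse(source)
    is_url = parsed.scheme in ("http", "https")
    # For URLs, look only at the path so query strings do not hide the extension.
    path = parsed.path if is_url else source
    is_pdf = path.lower().endswith(".pdf")
    if is_url:
        return InputKind.PDF_URL if is_pdf else InputKind.IMAGE_URL
    # Assumption: every non-PDF local file is treated as an image.
    return InputKind.PDF if is_pdf else InputKind.IMAGE
```

A real integration would probably inspect the content type as well, since URLs do not always carry a file extension.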

Core Features & Use Cases

Supports images and PDFs as input and returns per-page outputs, with layout-aware parsing that preserves document structure.
Provides eight modes: document parsing, layout analysis, element recognition, document parsing with layout, general OCR, key information extraction, chart understanding, and document visual question answering (doc VQA). References and assets are loaded as needed.
Use Case: automate extraction of key fields from documents (invoices, contracts) and generate structured data for downstream automation.
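The mode list above could be modeled as an enumeration with a simple selection step. This is only a sketch: the `Mode` enum mirrors the mode names from the description, while `pick_mode` and its keyword hints are hypothetical and do not reflect how the skill actually chooses a mode.

```python
from enum import Enum


class Mode(Enum):
    """The eight modes named in the skill description."""
    DOCUMENT_PARSING = "document parsing"
    LAYOUT_ANALYSIS = "layout analysis"
    ELEMENT_RECOGNITION = "element recognition"
    DOCUMENT_PARSING_WITH_LAYOUT = "document parsing with layout"
    GENERAL_OCR = "general OCR"
    KEY_INFORMATION_EXTRACTION = "key information extraction"
    CHART_UNDERSTANDING = "chart understanding"
    DOC_VQA = "doc vqa"


# Hypothetical keyword hints, checked in order; the first match wins.
_HINTS = [
    ("?", Mode.DOC_VQA),
    ("chart", Mode.CHART_UNDERSTANDING),
    ("field", Mode.KEY_INFORMATION_EXTRACTION),
    ("invoice", Mode.KEY_INFORMATION_EXTRACTION),
    ("layout", Mode.LAYOUT_ANALYSIS),
]


def pick_mode(request: str) -> Mode:
    """Pick a mode from a free-text request, falling back to general OCR."""
    text = request.lower()
    for keyword, mode in _HINTS:
        if keyword in text:
            return mode
    return Mode.GENERAL_OCR
```

Under these assumed hints, a request such as "Extract the fields from this invoice" would route to key information extraction, which matches the invoice and contract use case described above.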

Quick Start

Provide an image or PDF and the skill will orchestrate OCR and document understanding to return a structured result.

Please help me install this Skill:
Name: qianfanocr-document-intelligence
Download link: https://github.com/baidubce/skills/archive/main.zip#qianfanocr-document-intelligence
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
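For readers who prefer to perform the extract-and-install step by hand, here is a minimal sketch. It assumes the .zip has already been downloaded and that the archive contains a folder named after the skill somewhere in its tree; the exact layout of the repository is not stated in this document, so the function searches for that folder instead of hard-coding a path. The function name `install_skill` is hypothetical.

```python
import zipfile
from pathlib import Path, PurePosixPath

SKILL_NAME = "qianfanocr-document-intelligence"


def install_skill(zip_path, skills_dir=Path(".claude/skills"), name=SKILL_NAME) -> Path:
    """Copy the skill's folder out of the archive into skills_dir/name."""
    target = Path(skills_dir) / name
    copied = 0
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            parts = PurePosixPath(info.filename).parts
            # Keep only files located inside a folder named after the skill.
            if info.is_dir() or name not in parts:
                continue
            relative = parts[parts.index(name) + 1:]
            # Skip empty remainders and any path that tries to climb upward.
            if not relative or ".." in relative:
                continue
            destination = target.joinpath(*relative)
            destination.parent.mkdir(parents=True, exist_ok=True)
            destination.write_bytes(archive.read(info))
            copied += 1
    if copied == 0:
        raise FileNotFoundError(f"no folder named {name!r} found in {zip_path}")
    return target
```

Calling `install_skill("main.zip")` from the project root would place the files under `.claude/skills/qianfanocr-document-intelligence/`, the directory named in the install request.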
