glmocr-sdk

Name: glmocr-sdk
Availability: InStock
Author: zai-org

Official

Instant structured extraction from images and PDFs

Data & Analytics #ocr #pdf #image #data-extraction #table-extraction #layout-detection #glmocr

Version: 1.0.0

Installs: 0

System Documentation

What problem does it solve?

This Skill turns scanned pages, screenshots, and PDFs into machine-readable outputs so agents and pipelines can extract text, tables, formulas, and layout regions without manual copying or rekeying.

Core Features & Use Cases

OCR with layout awareness: Produces labeled regions (title, text, table, formula, figure, etc.) with normalized bounding boxes on a 0–1000 scale.
Dual interfaces: Offers a one-line Python API and a CLI that handles batch processing, writes results to stdout by default, and pipes cleanly into tools like jq.
Rich serialization and visualization: Exports JSON and Markdown, saves cropped images and optional layout visualizations, and runs in either MaaS/cloud or self-hosted mode to suit different deployment needs.
Use Case: Convert a multi-page research paper or a folder of invoice scans into structured JSON for downstream analytics and summarization.
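
Because bounding boxes are reported on a normalized 0–1000 scale, they have to be mapped back to pixels before cropping or drawing on the source image. The following is a minimal sketch of that conversion. The `[x1, y1, x2, y2]` corner order and the helper name `denormalize_bbox` are assumptions made for illustration and are not confirmed by this listing.

```python
from typing import Sequence, Tuple


def denormalize_bbox(
    bbox: Sequence[float], width: int, height: int, scale: int = 1000
) -> Tuple[int, int, int, int]:
    """Map a box on a 0..scale grid to pixel coordinates.

    Assumes the box is ordered [x1, y1, x2, y2] (hypothetical layout).
    """
    x1, y1, x2, y2 = bbox
    return (
        round(x1 * width / scale),
        round(y1 * height / scale),
        round(x2 * width / scale),
        round(y2 * height / scale),
    )


# A box covering the top-left quarter of a 2000 x 3000 pixel page.
print(denormalize_bbox([0, 0, 500, 500], width=2000, height=3000))
# prints (0, 0, 1000, 1500)
```

Keeping the page size as an explicit argument means the same region list can be reused across renderings of the page at different resolutions.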

Quick Start

Call the glmocr CLI or the Python API to parse 'document.pdf' and return JSON regions plus a Markdown version.
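
Once the JSON regions are available, a common next step is grouping them by label, much as one might with jq. The sketch below shows that step in Python on a hand-written sample. The exact output schema is not given in this listing, so the `label` and `content` keys, the sample data, and the `group_by_label` helper are all hypothetical and should be adjusted to the real output.

```python
import json
from collections import defaultdict
from typing import Dict, List

# Hypothetical sample standing in for the tool's JSON output.
SAMPLE_JSON = """
[
  {"label": "title", "content": "Quarterly Report"},
  {"label": "text", "content": "Revenue grew in Q3."},
  {"label": "table", "content": "| Q2 | Q3 |"},
  {"label": "text", "content": "Costs were flat."}
]
"""


def group_by_label(regions: List[dict]) -> Dict[str, List[str]]:
    """Collect region contents under their label, preserving order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for region in regions:
        # Assumed keys: "label" and "content" (not confirmed by the source).
        grouped[region["label"]].append(region.get("content", ""))
    return dict(grouped)


regions = json.loads(SAMPLE_JSON)
grouped = group_by_label(regions)
print(grouped["text"])
# prints ['Revenue grew in Q3.', 'Costs were flat.']
```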
