glmv-caption

Name: glmv-caption
Availability: InStock
Author: zai-org

Official

Accurate captions for images, videos, and files

Design & Creative #image #multimodal #video #document #captioning #zhipu #glm-v

Authorzai-org

Version1.0.0

Installs0

System Documentation

What problem does it solve?

Quickly obtain clear, detailed descriptions and summaries of visual and document content so users do not need to manually inspect or articulate the contents of images, videos, or files.

Core Features & Use Cases

Multimodal captioning: Generate natural-language captions for images, videos, and documents using the ZhiPu GLM-V API.
Flexible inputs: Accepts image URLs, local images encoded as base64, and file/video URLs with validation for formats and sizes.
Production-friendly output: Returns raw model outputs and token usage, supports streaming, custom prompts, and saves results to JSON for auditing.
Use Case: Content creators and accessibility teams can automatically produce image alt text, reporters can summarize video content, and researchers can extract visual summaries from batches of files.

Quick Start

Provide an image URL or upload an image and ask the skill to "Generate a detailed caption describing the visual content and salient elements."

glmv-caption

System Documentation

What problem does it solve?

Core Features & Use Cases

Quick Start

Dependency Matrix

Required Modules

Components

💻 Claude Code Installation

Agent Skills Search Helper