glmv-caption
Official
Accurate captions for images, videos, and files
Author: zai-org
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Quickly obtain clear, detailed descriptions and summaries of visual and document content so users do not need to manually inspect or articulate the contents of images, videos, or files.
Core Features & Use Cases
- Multimodal captioning: Generate natural-language captions for images, videos, and documents using the ZhiPu GLM-V API.
- Flexible inputs: Accepts image URLs, local images encoded as base64, and file/video URLs with validation for formats and sizes.
- Production-friendly output: Returns raw model outputs and token usage, supports streaming, custom prompts, and saves results to JSON for auditing.
- Use Case: Content creators and accessibility teams can automatically produce image alt text; reporters can summarize video content; researchers can extract visual summaries from batches of files.
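The input handling and JSON auditing described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the helper names `image_to_data_url` and `save_result`, the allowed-type set, and the 5 MB size cap are all assumptions made for the example, and no GLM-V API call is shown.

```python
import base64
import json
import mimetypes
from pathlib import Path

# Hypothetical limits for illustration only; the skill's real
# format and size rules are not stated in this listing.
ALLOWED_IMAGE_TYPES = {"image/jpeg", "image/png"}
MAX_IMAGE_BYTES = 5 * 1024 * 1024


def image_to_data_url(path):
    """Validate a local image and return it as a base64 data URL."""
    p = Path(path)
    # Guess the MIME type from the file extension alone.
    mime, _ = mimetypes.guess_type(p.name)
    if mime not in ALLOWED_IMAGE_TYPES:
        raise ValueError(f"unsupported image type: {p.suffix or '(none)'}")
    data = p.read_bytes()
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError(f"image too large: {len(data)} bytes")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def save_result(caption, usage, out_path):
    """Write a caption and its token usage to a JSON file for auditing."""
    record = {"caption": caption, "usage": usage}
    Path(out_path).write_text(
        json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return record
```

Checking the type and size before encoding keeps obviously invalid inputs from ever reaching the network, and writing each result to its own JSON file leaves a record that can be reviewed later.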
Quick Start
Provide an image URL or upload an image and ask the skill to "Generate a detailed caption describing the visual content and salient elements."
Dependency Matrix
Required Modules: requests
Components: scripts
💻 Claude Code Installation
Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: glmv-caption
Download link: https://github.com/zai-org/GLM-skills/archive/main.zip#glmv-caption
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.