Name: google-gemini-media
Author: ShenWang96

System Documentation

What problem does it solve?

Orchestrating Gemini's image, video, and audio capabilities across both generation and understanding is complex. This Skill consolidates them into reusable workflows and templates for end-to-end media production and analysis.

Core Features & Use Cases

Image generation (Nano Banana) to produce high-fidelity visuals from prompts.
Image understanding (captioning, VQA, classification, multi-image prompts) to extract insights and metadata.
Video generation (Veo 3.1) to create short-form content with optional audio.
Video understanding (analyze YouTube/direct uploads) to summarize and extract timestamps.
Speech generation (TTS) for controllable narration.
Audio understanding (transcription, description, token counting) for media insights.
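As an illustration of the image understanding feature listed above, here is a minimal sketch of a captioning or VQA call. It assumes the `google-genai` Python SDK is installed and an API key is available in the `GEMINI_API_KEY` environment variable; the model name and the helper names `guess_image_mime` and `ask_about_image` are assumptions for illustration and are not taken from the Skill's own templates.

```python
import mimetypes
from pathlib import Path

# Assumption: any Gemini model that accepts image input would work here.
MODEL_NAME = "gemini-2.5-flash"


def guess_image_mime(path: str) -> str:
    """Return the image MIME type for a file name, or raise if it is not an image."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognized image file: {path}")
    return mime


def ask_about_image(path: str, question: str) -> str:
    """Send one image plus a text question to Gemini and return the text answer."""
    # Imported lazily so guess_image_mime works without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client()  # picks up the API key from the environment
    image_part = types.Part.from_bytes(
        data=Path(path).read_bytes(),
        mime_type=guess_image_mime(path),
    )
    response = client.models.generate_content(
        model=MODEL_NAME,
        contents=[image_part, question],
    )
    return response.text
```

Calling `ask_about_image("photo.png", "Write a one-sentence caption.")` would return the model's answer as a string. The same request shape can carry several image parts for multi-image prompts.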

Quick Start

Provide a prompt for Gemini and use the included templates to generate media assets and the corresponding understanding outputs.

Please help me install this Skill:
Name: google-gemini-media
Download link: https://github.com/ShenWang96/clawdbot_workspace_backup/archive/main.zip#google-gemini-media
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
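The installation request above amounts to three steps: download the archive, extract it, and copy the Skill folder into `.claude/skills/`. A rough sketch of doing this by hand follows. It assumes the archive contains a directory named `google-gemini-media` (suggested by the URL fragment, but not confirmed here); the function names are hypothetical.

```python
import io
import shutil
import tempfile
import urllib.request
import zipfile
from pathlib import Path

ARCHIVE_URL = "https://github.com/ShenWang96/clawdbot_workspace_backup/archive/main.zip"
SKILL_NAME = "google-gemini-media"


def install_skill_from_zip(zip_bytes: bytes, skills_dir: Path,
                           skill_name: str = SKILL_NAME) -> Path:
    """Extract the archive and copy the `skill_name` directory into `skills_dir`."""
    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
            archive.extractall(tmp)
        # Assumption: the Skill lives in a folder named after the URL fragment.
        matches = [p for p in Path(tmp).rglob(skill_name) if p.is_dir()]
        if not matches:
            raise FileNotFoundError(f"no '{skill_name}' directory found in archive")
        # Prefer the shallowest match if the name appears more than once.
        source = min(matches, key=lambda p: len(p.parts))
        target = skills_dir / skill_name
        skills_dir.mkdir(parents=True, exist_ok=True)
        if target.exists():
            shutil.rmtree(target)  # replace any previous install
        shutil.copytree(source, target)
    return target


def download_and_install(skills_dir: Path = Path(".claude/skills")) -> Path:
    """Download the archive from ARCHIVE_URL and install the Skill."""
    with urllib.request.urlopen(ARCHIVE_URL) as response:
        return install_skill_from_zip(response.read(), skills_dir)
```

Running `download_and_install()` from a project root would place the Skill under `.claude/skills/google-gemini-media`. Keeping the extraction step separate from the download makes it possible to install from an archive that is already on disk.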

