gemini-vision

Community

Unlock image insights with Gemini Vision API.

Author: einverne
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Extracting meaningful information from images and documents often requires specialized AI models and complex setup, which makes vision capabilities hard for developers to integrate. This skill provides a streamlined way to use Google Gemini's vision models for tasks such as image understanding, object detection, and document analysis.

Core Features & Use Cases

  • Comprehensive Image Analysis: Generate descriptive captions, classify content, answer visual questions, and compare multiple images for nuanced insights.
  • Advanced Visual AI: Use object detection (bounding boxes) and segmentation (pixel-level masks) for precise visual analysis. Availability of these features depends on the selected model.
  • Document Understanding: Process PDF documents, extracting text and analyzing visual elements within them, supporting up to 1,000 pages for large reports or contracts.
  • Flexible Input & Models: Supports PNG and JPEG images as well as PDF documents, and allows selection of Gemini models (Pro, Flash, Lite) based on speed, capability, and cost requirements.
  • Use Case: Automatically identify and count specific objects in a series of inspection photos, extract key data points from scanned invoices, or generate detailed descriptions for product images in an e-commerce catalog.
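For the object detection use case, the coordinates returned by the model usually need to be mapped back onto the original image. The sketch below assumes the convention described in Gemini's image-understanding documentation, where each box is given as [ymin, xmin, ymax, xmax] normalized to a 0-1000 range. Whether this skill passes boxes through in that form is an assumption, and the names PixelBox and to_pixel_box are hypothetical helpers, not part of the skill.

```python
from typing import NamedTuple, Sequence


class PixelBox(NamedTuple):
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    left: int
    top: int
    right: int
    bottom: int


def to_pixel_box(
    box_2d: Sequence[float],
    image_width: int,
    image_height: int,
    scale: int = 1000,
) -> PixelBox:
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel coordinates.

    Assumes the box is normalized to the range 0..scale (1000 by default).
    """
    if len(box_2d) != 4:
        raise ValueError("box_2d must have exactly four values")
    ymin, xmin, ymax, xmax = box_2d
    return PixelBox(
        left=round(xmin * image_width / scale),
        top=round(ymin * image_height / scale),
        right=round(xmax * image_width / scale),
        bottom=round(ymax * image_height / scale),
    )


if __name__ == "__main__":
    # A box covering the middle of a 1000x500 image.
    print(to_pixel_box([100, 200, 500, 800], image_width=1000, image_height=500))
    # PixelBox(left=200, top=50, right=800, bottom=250)
```

Counting objects in a batch of inspection photos then reduces to counting the boxes returned per image, optionally filtered by label.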

Quick Start

Use the gemini-vision skill to describe the image located at 'path/to/my_image.jpg'.
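The skill's own scripts are not reproduced on this page, so the following is only a sketch of the kind of request such a prompt could translate into, using the google-genai package listed under Required Modules. The model name "gemini-2.5-flash", the GEMINI_API_KEY environment variable, and the helper names guess_mime_type and describe_image are assumptions made for illustration.

```python
import os
from pathlib import Path

# Formats named in the feature list above; the mapping itself is an assumption.
MIME_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".pdf": "application/pdf",
}


def guess_mime_type(path: str) -> str:
    """Return the MIME type for a supported file, based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in MIME_TYPES:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    return MIME_TYPES[suffix]


def describe_image(
    path: str,
    prompt: str = "Describe this image.",
    model: str = "gemini-2.5-flash",  # assumed model name; pick Pro/Flash/Lite as needed
) -> str:
    """Send one image (or PDF) plus a text prompt to Gemini and return the reply text."""
    # Imported lazily so the helpers above work without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    file_part = types.Part.from_bytes(
        data=Path(path).read_bytes(),
        mime_type=guess_mime_type(path),
    )
    response = client.models.generate_content(model=model, contents=[file_part, prompt])
    return response.text


if __name__ == "__main__":
    print(describe_image("path/to/my_image.jpg"))
```

Inline bytes suit small files; for large PDFs a file-upload route would likely be a better fit, which this sketch does not cover.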

Dependency Matrix

Required Modules

google-genai, requests

Components

scripts, references

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: gemini-vision
Download link: https://github.com/einverne/dotfiles/archive/main.zip#gemini-vision

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
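For a manual install, the extraction step can be sketched as below. The archive is a snapshot of the whole dotfiles repository, and the location of the gemini-vision folder inside it is not stated here, so the sketch searches for a directory with that name instead of hard-coding a path. The function name extract_skill is hypothetical, and the sketch assumes the archive has already been downloaded to a local file.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_skill(zip_path: str, skill_name: str, skills_dir: str) -> list[Path]:
    """Copy every file under a directory named `skill_name` in the archive
    into `skills_dir/skill_name`, preserving the layout below that directory.

    Returns the list of files written.
    """
    target_root = Path(skills_dir) / skill_name
    written: list[Path] = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            parts = PurePosixPath(info.filename).parts
            # Keep only files that live somewhere below a `skill_name` directory.
            if skill_name not in parts[:-1]:
                continue
            relative = parts[parts.index(skill_name) + 1:]
            if ".." in relative:
                continue  # skip entries that try to escape the target directory
            destination = target_root.joinpath(*relative)
            destination.parent.mkdir(parents=True, exist_ok=True)
            destination.write_bytes(archive.read(info))
            written.append(destination)
    return written


if __name__ == "__main__":
    files = extract_skill("main.zip", "gemini-vision", ".claude/skills")
    print(f"Installed {len(files)} files")
```

If the function returns an empty list, the archive contains no directory with that name and the layout should be checked by hand.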