nvidia-vlm

Community

AI-powered vision language model skill.

AuthorZenodia
Version1.0.0
Installs0

System Documentation

What problem does it solve?

This Skill empowers AI agents to understand and interpret visual information from images, bridging the gap between visual data and natural language processing.

Core Features & Use Cases

  • Image Analysis: Analyze images to understand their content, scene, and context.
  • Detailed Descriptions: Generate comprehensive descriptions of images, including objects, colors, and text.
  • OCR: Extract text from images for data processing and accessibility.
  • Visual Question Answering: Answer specific questions about image content.
  • Use Case: Upload a photo of a street scene and ask the AI to identify all the vehicles and their colors, or to read any signs present.

Quick Start

Use the nvidia-vlm skill to analyze the image located at 'path/to/your/image.jpg'.

Dependency Matrix

Required Modules

openaiPyYAMLlangchainpydanticPillow

Components

scriptsassets

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: nvidia-vlm
Download link: https://github.com/Zenodia/agentic-context-engineering-optimization/archive/main.zip#nvidia-vlm

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
View Source Repository

Agent Skills Search Helper

Install a tiny helper to your Agent, search and equip skill from 471,000+ vetted skills library on demand.