media-tag

Community

AI-powered image/video tagging and captions.

Author: damionrashford
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This skill automates image and video tagging, captioning, and semantic classification by orchestrating multiple open-source vision-language models (CLIP, SigLIP, BLIP-2, and LLaVA) to enrich media metadata for catalogs, accessibility, and search.

Core Features & Use Cases

  • Tag images with labels from custom vocabularies or model predictions
  • Generate captions for images (WCAG-style alt text) and per-frame video narration
  • Build and query semantic search indexes over media folders to enable fast retrieval
  • Bulk-tag a directory of media files into CSV with scores for downstream cataloging
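
The tagging and search features above both come down to comparing embedding vectors. The following is a minimal sketch of that idea, not the skill's actual implementation: it assumes an image embedding and one text embedding per vocabulary label have already been produced by a model such as CLIP or SigLIP. The function names (`cosine`, `score_labels`, `search`), the toy two-dimensional vectors, and the scale factor of 100 are illustrative assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score_labels(image_vec, label_vecs, scale=100.0):
    """Rank vocabulary labels for one image.

    Scores are a softmax over scaled cosine similarities, so they sum to 1.
    The scale value is an assumption, not a setting taken from the skill.
    """
    labels = list(label_vecs)
    logits = [scale * cosine(image_vec, label_vecs[name]) for name in labels]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    pairs = [(name, e / total) for name, e in zip(labels, exps)]
    return sorted(pairs, key=lambda p: p[1], reverse=True)


def search(query_vec, index, k=5):
    """Return the k entries of a {filename: embedding} index closest to the query."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in index.items()]
    return sorted(scored, key=lambda p: p[1], reverse=True)[:k]


# Toy embeddings standing in for real model output (hypothetical values).
image_vec = [1.0, 0.0]
label_vecs = {"cat": [0.9, 0.1], "car": [0.0, 1.0]}
ranked = score_labels(image_vec, label_vecs)

index = {"beach.jpg": [0.8, 0.2], "street.jpg": [0.1, 0.9]}
hits = search([1.0, 0.1], index, k=1)
```

In a real run the vectors would have hundreds of dimensions and the index would typically be stored as a matrix so that search is a single matrix product.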

Quick Start

Tag a batch of images in a folder with open-source vision-language models and write the results to a CSV with image, label, and score columns.
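
As a rough illustration of that workflow, the sketch below walks a folder and writes one CSV row per image and label. It is not the skill's bundled script: `bulk_tag`, `IMAGE_EXTS`, the `top_k` parameter, and the `fake_tagger` stand-in are all hypothetical, and a real tagger would call a vision-language model instead of returning fixed scores.

```python
import csv
import tempfile
from pathlib import Path

# File extensions treated as images (an assumption for this sketch).
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}


def bulk_tag(folder, out_csv, tagger, top_k=3):
    """Tag every image in `folder` and write image,label,score rows.

    `tagger` is any callable that takes a Path and returns (label, score) pairs.
    Returns the number of data rows written.
    """
    rows = 0
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["image", "label", "score"])
        for path in sorted(Path(folder).iterdir()):
            if path.suffix.lower() not in IMAGE_EXTS:
                continue  # skip non-image files
            ranked = sorted(tagger(path), key=lambda p: p[1], reverse=True)
            for label, score in ranked[:top_k]:
                writer.writerow([path.name, label, f"{score:.4f}"])
                rows += 1
    return rows


def fake_tagger(path):
    """Stand-in for a model call; returns fixed scores for any image."""
    return [("outdoor", 0.2), ("portrait", 0.7), ("night", 0.1)]


# Demo on empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.jpg", "b.png", "notes.txt"):
        (Path(tmp) / name).write_bytes(b"")
    out_csv = Path(tmp) / "tags.csv"
    written = bulk_tag(tmp, out_csv, fake_tagger, top_k=2)
    lines = out_csv.read_text(encoding="utf-8").splitlines()
```

Passing the tagger in as a callable keeps the file handling separate from the model, so the same loop works with any of the models the skill orchestrates.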

Dependency Matrix

Required Modules

open_clip_torch, transformers, torch, sentence_transformers, opencv-python, numpy, Pillow
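
Several of the required modules are imported under a different name than the one used to install them (for example, opencv-python is imported as cv2 and Pillow as PIL). The sketch below is one way to check which of them are available in the current environment before running the skill. The `REQUIRED` mapping and `check_modules` helper are illustrative and not part of the skill.

```python
import importlib.util

# Install name -> import name for each required module.
REQUIRED = {
    "open_clip_torch": "open_clip",
    "transformers": "transformers",
    "torch": "torch",
    "sentence_transformers": "sentence_transformers",
    "opencv-python": "cv2",
    "numpy": "numpy",
    "Pillow": "PIL",
}


def check_modules(required=REQUIRED):
    """Return {install_name: bool} without importing the modules themselves."""
    status = {}
    for install_name, import_name in required.items():
        try:
            found = importlib.util.find_spec(import_name) is not None
        except (ImportError, ValueError):
            found = False
        status[install_name] = found
    return status


status = check_modules()
missing = sorted(name for name, ok in status.items() if not ok)
```

After running it, `missing` holds the install names of any modules that still need to be installed.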

Components

scripts, references

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: media-tag
Download link: https://github.com/damionrashford/media-os/archive/main.zip#media-tag

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
