Name: V-JEPA 2 Vision Transformer
Author: sovr610

System Documentation

What problem does it solve?

This Skill standardizes how to implement and probe the V-JEPA 2 Vision Transformer across image and video tasks, reducing integration friction and providing a repeatable workflow.

Core Features & Use Cases

Coverage of ViT variants (Tiny to Gigantic) with 2D and 3D patch embeddings
RoPE-based attention, cross-attention, and an AttentivePooler for downstream probing
Positional embedding interpolation, token masking, and activation checkpointing for large models
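
To make the RoPE-based attention item more concrete, here is a minimal one-dimensional sketch of rotary position embeddings in plain Python. It is an illustration under stated assumptions, not the Skill's implementation: the function names (rope_rotate, dot), the interleaved pairing of features, and the base of 10000 are all assumptions, and the Skill may apply rotations separately along temporal and spatial axes.

```python
import math


def rope_rotate(vec, position, base=10000.0):
    """Rotate consecutive feature pairs by position-dependent angles.

    Hypothetical helper: pairs (x0, x1), (x2, x3), ... are each rotated by
    position * base ** (-i / dim), where i is the index of the pair's first
    element. This is the common interleaved 1D RoPE formulation.
    """
    dim = len(vec)
    if dim % 2:
        raise ValueError("RoPE needs an even feature dimension")
    out = []
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out


def dot(a, b):
    """Plain dot product, standing in for an attention score."""
    return sum(x * y for x, y in zip(a, b))


# Illustrative query and key vectors (made-up values).
q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.3, -0.7, 1.0]

# Both pairs of positions are 2 apart, so the scores should match
# up to floating-point error.
score_near = dot(rope_rotate(q, 3), rope_rotate(k, 1))
score_far = dot(rope_rotate(q, 12), rope_rotate(k, 10))
```

The point of the sketch is the property that makes RoPE attractive for attention: the score between a rotated query and a rotated key depends only on the offset between their positions, not on the absolute positions.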

Quick Start

Instantiate a small ViT variant from the config factory and run a forward pass on a sample image to verify shapes.
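
Before running that forward pass, it can help to know which output shape to expect. The following is a rough sketch only: ViTConfig, expected_token_shape, and the default values (embedding width 384, patch size 16, tubelet size 2, 224x224 input, 16 frames) are hypothetical stand-ins rather than the Skill's actual config factory, and the calculation assumes non-overlapping patches with no class token.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViTConfig:
    """Hypothetical config; field names and defaults are assumptions."""
    embed_dim: int = 384
    patch_size: int = 16
    tubelet_size: int = 2


def expected_token_shape(cfg, batch, height, width, num_frames=None):
    """Return the (batch, tokens, embed_dim) shape a forward pass should give.

    Images use a 2D patch grid. Videos (num_frames given) use 3D tubelets
    that span cfg.tubelet_size frames each.
    """
    if height % cfg.patch_size or width % cfg.patch_size:
        raise ValueError("height and width must be divisible by patch_size")
    grid = (height // cfg.patch_size) * (width // cfg.patch_size)
    if num_frames is None:
        depth = 1
    else:
        if num_frames % cfg.tubelet_size:
            raise ValueError("num_frames must be divisible by tubelet_size")
        depth = num_frames // cfg.tubelet_size
    return (batch, depth * grid, cfg.embed_dim)


small = ViTConfig()
image_shape = expected_token_shape(small, batch=1, height=224, width=224)
video_shape = expected_token_shape(small, batch=1, height=224, width=224, num_frames=16)
print(image_shape)  # (1, 196, 384)
print(video_shape)  # (1, 1568, 384)
```

Comparing the tuple returned here against the shape of the model's output is a quick way to catch a mismatched patch size or frame count.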

To install this Skill, download the archive from https://github.com/sovr610/refffiy/archive/main.zip#v-jepa-2-vision-transformer, extract the .zip file, and install it in the .claude/skills/ directory.
