fleet-inference

Community

Access and query models across local and cloud fleets seamlessly.

Author: phyter1
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill lets you query any model in a fleet directly, whether the model runs locally or in the cloud. It aims to reduce latency and simplify model access for AI tasks.

Core Features & Use Cases

  • Model Discovery and Access: List the models available across providers such as MLX, Ollama, Cerebras, Groq, Gemini, and OpenRouter.
  • Flexible Querying: Send a prompt to a specific model, or let auto-routing pick the best available provider, for tasks such as inference, analysis, and generation.
  • Use Case: Run large language model inference on the local network or in the cloud, for example to generate content or analyze data, without manually switching between endpoints.
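The discovery and auto-routing behavior described above can be illustrated with a minimal sketch. This is not the Skill's actual implementation: `ModelEntry`, `list_models`, `route`, the model names, and the "prefer local, otherwise first available" policy are all hypothetical, chosen only to show one plausible meaning of "best available provider".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelEntry:
    """One model in the fleet (hypothetical structure, not the Skill's schema)."""
    provider: str           # e.g. "ollama" or "groq"
    model: str              # placeholder model name
    local: bool             # True if served on the local network
    available: bool = True  # False if the provider is currently unreachable


def list_models(fleet: List[ModelEntry]) -> List[ModelEntry]:
    """Model discovery: return only the entries that are currently available."""
    return [m for m in fleet if m.available]


def route(fleet: List[ModelEntry], model: Optional[str] = None) -> ModelEntry:
    """Pick a target for a query.

    If `model` is given, only that model is considered. Otherwise any
    available model is a candidate. Assumed policy: prefer local entries,
    then fall back to fleet order.
    """
    candidates = list_models(fleet)
    if model is not None:
        candidates = [m for m in candidates if m.model == model]
    if not candidates:
        raise LookupError(f"no available provider for model {model!r}")
    # min() returns the first minimal element, so fleet order breaks ties.
    return min(candidates, key=lambda m: not m.local)


fleet = [
    ModelEntry("groq", "model-a", local=False),
    ModelEntry("ollama", "model-b", local=True),
    ModelEntry("mlx", "model-c", local=True, available=False),
]

chosen = route(fleet)                # auto-routing
pinned = route(fleet, "model-a")     # explicit model
```

With this example fleet, auto-routing selects the local Ollama entry, while pinning `model-a` selects the Groq entry. A real router would likely also weigh factors such as load or cost, which this sketch ignores.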

Quick Start

Send a prompt to query a model, or list all available models, to integrate the Skill quickly into AI workflows.

Dependency Matrix

Required Modules

None required

Components

scripts, references

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: fleet-inference
Download link: https://github.com/phyter1/seed/archive/main.zip#fleet-inference

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
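For a manual install, the extract-and-copy step can be sketched as below. This assumes the downloaded archive contains a folder named `fleet-inference` somewhere in its tree; the actual layout of the repository archive is not stated here, so the helper `extract_skill` and its arguments are hypothetical and the paths may need adjusting.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_skill(archive_path, skill_name, dest_root):
    """Copy every file under a folder named `skill_name` inside the zip
    into dest_root/skill_name, and return the list of written paths.

    Assumption: the skill lives in a folder called `skill_name` at some
    depth in the archive.
    """
    dest = Path(dest_root) / skill_name
    written = []
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            parts = PurePosixPath(info.filename).parts
            if skill_name not in parts:
                continue
            # Keep only the path components below the skill folder.
            rel = parts[parts.index(skill_name) + 1:]
            if not rel or ".." in rel:
                continue  # skip odd entries and path-traversal attempts
            target = dest.joinpath(*rel)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(zf.read(info))
            written.append(target)
    return written


# Example call, assuming main.zip was already downloaded:
# extract_skill("main.zip", "fleet-inference", Path(".claude") / "skills")
```

The function only reads a local archive; downloading the file is left to the browser or to Claude Code.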
