Name: llama-slot-pinning
Author: crycriM

System Documentation

What problem does it solve?

This skill provides a reproducible setup for running llama-server with slot pinning, so that prompt KV caches persist across restarts and performance stays stable in multi-model deployments.

Core Features & Use Cases

Enables per-model slot isolation by allocating dedicated slots and independent save paths to prevent cross-model cache contamination.
Supports saving, restoring, and erasing slot states to preserve prompt context across restarts and deployments.
Provides guidance for running multiple server instances on different ports with consistent slot management.
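The features above can be sketched in Python as follows. This is a minimal illustration, not the skill's own code: the helper names (plan_instances, slot_action_request, pinned_completion_payload), the base port, and the /var/cache/llama-slots layout are all assumptions made for the example. The sketch assumes the llama.cpp server's POST /slots/{id}?action=save|restore|erase endpoint and the id_slot and cache_prompt fields of the completion request; check both against the server version you run.

```python
import json
import urllib.request
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInstance:
    """One llama-server instance with its own port and slot save directory."""
    name: str
    port: int
    save_dir: str


def plan_instances(model_names, base_port=8080, save_root="/var/cache/llama-slots"):
    """Assign each model a distinct port and save directory (hypothetical layout)."""
    return [
        ModelInstance(name, base_port + i, f"{save_root}/{name}/")
        for i, name in enumerate(model_names)
    ]


def slot_action_request(instance, slot_id, action, filename=None):
    """Build (but do not send) a save/restore/erase request for one slot."""
    if action not in ("save", "restore", "erase"):
        raise ValueError(f"unknown slot action: {action}")
    if action != "erase" and not filename:
        raise ValueError("save and restore need a filename")
    url = f"http://127.0.0.1:{instance.port}/slots/{slot_id}?action={action}"
    # Assumption: erase needs no filename, so an empty JSON object is sent.
    body = {} if action == "erase" else {"filename": filename}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def pinned_completion_payload(prompt, slot_id):
    """Completion payload that asks the server to reuse one specific slot."""
    return {"prompt": prompt, "id_slot": slot_id, "cache_prompt": True}
```

To send a request, pass the result to urllib.request.urlopen. Keeping one save directory per model means a state file written by one instance is never restored into a different model's slot.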

Quick Start

Launch llama-server with a chosen parallel slot count (--parallel) and a slot save path (--slot-save-path), then verify the available slots via the /slots endpoint.
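A rough Python sketch of these two steps is shown here. The function names, the model path, and the port are placeholders chosen for illustration. It assumes the server binary is on PATH as llama-server and that GET /slots returns a JSON array whose entries carry an "id" field; some server builds expose /slots only when it is explicitly enabled, so treat this as a starting point rather than a guaranteed recipe.

```python
import json
import urllib.request


def build_server_command(model_path, port, parallel, slot_save_path,
                         binary="llama-server"):
    """Return the argv list for one server instance (does not start it)."""
    return [
        binary,
        "-m", str(model_path),
        "--port", str(port),
        "--parallel", str(parallel),              # number of slots
        "--slot-save-path", str(slot_save_path),  # where slot states are written
    ]


def list_slot_ids(base_url, timeout=5.0):
    """Query /slots on a running server and return the slot ids it reports."""
    with urllib.request.urlopen(f"{base_url}/slots", timeout=timeout) as resp:
        slots = json.load(resp)
    return [slot["id"] for slot in slots]


if __name__ == "__main__":
    # Hypothetical values; replace with your own model file and directory.
    cmd = build_server_command("model.gguf", 8080, 4, "/tmp/llama-slots/")
    print(" ".join(cmd))
```

Once the server is running (for example via subprocess.Popen(cmd)), list_slot_ids("http://127.0.0.1:8080") should report as many slots as were requested with --parallel.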

Please help me install this Skill:
Name: llama-slot-pinning
Download link: https://github.com/crycriM/hermes-skills/archive/main.zip#llama-slot-pinning
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
