Name: text-embeddings-inference
Author: jayll1303

System Documentation

What problem does it solve?

Deploy and operate a production-capable inference service for text embeddings, re-ranking, and sequence classification so teams can power semantic search and RAG pipelines without relying on external APIs.

Core Features & Use Cases

Hugging Face TEI Deployment: Guidance for launching Text Embeddings Inference (TEI) Docker images matched to the target GPU or CPU architecture and its CUDA compute capability.
Embedding & Re-ranking APIs: Use OpenAI-compatible embedding endpoints, batch requests, and re-ranker endpoints for retrieval quality improvements.
Performance, Monitoring & Air-gapped Support: Tune batching and concurrency for throughput, export Prometheus metrics and OpenTelemetry traces, and run fully offline with mounted model directories.
Use Case: Run a local TEI server to produce vectors for a RAG pipeline, tune batch settings for high throughput, and integrate with a vector store for semantic search.
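The use case above can be sketched client-side as follows. This is a minimal illustration, not the Skill's own code: it splits documents into fixed-size batches before they are sent to an embedding server, then ranks the returned vectors against a query vector by cosine similarity, standing in for what a vector store would do. The helper names (chunk, cosine, top_k) are hypothetical, and the right batch size depends on the server's batching limits.

```python
import math
from typing import Iterator, List, Sequence


def chunk(texts: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most batch_size texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield list(texts[start:start + batch_size])


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float],
          doc_vecs: Sequence[Sequence[float]],
          k: int) -> List[int]:
    """Return the indices of the k documents most similar to the query."""
    scored = sorted(
        ((cosine(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)),
        reverse=True,
    )
    return [idx for _, idx in scored[:k]]
```

In a real pipeline the vectors would come from the TEI server and the ranking would usually be delegated to the vector store; the brute-force top_k here only shows the idea on small inputs.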

Quick Start

Launch a local TEI Docker instance for embeddings and test it with a sample input.
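Testing a running instance with a sample input might look like the sketch below. It assumes the container's HTTP port is published on localhost:8080 (an assumption; adjust to your own port mapping) and uses TEI's /embed route, which accepts a JSON body with an "inputs" field and returns one vector per input. The function names are hypothetical and only the Python standard library is used.

```python
import json
import urllib.request
from typing import List

# Assumption: the TEI container is reachable at this address.
BASE_URL = "http://localhost:8080"


def build_embed_request(texts: List[str],
                        base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for the /embed route without sending it."""
    body = json.dumps({"inputs": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts: List[str],
          base_url: str = BASE_URL,
          timeout: float = 30.0) -> List[List[float]]:
    """Send the texts to the server and return one embedding per text."""
    request = build_embed_request(texts, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server up, calling embed(["What is deep learning?"]) should return a list containing a single vector whose length is the model's embedding dimension. Building the request separately from sending it keeps the payload easy to inspect when the server is not yet available.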

Please help me install this Skill.
Name: text-embeddings-inference
Download link: https://github.com/jayll1303/AIEKit/archive/main.zip#text-embeddings-inference
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
