long-context
Community
Extend LLM context windows, process massive documents.
Software Engineering #LLM context #ALiBi #position interpolation #RoPE #YaRN #transformer models #long documents
Author: zechenzhangAGI
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill addresses the inherent limitation of transformer models with fixed context windows, enabling them to process and understand extremely long documents, conversations, or codebases that would otherwise be truncated.
Core Features & Use Cases
- Context Extension: Expand LLM context windows to 32k, 64k, or even 128k+ tokens, allowing models to process entire books, legal contracts, or extensive research papers.
- Efficient Positional Encodings: Implement state-of-the-art positional encoding techniques like RoPE, YaRN, and ALiBi for robust length extrapolation.
- Minimal Fine-tuning: Extend context for existing models with as few as 1000 fine-tuning steps using methods like position interpolation.
- Use Case: Analyze a 50,000-word legal contract to identify key clauses, summarize an entire research paper, or answer questions spanning multiple chapters of a book, all within a single model call.
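As a rough illustration of the position interpolation idea mentioned in the list above, the sketch below computes RoPE rotation angles in plain Python and shows how dividing the position index by a scale factor maps a long sequence back into the position range seen during pre-training. The function names (rope_angles, apply_rope) are hypothetical and not part of this Skill, and the interleaved pairing of dimensions is one common convention; real libraries may pair dimensions differently.

```python
import math


def rope_angles(position, head_dim, base=10000.0, scale=1.0):
    """Return the RoPE rotation angles for a single token position.

    With scale > 1 this applies linear position interpolation: the position
    index is divided by `scale`, so positions from a longer sequence fall
    inside the range the model saw during pre-training.
    (Hypothetical helper, written for illustration only.)
    """
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    interpolated = position / scale
    # One angle per pair of dimensions, with geometrically decreasing frequency.
    return [interpolated * base ** (-2.0 * i / head_dim)
            for i in range(head_dim // 2)]


def apply_rope(vec, angles):
    """Rotate consecutive pairs (x0, x1), (x2, x3), ... by the given angles."""
    out = []
    for i, theta in enumerate(angles):
        x, y = vec[2 * i], vec[2 * i + 1]
        out.append(x * math.cos(theta) - y * math.sin(theta))
        out.append(x * math.sin(theta) + y * math.cos(theta))
    return out


# With scale=16, position 16 in the extended window receives exactly the
# angles that position 1 had in the original window.
same = rope_angles(16, 8, scale=16.0) == rope_angles(1, 8)
```

Because the rotation only changes the direction of each pair, the vector norm is preserved; interpolation changes which angles are used, not how they are applied.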
Quick Start
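Extend a Llama-2-7b-hf model's context from its native 4096 tokens to 32768 tokens using linear position interpolation by setting rope_scaling in the model config to {"type": "linear", "factor": 8.0}. The factor is the target length divided by the original length, so a model pre-trained with a 2048-token window would need a factor of 16.0 to reach 32768.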
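A minimal sketch of how the scaling factor in the Quick Start can be derived, assuming the factor is simply the ratio of the target window to the pre-training window. The helper name linear_rope_scaling is hypothetical. The commented-out lines show one way the resulting dictionary might be applied with Hugging Face transformers; the exact key names accepted by rope_scaling depend on the installed transformers version, so treat that part as an assumption to check against your version.

```python
def linear_rope_scaling(original_len, target_len):
    """Build a rope_scaling dict for linear position interpolation.

    Hypothetical helper: the factor is the ratio of the desired context
    length to the context length the model was pre-trained with.
    """
    if original_len <= 0:
        raise ValueError("original_len must be positive")
    if target_len < original_len:
        raise ValueError("target_len must be at least original_len")
    return {"type": "linear", "factor": target_len / original_len}


# A 4096-token window extended to 32768 tokens needs a factor of 8.0;
# a 2048-token window extended to the same length needs 16.0.
scaling_4k = linear_rope_scaling(4096, 32768)
scaling_2k = linear_rope_scaling(2048, 32768)

# Possible usage with transformers (assumption; not executed here because it
# needs the library and access to the model weights):
#
# from transformers import AutoConfig, AutoModelForCausalLM
# config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
# config.rope_scaling = scaling_4k
# config.max_position_embeddings = 32768
# model = AutoModelForCausalLM.from_pretrained(
#     "meta-llama/Llama-2-7b-hf", config=config
# )
```

Setting the values on the config before loading, rather than on an already-loaded model, avoids relying on the rotary embedding cache being rebuilt after the fact.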
Dependency Matrix
Required Modules
transformers, torch, einops, rotary-embedding-torch, flash-attn
Components
references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: long-context
Download link: https://github.com/zechenzhangAGI/AI-research-SKILLs/archive/main.zip#long-context
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.