agentsop-idempotent-ingestion
Community
Make RAG ingestion safe to re-run, with no duplicates.
System Documentation
What problem does it solve?
This Skill prevents duplicated, corrupted, or stale vector indexes caused by ingestion pipelines that are re-run over changing corpora without a persisted correctness ledger. It helps coder-agents guarantee that a second ingestion run over unchanged documents becomes a no-op, and that deletions are handled intentionally.
Core Features & Use Cases
- Idempotent ingestion contract: derives insert/update/skip decisions from a stable document content hash stored in a persisted docstore ledger.
- Delete propagation correctness: explains why deletes do not happen “for free” and when to use snapshot-based vs incremental cleanup strategies.
- Twice-run regression gate: provides a concrete test that proves the second run processes zero nodes and does not change vector counts.
- Cross-framework mapping: maps the same contract onto LlamaIndex (IngestionPipeline + docstore/docstore_strategy), LangChain (index + RecordManager + cleanup modes), and a manual hash-ledger equivalent.
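The manual hash-ledger variant of the contract above can be sketched in plain Python. This is a minimal illustration, not the Skill's own implementation: the names `content_hash`, `Plan`, `plan_ingestion`, and `apply_plan` are hypothetical, the ledger is an in-memory dict standing in for a persisted docstore, and SHA-256 is assumed as the stable content hash. Deletions are inferred only when the caller declares the corpus to be a full snapshot, which is one way to read the "deletes do not happen for free" point.

```python
import hashlib
from dataclasses import dataclass, field


def content_hash(text: str) -> str:
    """Stable fingerprint of a document's content (SHA-256 is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class Plan:
    """Hypothetical container for the decisions of one ingestion run."""
    insert: list[str] = field(default_factory=list)
    update: list[str] = field(default_factory=list)
    skip: list[str] = field(default_factory=list)
    delete: list[str] = field(default_factory=list)


def plan_ingestion(corpus: dict[str, str], ledger: dict[str, str], snapshot: bool) -> Plan:
    """Compare corpus (doc_id -> text) against ledger (doc_id -> hash)."""
    plan = Plan()
    for doc_id, text in sorted(corpus.items()):
        digest = content_hash(text)
        if doc_id not in ledger:
            plan.insert.append(doc_id)   # never seen before
        elif ledger[doc_id] != digest:
            plan.update.append(doc_id)   # content changed
        else:
            plan.skip.append(doc_id)     # unchanged: no work
    if snapshot:
        # Only a full snapshot justifies treating "missing" as "deleted".
        plan.delete = sorted(set(ledger) - set(corpus))
    return plan


def apply_plan(plan: Plan, corpus: dict[str, str], ledger: dict[str, str]) -> None:
    """Record the outcome in the ledger; vector-store writes would go here too."""
    for doc_id in plan.insert + plan.update:
        ledger[doc_id] = content_hash(corpus[doc_id])
    for doc_id in plan.delete:
        del ledger[doc_id]
```

With `snapshot=False` the same call never proposes deletions, which corresponds to an incremental run where documents absent from the batch are simply not under consideration.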
Quick Start
Tell the coder-agent to “use agentsop-idempotent-ingestion to review my ingestion pipeline so re-indexing over a changing corpus stays idempotent, includes delete handling, and contains a twice-run no-op test in CI.”
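A twice-run no-op test of the kind requested in that prompt might look like the following sketch. It assumes a pytest-style test function; `FakeVectorStore` and `ingest` are hypothetical stand-ins for a real vector store and pipeline, and a real gate would run against the project's actual ingestion entry point and persisted ledger.

```python
import hashlib


class FakeVectorStore:
    """Hypothetical in-memory stand-in that counts write calls."""

    def __init__(self):
        self.vectors = {}
        self.writes = 0

    def upsert(self, doc_id, text):
        self.vectors[doc_id] = text
        self.writes += 1


def ingest(corpus, ledger, store):
    """Ingest changed documents only; return how many were processed."""
    processed = 0
    for doc_id, text in corpus.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if ledger.get(doc_id) == digest:
            continue  # unchanged since the last run: skip entirely
        store.upsert(doc_id, text)
        ledger[doc_id] = digest
        processed += 1
    return processed


def test_second_run_is_noop():
    corpus = {"a.md": "alpha", "b.md": "beta"}
    ledger, store = {}, FakeVectorStore()

    first = ingest(corpus, ledger, store)
    count_after_first = len(store.vectors)
    writes_after_first = store.writes

    second = ingest(corpus, ledger, store)

    assert first == 2                                  # first run does the work
    assert second == 0                                 # second run processes nothing
    assert len(store.vectors) == count_after_first     # vector count unchanged
    assert store.writes == writes_after_first          # no writes on the rerun
```

Asserting on the write counter as well as the vector count catches pipelines that overwrite identical vectors on every run, which keeps counts stable but still wastes embedding calls.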
Dependency Matrix
Required Modules
None required
Components
💻 Claude Code Installation
Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: agentsop-idempotent-ingestion
Download link: https://github.com/agentsope/SkillAlchemy/archive/main.zip#agentsop-idempotent-ingestion
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.