agentsop-idempotent-ingestion

Community

Make RAG ingestion safe to re-run—no duplicates.

Author: agentsope
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill prevents duplicated, corrupted, or stale vector indexes, which arise when an ingestion pipeline is re-run over a changing corpus without a persisted ledger recording what has already been indexed. It helps coder-agents guarantee that a second ingestion run over unchanged documents is a no-op and that deletions are handled deliberately.

Core Features & Use Cases

  • Idempotent ingestion contract: derives insert/update/skip decisions from a stable document content hash stored in a persisted docstore ledger.
  • Delete propagation correctness: explains why deletes do not happen “for free” and when to use snapshot-based vs incremental cleanup strategies.
  • Twice-run regression gate: provides a concrete test that proves the second run processes zero nodes and does not change vector counts.
  • Cross-framework mapping: maps the same contract onto LlamaIndex (IngestionPipeline with docstore and docstore_strategy), LangChain (index with RecordManager and cleanup modes), and a manual hash-ledger equivalent.
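
The manual hash-ledger variant of these features can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Skill's own implementation: `Ledger`, `FakeVectorStore`, `ingest`, and the `delete_missing` flag are hypothetical names, the ledger is kept in memory (a real pipeline would persist it between runs), and SHA-256 is just one reasonable choice for the content hash.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict


def content_hash(text: str) -> str:
    """Stable content hash: SHA-256 of the UTF-8 bytes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class Ledger:
    """Hypothetical ledger mapping doc_id -> content hash.

    Kept in memory here; persist it (file, database) in a real pipeline.
    """
    hashes: Dict[str, str] = field(default_factory=dict)


@dataclass
class FakeVectorStore:
    """Stand-in for a vector index: doc_id -> stored text."""
    rows: Dict[str, str] = field(default_factory=dict)


def ingest(corpus: Dict[str, str], ledger: Ledger, store: FakeVectorStore,
           delete_missing: bool = False) -> Dict[str, int]:
    """Ingest one corpus snapshot and count the decision taken per document."""
    stats = {"inserted": 0, "updated": 0, "skipped": 0, "deleted": 0}
    for doc_id, text in corpus.items():
        new_hash = content_hash(text)
        old_hash = ledger.hashes.get(doc_id)
        if old_hash == new_hash:
            stats["skipped"] += 1  # unchanged content: do nothing
            continue
        stats["inserted" if old_hash is None else "updated"] += 1
        store.rows[doc_id] = text
        ledger.hashes[doc_id] = new_hash
    if delete_missing:
        # Snapshot-style cleanup: anything in the ledger but absent from
        # this snapshot is treated as deleted. Only safe when `corpus`
        # is the complete corpus, not a partial batch.
        for doc_id in set(ledger.hashes) - set(corpus):
            store.rows.pop(doc_id, None)
            del ledger.hashes[doc_id]
            stats["deleted"] += 1
    return stats


def test_twice_run_is_noop() -> None:
    """Regression gate: a second run over the same corpus changes nothing."""
    corpus = {"a": "alpha", "b": "beta"}
    ledger, store = Ledger(), FakeVectorStore()
    ingest(corpus, ledger, store)
    count_after_first = len(store.rows)
    second = ingest(corpus, ledger, store)
    assert second["inserted"] == second["updated"] == second["deleted"] == 0
    assert second["skipped"] == len(corpus)
    assert len(store.rows) == count_after_first
```

Note that with `delete_missing=False` a document removed from the corpus stays in the store indefinitely, which is the sense in which deletes do not happen "for free". The cleanup pass is opt-in because running it against a partial batch would wrongly delete every document outside that batch.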

Quick Start

Tell the coder-agent to “use agentsop-idempotent-ingestion to review my ingestion pipeline so re-indexing over a changing corpus stays idempotent, includes delete handling, and contains a twice-run no-op test in CI.”

Dependency Matrix

Required Modules

None required

Components

references

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: agentsop-idempotent-ingestion
Download link: https://github.com/agentsope/SkillAlchemy/archive/main.zip#agentsop-idempotent-ingestion

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
