webcrawler
Official
Harvest docs into offline knowledge bases.
Author: narduk-enterprises
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Documentation is often spread across many pages and sites, which makes it hard to index for offline use. This skill crawls structured documentation portals and converts each page into clean Markdown with metadata, enabling offline access, search, and integration with RAG workflows.
Core Features & Use Cases
- Recursive, depth-bounded crawling of documentation sites (ReadTheDocs, GitBook, Docusaurus, MkDocs) to surface relevant content.
- HTML-to-Markdown extraction with optional code-block preservation and source attribution.
- Output an organized corpus with per-page Markdown files, a master index, and machine-readable metadata suitable for embedding.
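The depth-bounded crawl described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the names `crawl`, `LinkParser`, and the `fetch` callable are hypothetical, and the sketch uses only the standard library for link extraction so that it runs offline (the skill itself lists requests, beautifulsoup4, and html2text as dependencies). It assumes the crawl should stay on the starting host and ignore URL fragments.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start_url, fetch, max_depth=2):
    """Breadth-first crawl that stays on the start host and stops at max_depth.

    `fetch` is an assumed callable (url -> HTML string); in practice it
    might wrap something like requests.get(url, timeout=10).text.
    Returns a dict mapping each visited URL to its raw HTML.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    pages = {}
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        pages[url] = html
        if depth >= max_depth:
            continue  # page is kept, but its links are not followed
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" suffixes.
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return pages
```

Passing `fetch` in as a parameter keeps the traversal logic separate from networking, so the depth bound and same-host filter can be checked against an in-memory site before pointing the crawler at a real portal.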
Quick Start
Crawl the docs you care about and generate a local knowledge base with Markdown files and a navigable index.
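As a rough idea of what the generated knowledge base might look like on disk, the sketch below writes one Markdown file per page, a master index, and a JSON metadata file. The function names (`slugify`, `write_corpus`), the `pages/` subdirectory, and the file names `index.md` and `metadata.json` are all assumptions made for illustration; the skill's real output layout may differ.

```python
import json
import re
from pathlib import Path


def slugify(url):
    """Turn a URL into a filesystem-safe file stem (hypothetical helper)."""
    stem = re.sub(r"^https?://", "", url)
    stem = re.sub(r"[^A-Za-z0-9]+", "-", stem).strip("-")
    return stem or "index"


def write_corpus(pages, out_dir):
    """Write per-page Markdown files plus index.md and metadata.json.

    `pages` maps URL -> (title, markdown_body); this input shape is an
    assumption, not the skill's documented format.
    """
    out = Path(out_dir)
    (out / "pages").mkdir(parents=True, exist_ok=True)
    metadata = []
    index_lines = ["# Index", ""]
    for url, (title, body) in sorted(pages.items()):
        name = f"pages/{slugify(url)}.md"
        # Each page records its source URL for attribution.
        text = f"# {title}\n\nSource: {url}\n\n{body}\n"
        (out / name).write_text(text, encoding="utf-8")
        index_lines.append(f"- [{title}]({name})")
        metadata.append({"title": title, "source_url": url, "file": name})
    (out / "index.md").write_text("\n".join(index_lines) + "\n", encoding="utf-8")
    (out / "metadata.json").write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return metadata
```

Keeping the metadata in a separate JSON file means an embedding or RAG pipeline can read titles and source URLs without parsing the Markdown itself.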
Dependency Matrix
Required Modules
requests, beautifulsoup4, html2text
Components
scripts, references
💻 Claude Code Installation
Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill: Name: webcrawler Download link: https://github.com/narduk-enterprises/myboat/archive/main.zip#webcrawler Please download this .zip file, extract it, and install it in the .claude/skills/ directory.