webcrawler

Official

Harvest docs into offline knowledge bases.

Author: narduk-enterprises
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Documentation sites are often dispersed and hard to index for offline use. This skill crawls structured documentation portals and converts pages into clean Markdown with metadata to enable offline access, search, and integration with RAG workflows.

Core Features & Use Cases

  • Recursive, depth-bounded crawling of documentation sites (ReadTheDocs, GitBook, Docusaurus, MkDocs), so a crawl stays within a fixed number of link hops from the start page.
  • HTML-to-Markdown extraction with optional code-block preservation and source attribution.
  • An organized output corpus: per-page Markdown files, a master index, and machine-readable metadata suitable for embedding.
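To make "recursive, depth-bounded crawling" concrete, here is a minimal sketch of the idea, not the skill's actual code. It assumes a breadth-first traversal restricted to the start page's host, and it takes a hypothetical `fetch` callable (URL in, HTML out) so it runs without network access. It uses only the standard library for link extraction; the skill itself lists requests and beautifulsoup4, which would normally fill those roles. The names `crawl`, `LinkCollector`, and `max_depth` are illustrative assumptions.

```python
from collections import deque
from html.parser import HTMLParser
from typing import Callable, Dict, List
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url: str, fetch: Callable[[str], str], max_depth: int = 2) -> Dict[str, str]:
    """Breadth-first crawl that follows same-host links up to max_depth hops."""
    host = urlparse(start_url).netloc
    pages: Dict[str, str] = {}
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        pages[url] = html
        if depth >= max_depth:
            continue  # depth bound reached: keep the page, do not follow its links
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments so each page is visited once.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append((absolute, depth + 1))
    return pages
```

With a real site, `fetch` could be something like `lambda u: requests.get(u, timeout=10).text`; passing it in as a parameter keeps the traversal logic easy to test against an in-memory dictionary of pages.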

Quick Start

Crawl the docs you care about and generate a local knowledge base with Markdown files and a navigable index.
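The listing does not document the skill's command-line interface or output layout, so the following is only a sketch of what "Markdown files and a navigable index" could look like on disk. It assumes the pages have already been converted to Markdown and are held in a dictionary keyed by source URL. The function names (`slugify`, `write_corpus`) and the file names `_index.md` and `metadata.json` are hypothetical, chosen for illustration.

```python
import json
import re
from pathlib import Path
from typing import Dict


def slugify(url: str) -> str:
    """Turn a URL into a filesystem-safe file stem (hypothetical naming scheme)."""
    slug = re.sub(r"^https?://", "", url)
    slug = re.sub(r"[^A-Za-z0-9]+", "-", slug).strip("-")
    return slug or "page"


def write_corpus(pages: Dict[str, str], out_dir: str) -> Path:
    """Write one Markdown file per page, plus a master index and JSON metadata."""
    root = Path(out_dir)
    root.mkdir(parents=True, exist_ok=True)
    index_lines = ["# Index", ""]
    metadata = []
    for url, markdown in sorted(pages.items()):
        name = slugify(url) + ".md"
        # Source attribution is kept as an HTML comment at the top of each file.
        (root / name).write_text(f"<!-- source: {url} -->\n\n{markdown}\n", encoding="utf-8")
        index_lines.append(f"- [{url}]({name})")
        metadata.append({"source_url": url, "file": name, "chars": len(markdown)})
    (root / "_index.md").write_text("\n".join(index_lines) + "\n", encoding="utf-8")
    (root / "metadata.json").write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return root
```

Keeping the metadata in a separate JSON file means an embedding pipeline can read source URLs and file names without parsing the Markdown itself.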

Dependency Matrix

Required Modules

requests, beautifulsoup4, html2text

Components

scripts, references

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: webcrawler
Download link: https://github.com/narduk-enterprises/myboat/archive/main.zip#webcrawler

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
