ingest-web

Author: RonanCodes

Community

Convert web articles into clean markdown

Content & Communication #metadata #vault #web-scraping #ingest #html-to-markdown #web-extraction #image-downloading

Version: 1.0.0

Installs: 0

System Documentation

What problem does it solve?

Web articles and blog posts often come with cluttered HTML, missing metadata, and remote image links, which makes them difficult to import cleanly into a markdown-based wiki. This Skill automates extracting the readable content and packaging it for ingestion into a vault.

Core Features & Use Cases

Readable extraction: Fetches a URL and extracts the article title, author, published date, and main body while stripping navigation, sidebars, and ads.
HTML-to-markdown conversion: Preserves headings, lists, blockquotes, links, code blocks, and image references when converting to clean markdown.
Image and asset handling: Downloads referenced images into vault/raw/assets, replaces remote URLs with local paths, and records images-downloaded in the file frontmatter.
Metadata-first output: Writes YAML frontmatter containing source-url, title, author, date-fetched, and images-downloaded, and saves the result to raw/<descriptive-slug>.md for downstream wiki workflows.
Fallback guidance: Notes when extraction is likely to fail (heavy JS/SPAs) and recommends using a browser clipper for better fidelity.
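
The image handling and frontmatter steps above can be illustrated with a minimal Python sketch. This is not the Skill's actual implementation: the function names `localize_images` and `build_frontmatter` are hypothetical, the default `raw/assets` path assumes links are written relative to the vault root, and `images-downloaded` is assumed to be a count. The sketch only rewrites URLs and renders metadata; the download itself is left out.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Matches markdown images that point at a remote URL: ![alt](https://host/pic.png)
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\((https?://[^)\s]+)\)")


def localize_images(markdown, assets_dir="raw/assets"):
    """Rewrite remote image URLs to local paths.

    Returns the rewritten markdown and a mapping of remote URL -> local path.
    Hypothetical helper; two different URLs that share a file name would map
    to the same local path, which a real tool would need to disambiguate.
    """
    downloads = {}

    def replace(match):
        alt, url = match.group(1), match.group(2)
        if url not in downloads:
            # Use the last path segment as the file name, with a fallback
            # for URLs that have no usable path.
            name = PurePosixPath(urlparse(url).path).name
            if not name:
                name = f"image-{len(downloads) + 1}"
            downloads[url] = f"{assets_dir}/{name}"
        return f"![{alt}]({downloads[url]})"

    return IMAGE_PATTERN.sub(replace, markdown), downloads


def build_frontmatter(source_url, title, author, date_fetched, images_downloaded):
    """Render the frontmatter fields named in the feature list as YAML."""

    def quote(value):
        # Double-quoted YAML scalar with backslashes and quotes escaped.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'

    lines = [
        "---",
        f"source-url: {quote(source_url)}",
        f"title: {quote(title)}",
        f"author: {quote(author)}",
        f"date-fetched: {quote(date_fetched)}",
        # Assumption: the field stores how many images were saved locally.
        f"images-downloaded: {images_downloaded}",
        "---",
    ]
    return "\n".join(lines) + "\n"
```

Keeping the URL-to-path mapping separate from the rewrite means the same mapping can drive the actual downloads and supply the `images-downloaded` value.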

Quick Start

Ingest the article at https://example.com/article into vault my-research to extract content, download images, and save a markdown file in raw/.
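
The listing does not say how the descriptive slug in the output file name is chosen. As a rough illustration only, the Python sketch below derives one from the article title; `slugify`, `output_path`, and the 60-character limit are assumptions made for this example, not documented behavior of the Skill.

```python
import re
import unicodedata


def slugify(title, max_length=60):
    """Turn an article title into a lowercase, hyphen-separated slug."""
    # Decompose accented characters, then drop anything outside ASCII.
    normalized = unicodedata.normalize("NFKD", title)
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")
    # Truncate, avoid a trailing hyphen, and fall back for empty titles.
    return slug[:max_length].rstrip("-") or "untitled"


def output_path(title):
    """Build the vault-relative path for the saved markdown file."""
    return f"raw/{slugify(title)}.md"
```

With these assumptions, a title such as "Café Culture: A Guide!" would be saved as raw/cafe-culture-a-guide.md.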
