classify-publisher-patterns

Community

Detect publisher tech and classify layouts.

Author: therealityreport
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Classifies publisher pages in two steps: it identifies the page's technology stack, then maps the page's layout patterns to a standardized 15-section brand taxonomy. The result guides pre-extraction decisions and inventory generation.

Core Features & Use Cases

  • Technology detection: Inventory frameworks, CDNs, analytics, and CSS techniques from page HTML.
  • Layout family classification: Assign the page to a layout family (nyt-interactive, nyt-article, athletic-article, generic-publisher).
  • Taxonomy mapping: Produce a PublisherClassification object describing section-level element mappings for downstream pipelines.
  • Use case: Inform which extractors to invoke before the extraction wave and generate a Dev Stack inventory entry for the design-docs pipeline.
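The technology detection feature above can be pictured as signature matching against the raw HTML. The following is a minimal sketch under stated assumptions, not the skill's actual implementation: the `SIGNATURES` table, its regexes, and the `detect_technologies` function are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical signature table (an assumption, not taken from the skill):
# category -> {technology name: regex searched for in the page HTML}.
SIGNATURES = {
    "frameworks": {
        "Next.js": r"__NEXT_DATA__|/_next/static/",
        "React": r"data-reactroot|react-dom",
    },
    "cdns": {
        "CloudFront": r"cloudfront\.net",
        "Fastly": r"fastly\.net",
    },
    "analytics": {
        "Google Analytics": r"google-analytics\.com",
        "Google Tag Manager": r"googletagmanager\.com",
    },
    "css_techniques": {
        "CSS Grid": r"display\s*:\s*grid",
        "Custom properties": r"var\(\s*--",
        "Media queries": r"@media\b",
    },
}


def detect_technologies(html: str) -> dict[str, list[str]]:
    """Return, per category, the sorted names whose signature appears in html."""
    inventory: dict[str, list[str]] = {}
    for category, patterns in SIGNATURES.items():
        hits = [
            name
            for name, pattern in patterns.items()
            if re.search(pattern, html, re.IGNORECASE)
        ]
        inventory[category] = sorted(hits)
    return inventory
```

Regex matching on raw HTML is cheap and needs no parser, but it only sees what is present in the markup; technologies loaded at runtime would need a different approach.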

Quick Start

Run the classify-publisher-patterns script with the article URL and the path to the source HTML file to obtain the PublisherClassification object.
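The listing does not document the script's command-line interface, so the sketch below only illustrates the shape of the flow: a URL and an HTML path go in, a classification object comes out. Everything here is an assumption made for illustration: the field names on `PublisherClassification`, the helper `layout_family_for`, and the URL rules it uses (the real skill may well inspect the HTML rather than the URL). The 15 taxonomy section names are not given in this listing, so `sections` is left empty.

```python
from dataclasses import dataclass, field
from pathlib import Path
from urllib.parse import urlparse


@dataclass
class PublisherClassification:
    # Hypothetical fields; the real object's schema is not documented here.
    url: str
    layout_family: str
    html_length: int
    # Section-level element mappings; left empty because the taxonomy's
    # section names are not listed in the source.
    sections: dict[str, list[str]] = field(default_factory=dict)


def layout_family_for(url: str) -> str:
    """Pick one of the four layout families using assumed URL heuristics."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    path = parsed.path.lower()
    if host == "nytimes.com" or host.endswith(".nytimes.com"):
        if path.startswith("/athletic/"):
            return "athletic-article"
        if path.startswith("/interactive/"):
            return "nyt-interactive"
        return "nyt-article"
    if host == "theathletic.com" or host.endswith(".theathletic.com"):
        return "athletic-article"
    return "generic-publisher"


def classify(url: str, html_path: str) -> PublisherClassification:
    """Read the saved HTML and build a classification for the given URL."""
    html = Path(html_path).read_text(encoding="utf-8", errors="replace")
    return PublisherClassification(
        url=url,
        layout_family=layout_family_for(url),
        html_length=len(html),
    )
```

A call such as `classify("https://example.com/story", "story.html")` would fall through to the generic-publisher family under these assumed rules.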

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: classify-publisher-patterns
Download link: https://github.com/therealityreport/trr-app/archive/main.zip#classify-publisher-patterns

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.