databricks-parsing

Community

Parse documents with AI

Author: slysik
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill automates the extraction of text and structured data from PDF, DOCX, PPTX, and image files. The parsed output supports downstream data processing and provides the foundation for custom RAG pipelines.

Core Features & Use Cases

  • Document Parsing: Use ai_parse_document to convert binary documents into structured text.
  • Data Extraction: Extract specific fields from documents using ai_query in conjunction with parsed text.
  • RAG Pipeline Foundation: Parse, chunk, and index documents for advanced search and analysis.
  • Use Case: Ingesting and processing a collection of research papers stored in a Databricks Volume to build a searchable knowledge base.
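The "chunk" step of the RAG pipeline above can be illustrated with a minimal sketch. It assumes the parsed document text is already available as a plain Python string and uses simple fixed-size character windows with overlap. The function name `chunk_text` and its parameters are hypothetical and not part of this Skill; a real pipeline might instead split on headings, paragraphs, or tokens.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into fixed-size character chunks that overlap.

    Hypothetical helper for illustration only; not part of the Skill.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once this chunk has reached the end of the text,
        # so no trailing chunk is made purely of overlap.
        if start + chunk_size >= len(text):
            break
    return chunks


# Example: 10 characters, windows of 4 that share 1 character.
print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighboring chunks, which tends to help retrieval at the cost of some duplicated text in the index.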

Quick Start

Parse all PDF and DOCX documents in the '/Volumes/catalog/schema/volume/docs/' directory using the ai_parse_document function.
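As a rough sketch of what that request could translate to, the helper below builds a Databricks SQL query that loads files in binary form with `read_files` and passes the `content` column to `ai_parse_document`. The helper name `build_parse_query`, the `parsed` alias, and the extension filter on `path` are assumptions made for illustration; the Skill itself may generate a different query. The block only constructs the query string, so it runs anywhere; executing the query requires a Databricks workspace where `ai_parse_document` is available.

```python
def build_parse_query(directory: str, extensions: tuple[str, ...] = ("pdf", "docx")) -> str:
    """Build a SQL query that parses files in `directory` with ai_parse_document.

    Hypothetical helper for illustration only; not part of the Skill.
    """
    if not extensions:
        raise ValueError("at least one file extension is required")

    # Keep only files whose path ends in one of the requested extensions.
    ext_filter = " OR ".join(
        f"lower(path) LIKE '%.{ext.lower()}'" for ext in extensions
    )
    return (
        "SELECT path, ai_parse_document(content) AS parsed\n"
        f"FROM read_files('{directory}', format => 'binaryFile')\n"
        f"WHERE {ext_filter}"
    )


query = build_parse_query("/Volumes/catalog/schema/volume/docs/")
print(query)
# In a Databricks notebook, the query could then be run with: spark.sql(query)
```

Filtering on `path` in a `WHERE` clause keeps the sketch independent of any particular file-glob option; the directory path is the placeholder from the Quick Start and should be replaced with a real Volume path.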

Dependency Matrix

Required Modules

None required

Components

scripts, references

💻 Claude Code Installation

Recommended: Let Claude install the Skill automatically by copying and pasting the text below into Claude Code.

Please help me install this Skill:
Name: databricks-parsing
Download link: https://github.com/slysik/databricks-claude-coding/archive/main.zip#databricks-parsing

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
