databricks-parsing
Community
Parse documents with AI
Author: slysik
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill automates the extraction of text and structured data from several document types (PDF, DOCX, PPTX, images). The parsed output can feed downstream data processing and serves as the foundation for custom RAG pipelines.
Core Features & Use Cases
- Document Parsing: Use ai_parse_document to convert binary documents into structured text.
- Data Extraction: Extract specific fields from documents using ai_query in conjunction with parsed text.
- RAG Pipeline Foundation: Parse, chunk, and index documents for advanced search and analysis.
- Use Case: Ingesting and processing a collection of research papers stored in a Databricks Volume to build a searchable knowledge base.
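The "chunk" step in the RAG bullet above is not specified by this listing, so the following is only a minimal sketch of one common approach: fixed-size character windows with overlap. The function name chunk_text and its default sizes are hypothetical and are not part of the Skill.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split parsed document text into overlapping character windows.

    Hypothetical helper for illustration; the Skill's own chunking
    strategy is not described in the source.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():  # skip windows that contain only whitespace
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

Overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of indexing some text twice. Token-based or section-aware splitting may suit research papers better.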
Quick Start
Parse all PDF and DOCX documents in the '/Volumes/catalog/schema/volume/docs/' directory using the ai_parse_document function.
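As a rough illustration of what the Quick Start request might translate to, the sketch below builds a SQL statement that reads the directory as binary files and passes each file's content to ai_parse_document. The helper name build_parse_query is hypothetical, and the extension filter on the path column is an assumption about how to restrict the run to PDF and DOCX files. Running the query requires a Databricks workspace where ai_parse_document is available; the code here only constructs the string.

```python
def build_parse_query(directory: str, extensions: tuple[str, ...] = ("pdf", "docx")) -> str:
    """Build a SQL query that parses matching files in a Volume directory.

    Hypothetical helper for illustration; not part of the Skill itself.
    """
    if "'" in directory:
        raise ValueError("directory must not contain single quotes")

    cleaned = [ext.lower().lstrip(".") for ext in extensions]
    for ext in cleaned:
        if not ext.isalnum():
            raise ValueError(f"unexpected file extension: {ext!r}")

    # Keep only files whose path ends with one of the requested extensions.
    ext_filter = " OR ".join(f"lower(path) LIKE '%.{ext}'" for ext in cleaned)

    return (
        "SELECT path, ai_parse_document(content) AS parsed\n"
        f"FROM read_files('{directory}', format => 'binaryFile')\n"
        f"WHERE {ext_filter}"
    )


query = build_parse_query("/Volumes/catalog/schema/volume/docs/")
print(query)
```

In a Databricks notebook the resulting string could be passed to spark.sql(query), assuming a spark session is in scope. The structure of the parsed column depends on the function version, so inspect it before extracting fields.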
Dependency Matrix
Required Modules
None required
Components
scripts, references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: databricks-parsing
Download link: https://github.com/slysik/databricks-claude-coding/archive/main.zip#databricks-parsing
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.