token-morph-metrics

Community

Deep analysis of Arabic morphological tokenization metrics.

Author: chabirOael
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill computes metrics that measure how faithfully an Arabic tokenizer preserves morphology and how accurately it places token boundaries, so that different tokenizers can be compared on the same terms and improved.

Core Features & Use Cases

  • Intrinsic Metrics Computation: Calculates root, pattern, and morpheme integrity, clitic separation, and fragmentation ratios.
  • Tokenizer Assessment: Supports analyzing various tokenizer architectures, including subword, character, and byte-level models.
  • Use Case: Researchers can determine how well a tokenizer preserves Arabic morphology without running downstream tasks or external models.
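
To make the idea of intrinsic metrics concrete, here is a minimal Python sketch of two of them: a fragmentation ratio and a morpheme integrity score. It is an illustration under stated assumptions, not the Skill's actual implementation: the function names, the metric definitions (tokens per word; share of gold morphemes not cut by a token boundary), and the input format are all hypothetical. Words are written in Buckwalter transliteration (wbAlktAb = w + b + Al + ktAb, "and with the book") to keep the example ASCII-only.

```python
from typing import List, Set


def internal_boundaries(pieces: List[str]) -> Set[int]:
    """Character offsets at which a segmentation splits a word."""
    offsets, pos = set(), 0
    for piece in pieces[:-1]:
        pos += len(piece)
        offsets.add(pos)
    return offsets


def fragmentation_ratio(tokenized_words: List[List[str]]) -> float:
    """Assumed definition: average number of tokens per word."""
    total_tokens = sum(len(tokens) for tokens in tokenized_words)
    return total_tokens / len(tokenized_words)


def morpheme_integrity(tokens: List[str], morphemes: List[str]) -> float:
    """Assumed definition: fraction of gold morphemes that no token
    boundary cuts through (boundaries at morpheme edges are fine)."""
    cuts = internal_boundaries(tokens)
    intact, start = 0, 0
    for morpheme in morphemes:
        end = start + len(morpheme)
        if not any(start < cut < end for cut in cuts):
            intact += 1
        start = end
    return intact / len(morphemes)


# Hypothetical gold segmentation: conjunction + preposition + article + stem.
gold = ["w", "b", "Al", "ktAb"]

good_split = ["wb", "AlktAb"]  # boundary falls on a morpheme edge
bad_split = ["wbAlk", "tAb"]   # boundary falls inside the stem ktAb

print(morpheme_integrity(good_split, gold))  # 1.0
print(morpheme_integrity(bad_split, gold))   # 0.75
print(fragmentation_ratio([good_split, ["ktAb"]]))  # 1.5
```

Both scores need only the tokenizer output and a gold morphological segmentation, which is why metrics of this kind can be computed without downstream tasks or external models.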

Quick Start

Run the analysis script on your tokenizer output files to obtain the morphological metric scores.

Dependency Matrix

Required Modules

None required

Components

scripts, references

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: token-morph-metrics
Download link: https://github.com/chabirOael/tokenizers_evaluation/archive/main.zip#token-morph-metrics

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.