Arabic Diacritic Density Meter

Show what percentage of words in Arabic text carry tashkeel

Measures the share of words in an Arabic text that carry vowel marks (tashkeel), out of all words in the text, helping you judge whether a passage is written for learners or for native readers. Runs entirely in your browser.

What is tashkeel?

Tashkeel, often also called harakat, are the small diacritical marks written above or below Arabic letters to indicate short vowels, gemination (shadda), and the absence of a vowel (sukun). Most everyday Arabic omits them and relies on the reader to supply the vowels from context.

The Arabic Diacritic Density Meter reports what share of the words in a passage carry tashkeel. Because full vocalisation is mostly reserved for learner material and religious texts, with occasional marks added elsewhere to disambiguate a word, the density figure is a quick signal of who a text is written for.

How it works

The text is split into words. For each word the tool checks whether it contains any Arabic diacritic codepoint:

fatha   ◌َ  U+064E     kasra   ◌ِ  U+0650
damma   ◌ُ  U+064F     sukun   ◌ْ  U+0652
shadda  ◌ّ  U+0651     tanwin  ◌ً ◌ٌ ◌ٍ  U+064B–U+064D
plus superscript alef U+0670 and related marks

A word is classed as vocalised if it carries at least one of these marks. The density is the count of vocalised words divided by the total word count, shown as a percentage. The tool also reports the raw diacritic-mark count and the average number of marks per word.
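The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the function name diacritic_density, the whitespace-only word splitting, and the choice to average marks over all words (rather than only vocalised ones) are assumptions, and the unspecified "related marks" are left out of the set.

```python
# Marks listed in the table above: tanwin (U+064B-U+064D), fatha, damma,
# kasra, shadda, sukun (U+064E-U+0652), plus superscript alef (U+0670).
# Assumption: the unnamed "related marks" are not included here.
TASHKEEL = frozenset(chr(cp) for cp in range(0x064B, 0x0653)) | {"\u0670"}


def diacritic_density(text: str) -> dict:
    """Summarise how many whitespace-separated words carry at least one mark."""
    words = text.split()  # assumption: words are delimited by whitespace only
    marks_in_word = [sum(ch in TASHKEEL for ch in word) for word in words]
    total_words = len(words)
    vocalised = sum(1 for n in marks_in_word if n > 0)
    total_marks = sum(marks_in_word)
    return {
        "words": total_words,
        "vocalised_words": vocalised,
        # Guard against empty input so the sketch never divides by zero.
        "density_pct": 100 * vocalised / total_words if total_words else 0.0,
        "marks": total_marks,
        # Assumption: the average is taken over all words, vocalised or not.
        "marks_per_word": total_marks / total_words if total_words else 0.0,
    }


if __name__ == "__main__":
    print(diacritic_density("كَتَبَ الولد الدَّرْسَ"))
```

Splitting on whitespace means that punctuation attached to a word stays part of that word, and a standalone punctuation token counts as an unvocalised word. A real implementation may tokenise differently.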

Tips and notes

A density near 100% indicates fully pointed text, as in the Quran or graded learner material. A density near 0% is normal for newspapers, novels, and most other writing aimed at native adult readers, where vowels are inferred from context. An intermediate density often reflects selective pointing, where only ambiguous or unusual words are vocalised. Read the figure together with the per-word average: a low word percentage alongside a comparatively high mark count suggests a few heavily pointed words rather than marks spread evenly through the text.
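To make the last point concrete, the sketch below compares two synthetic ten-word passages that contain the same number of marks but distribute them differently. The helper summarise and the extra figure marks_per_vocalised_word are hypothetical, introduced only for illustration. The tool itself is described as reporting density, raw mark count, and average marks per word.

```python
# Same mark set as in the table above (assumption: "related marks" omitted).
TASHKEEL = frozenset(chr(cp) for cp in range(0x064B, 0x0653)) | {"\u0670"}


def summarise(words):
    """Return density, total marks, and marks per vocalised word for a word list."""
    counts = [sum(ch in TASHKEEL for ch in w) for w in words]
    pointed = [n for n in counts if n > 0]
    return {
        "density_pct": 100 * len(pointed) / len(counts) if counts else 0.0,
        "marks": sum(counts),
        "marks_per_vocalised_word": sum(pointed) / len(pointed) if pointed else 0.0,
    }


BARE = "\u0627\u0644\u0648\u0644\u062F"                            # no marks
LIGHT = "\u0643\u064E\u062A\u0628"                                  # one fatha
HEAVY = "\u0627\u0644\u062F\u0651\u064E\u0631\u0652\u0633\u064E"    # shadda, fatha, sukun, fatha

concentrated = [HEAVY] + [BARE] * 9  # one heavily pointed word in ten
spread = [LIGHT] * 4 + [BARE] * 6    # four lightly pointed words in ten

print(summarise(concentrated))
print(summarise(spread))
```

Both passages contain four marks. The first has a density of 10% with 4 marks per vocalised word, while the second has a density of 40% with 1 mark per vocalised word, so the mark count alone does not show how the pointing is distributed.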