Does the tool count diacritics (tashkeel)?

No. Diacritics (harakat such as fatha, kasra, damma, sukun and shadda), the superscript alef, Quranic annotation marks, and the tatweel/kashida stretching character are all stripped before counting. Only the base consonant and long-vowel letters are tallied, which is the standard basis for frequency analysis.

What does the hamza normalisation option do?

When enabled, it merges the hamza-carrier variants and related forms onto their base letters: أ, إ, آ and ٱ all become ا; ؤ becomes و; ئ and ى (alef maksura) become ي; and ة (teh marbuta) becomes ه. This matches the usual cryptanalytic convention so that, for example, all forms of alef are tallied together rather than scattered across several rows.

Why is letter frequency useful in Arabic?

In Arabic the letters alef (ا) and lam (ل) are by far the most common, partly because of the definite article ال. Frequency profiles like this are the backbone of classical cryptanalysis, a field pioneered by Arab scholars such as al-Kindi, and are also used in linguistics, typography, and keyboard design.

Does it handle right-to-left text correctly?

Yes. The input box and the letter column are rendered right-to-left so Arabic displays naturally. Counting is direction-agnostic — it tallies code points — so the order of display never affects the totals.

Is my text sent to a server?

No. All stripping, normalisation, and counting happen locally in your browser. Nothing is uploaded, logged, or stored, so you can analyse private or sensitive documents safely.

What is the Arabic Letter Frequency Counter?

A free Arabic letter frequency counter for cryptanalysis and linguistics. Paste Arabic text and it ranks every letter by frequency, ignoring tashkeel, with optional normalisation of hamza forms (أ إ آ → ا) and ة → ه. It runs in your browser on Gera Tools, with nothing uploaded.

Arabic Letter Frequency Counter

Name: Arabic Letter Frequency Counter
Creator: Gera Tools
License: https://creativecommons.org/licenses/by/4.0/

This Arabic letter frequency counter ranks every letter in a passage of Arabic by how often it occurs, the classic first step in cryptanalysis and a useful lens for linguistics, typography, and keyboard layout. It automatically strips vowel marks and the tatweel so the counts reflect base letters only, and it offers an optional normalisation pass that folds the various hamza carriers and teh marbuta onto their base forms — exactly the convention used when analysing classical ciphers.

How it works

The tool first removes all tashkeel (the harakat ً ٌ ٍ َ ُ ِ ّ ْ, the superscript alef, and the Quranic annotation marks) along with the tatweel/kashida ـ, which is used only to stretch words. What remains is the run of base letters in the range U+0621–U+064A plus the extended Arabic letters. Each of those letters is counted, and any character that is not an Arabic letter (spaces, Latin text, digits, punctuation) is ignored.
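The strip-then-count step can be sketched in a few lines of Python. This is an illustration, not the tool's actual source: the function name count_letters and the exact code point ranges are assumptions. The text names the categories that are stripped but only gives U+0621–U+064A explicitly, so the other ranges here are a plausible reading of "harakat", "superscript alef", "Quranic annotation marks" and "extended Arabic letters".

```python
import re
from collections import Counter

# Assumed strip set: harakat U+064B-U+0652, superscript alef U+0670,
# Quranic annotation marks U+06D6-U+06ED, and the tatweel U+0640.
STRIP = re.compile("[\u064B-\u0652\u0670\u06D6-\u06ED\u0640]")

# Assumed letter set: U+0621-U+064A (from the text) plus the extended
# Arabic letters, taken here to be U+0671-U+06D3.
LETTER = re.compile("[\u0621-\u064A\u0671-\u06D3]")


def count_letters(text: str) -> Counter:
    """Strip marks and tatweel, then count the Arabic letters that remain."""
    stripped = STRIP.sub("", text)
    # Anything that is not an Arabic letter (spaces, Latin, digits,
    # punctuation) never matches LETTER and is therefore ignored.
    return Counter(LETTER.findall(stripped))
```

Given the word kitab written with its vowel marks, this returns a count of one each for kaf, teh, alef, and beh. Spaces, digits, and Latin characters in the input contribute nothing.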

When hamza normalisation is on, the tool maps the carrier variants to their base before tallying: أ إ آ ٱ → ا, ؤ → و, ئ ى → ي, and ة → ه. This is important for frequency analysis, because otherwise the many spellings of alef would each form their own row and understate how dominant alef really is. Letters are finally sorted from most to least frequent, with each shown as a raw count and a percentage of the total, computed as count / total × 100.
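A minimal Python sketch of the normalisation and ranking pass follows. It uses the mapping listed above and the stated formula count / total × 100. The names FOLD and rank are hypothetical, and the input is assumed to be a string of base letters with the marks already stripped.

```python
from collections import Counter

# Folding table from the text: hamza carriers, alef maksura, teh marbuta.
FOLD = str.maketrans({
    "\u0623": "\u0627",  # alef with hamza above -> alef
    "\u0625": "\u0627",  # alef with hamza below -> alef
    "\u0622": "\u0627",  # alef with madda       -> alef
    "\u0671": "\u0627",  # alef wasla            -> alef
    "\u0624": "\u0648",  # waw with hamza        -> waw
    "\u0626": "\u064A",  # yeh with hamza        -> yeh
    "\u0649": "\u064A",  # alef maksura          -> yeh
    "\u0629": "\u0647",  # teh marbuta           -> heh
})


def rank(letters, normalise=True):
    """Return (letter, count, percentage) rows, most frequent first."""
    if normalise:
        letters = letters.translate(FOLD)
    counts = Counter(letters)
    total = sum(counts.values())
    # An empty input gives an empty Counter, so no division takes place.
    return [(ch, n, n / total * 100) for ch, n in counts.most_common()]
```

With normalisation on, an input made of أ, إ, ا and ة collapses to two rows: alef with a count of 3 (75%) and heh with a count of 1 (25%). With it off, the same input produces four separate rows of 25% each, which is the scattering effect described above.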

Example and tips

In almost any Arabic text the top of the list is led by alef (ا) and lam (ل), largely because the definite article ال is so pervasive, followed by mim, waw, ya, and nun. That stable signature is exactly what makes monoalphabetic substitution ciphers breakable: match the most common ciphertext symbols against this expected ordering and the plaintext starts to fall into place.
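The matching step can be illustrated with a short Python sketch. It pairs the most frequent ciphertext symbols with the expected ordering named above (alef, lam, mim, waw, ya, nun) to produce a first guess. The function name first_guess is hypothetical, and the result is only a starting hypothesis: on short texts the observed ranking is noisy, so a real attack would refine the guess by hand or with further statistics.

```python
from collections import Counter

# Expected ordering taken from the text: alef, lam, mim, waw, ya, nun.
EXPECTED = "\u0627\u0644\u0645\u0648\u064A\u0646"


def first_guess(ciphertext):
    """Map the most common ciphertext symbols onto the expected letters."""
    symbols = "".join(ciphertext.split())  # ignore whitespace
    ranked = [sym for sym, _ in Counter(symbols).most_common()]
    # zip stops at the shorter sequence, so at most six symbols are guessed.
    return dict(zip(ranked, EXPECTED))
```

For a toy ciphertext such as "XXXX YYY ZZ W", the sketch proposes X as alef, Y as lam, Z as mim, and W as waw.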

Turn normalisation on for cryptanalysis and corpus statistics, where you want all alef forms grouped; turn it off when you care about exact orthography, such as proofreading hamza placement. Because every step runs in your browser, sensitive documents never leave your device.