Why count diacritics separately?

Diacritics (aerab) such as zabar, zer, and pesh are combining marks: each one attaches to a base letter and does not occupy a visible cell of its own. Counting them apart gives the visible letter count, which is the figure that matches what a reader sees on screen.

What counts as an Urdu-only letter?

Letters that exist in Urdu but not in standard Arabic, such as ٹ (tte), ڈ (ddal), ڑ (rre), ں (noon ghunna), ہ (gol he), گ (gaf) and پ (peh). The tool flags these separately from the Arabic-script letters Urdu shares.

Does it handle zero-width characters?

Yes. Zero-width non-joiners (U+200C), used between the parts of compound words, are counted in their own row, so you can see how many are present and account for them when comparing against the visible character count.
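As a rough illustration of the idea, the sketch below counts zero-width non-joiners in a string. The function name countZwnj and the sample word are assumptions made for this example, not the tool's actual code.

```javascript
// Hypothetical helper: counts zero-width non-joiners (U+200C) in a string.
function countZwnj(text) {
  let count = 0;
  for (const ch of text) {
    if (ch === "\u200C") count += 1;
  }
  return count;
}

// Seven letters with one ZWNJ between the two parts of the compound.
// Written with escapes so the invisible character is easy to spot.
const zwnjSample = "\u062E\u0648\u0628\u200C\u0635\u0648\u0631\u062A";

console.log([...zwnjSample].length); // 8 code points in total
console.log(countZwnj(zwnjSample)); // 1 of them is a ZWNJ
```

Subtracting the second figure from the first leaves the seven letters a reader actually sees.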

Is the total the same as JavaScript string length?

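Not always. The counter iterates by Unicode code point, so a character outside the Basic Multilingual Plane (an emoji, for example) counts as one rather than as the two surrogate code units that JavaScript's length reports. Urdu letters and diacritics all sit inside the Basic Multilingual Plane, so for plain Urdu text the two figures match.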
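The difference can be seen in a small sketch. The helper name codePointLength is an assumption for illustration; it relies on the standard string iterator, which steps through code points rather than UTF-16 code units.

```javascript
// Hypothetical helper: counts code points by spreading the string iterator.
function codePointLength(text) {
  return [...text].length;
}

// Four Urdu letters (alef, re, dal, vao), all inside the Basic Multilingual Plane.
const urduWord = "\u0627\u0631\u062F\u0648";

// The same word followed by one emoji, which lies outside that plane.
const urduWithEmoji = urduWord + "\u{1F600}";

console.log(urduWord.length, codePointLength(urduWord)); // 4 4
console.log(urduWithEmoji.length, codePointLength(urduWithEmoji)); // 6 5
```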

Is my text uploaded anywhere?

No. All counting happens in the browser as you type. Nothing is sent to a server, logged, or stored.

Urdu Character Counter

Urdu is written in the Nastaliq style of the Arabic script but adds its own letters and layers optional diacritics on top of base letters. A naive character count mixes all of these together. This counter breaks the text down into the categories that actually matter.

How it works

The text is iterated one Unicode code point at a time. Each code point is sorted into a bucket:

Combining diacritics (harakat / aerab) in ranges like U+064B–U+065F and the superscript alef U+0670 are counted as diacritics.
A curated set of Urdu-only letters (ٹ ڈ ڑ ں ہ ھ ے گ پ چ ژ) is counted as Urdu-specific.
Any other Arabic-block letter (U+0600–U+06FF) is counted as shared.
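The bucketing above could be sketched roughly as follows. This is a minimal illustration rather than the tool's own source: the names URDU_ONLY, isDiacritic, classify and countBuckets are assumptions, as is the "other" bucket for spaces and anything else outside the three categories. Diacritics are limited to U+064B–U+065F and U+0670, the only ranges the text names explicitly.

```javascript
// Code points for the Urdu-specific set listed above: ٹ ڈ ڑ ں ہ ھ ے گ پ چ ژ
const URDU_ONLY = new Set([
  0x0679, 0x0688, 0x0691, 0x06ba, 0x06c1, 0x06be,
  0x06d2, 0x06af, 0x067e, 0x0686, 0x0698,
]);

// Combining marks U+064B–U+065F plus the superscript alef U+0670.
function isDiacritic(cp) {
  return (cp >= 0x064b && cp <= 0x065f) || cp === 0x0670;
}

// Sorts one code point (passed as a one-character string) into a bucket.
function classify(ch) {
  const cp = ch.codePointAt(0);
  if (isDiacritic(cp)) return "diacritic";
  if (URDU_ONLY.has(cp)) return "urduOnly";
  // Any remaining letter in the Arabic block counts as shared.
  if (cp >= 0x0600 && cp <= 0x06ff && /\p{L}/u.test(ch)) return "shared";
  return "other"; // assumption: spaces, digits, punctuation, other scripts
}

function countBuckets(text) {
  const counts = { diacritic: 0, urduOnly: 0, shared: 0, other: 0 };
  for (const ch of text) counts[classify(ch)] += 1; // iterates by code point
  return counts;
}

// The phrase "Urdu theek hai", written with escapes to avoid ambiguity.
const bucketSample = "\u0627\u0631\u062F\u0648 \u0679\u06BE\u06CC\u06A9 \u06C1\u06D2";
console.log(countBuckets(bucketSample));
// { diacritic: 0, urduOnly: 4, shared: 6, other: 2 }
```

Checking the diacritic range first matters, because the superscript alef falls inside the Arabic block and would otherwise be tested against the letter rule.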

The headline figure, characters excluding diacritics, is the total number of code points minus the number of diacritics. This is the count that corresponds to the visible base letters.
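The arithmetic is simple enough to show in a few lines. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the diacritic test covers only the ranges named in the list above.

```javascript
// Hypothetical check covering U+064B–U+065F and the superscript alef U+0670.
function isCombiningAerab(cp) {
  return (cp >= 0x064b && cp <= 0x065f) || cp === 0x0670;
}

// Returns the total, the diacritic count, and the headline figure.
function countExcludingDiacritics(text) {
  let total = 0;
  let diacritics = 0;
  for (const ch of text) {
    total += 1;
    if (isCombiningAerab(ch.codePointAt(0))) diacritics += 1;
  }
  return { total, diacritics, excludingDiacritics: total - diacritics };
}

// The word "Urdu" written with two pesh marks (U+064F) over alef and dal.
const voweledSample = "\u0627\u064F\u0631\u062F\u064F\u0648";
console.log(countExcludingDiacritics(voweledSample));
// { total: 6, diacritics: 2, excludingDiacritics: 4 }
```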

Example and notes

For اردو ٹھیک ہے the counter reports the visible letters separately from any aerab you add, and flags ٹ, ھ, ہ and ے as Urdu-only. Note that the gol he (ہ) and do-chashmi he (ھ) are distinct code points used for different sounds, so both are treated as Urdu-specific. If you are checking an SMS or username length limit, first check whether that system measures base characters or every code point, and use the diacritic-excluded count only in the former case.