Persian/Arabic Kashida Remover

Remove decorative tatweel/kashida letter stretching from Arabic/Persian text

Strips the kashida (tatweel, U+0640 ـ) character used for decorative letter elongation in Arabic and Persian typography, cleaning text for storage, search, and reliable matching without changing meaning.

What is a kashida?

A kashida, also called tatweel and encoded as U+0640 ـ, is a character that stretches the connecting line between Arabic or Persian letters. It is used to justify lines and for decorative effect, but it carries no phonetic or semantic value.

Because the kashida carries no meaning, two strings that differ only in stretching should be treated as the same text. Strip it before text is stored, searched, or compared. This tool removes every kashida in a single pass.
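
A quick way to confirm what the character is, sketched with Python's standard unicodedata module. The helper name count_kashidas is hypothetical and only for illustration. Unicode classifies the tatweel as a modifier letter, so filters that drop only punctuation or symbols will not catch it.

```python
import unicodedata

TATWEEL = "\u0640"  # the kashida / tatweel character

# Look the character up in Python's bundled Unicode database.
print(unicodedata.name(TATWEEL))      # ARABIC TATWEEL
print(unicodedata.category(TATWEEL))  # Lm (modifier letter)
print(TATWEEL.isalpha())              # True


def count_kashidas(text: str) -> int:
    """Hypothetical helper: count how many tatweel characters a string contains."""
    return text.count(TATWEEL)
```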

How it works

The algorithm is a single pass: it deletes every occurrence of the tatweel character while leaving all other characters untouched. The meaning of the text is preserved, but the original stretching cannot be recovered from the output.

target character:  ـ   (U+0640 ARABIC TATWEEL / kashida)
operation:         remove every occurrence
preserved:         letters, harakat, spaces, ZWNJ (U+200C), punctuation
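
The operation above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual source; the name remove_kashida is an assumption made for the example.

```python
TATWEEL = "\u0640"  # ARABIC TATWEEL (kashida)


def remove_kashida(text: str) -> str:
    """Return text with every U+0640 removed; all other characters are kept."""
    return text.replace(TATWEEL, "")


# Alef, lam, three tatweels, seen, three tatweels, lam, alef, meem.
stretched = "\u0627\u0644" + TATWEEL * 3 + "\u0633" + TATWEEL * 3 + "\u0644\u0627\u0645"
plain = "\u0627\u0644\u0633\u0644\u0627\u0645"

print(remove_kashida(stretched) == plain)  # True
```

Because only U+0640 is targeted, harakat such as fatha (U+064E) and the ZWNJ (U+200C) pass through unchanged.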

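Removing the tatweel does not break letter connections inside a word, because joining behaviour in Arabic script is driven by the surrounding letters, and the text renderer recomputes the connecting shapes automatically. The one edge case is a tatweel placed next to a lone letter to display its joined form (for example ـه): once the tatweel is removed, that letter falls back to its isolated shape.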

Tips and example

الـــســـلام becomes السلام — identical in meaning, now consistent for matching. Run this step before indexing user-supplied Arabic or Persian text so that decoratively justified copies de-duplicate cleanly. Keep the original for display if the visual justification matters, and use the cleaned form only as the comparison key.
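
One way to apply this tip is to group entries under their cleaned form while keeping each original string for display. The sketch below assumes an in-memory list of strings; comparison_key and group_by_key are hypothetical names chosen for the example.

```python
from collections import defaultdict

TATWEEL = "\u0640"  # ARABIC TATWEEL (kashida)


def comparison_key(text: str) -> str:
    """Cleaned form used only for matching, never for display."""
    return text.replace(TATWEEL, "")


def group_by_key(entries):
    """Map each cleaned key to the original strings that share it."""
    groups = defaultdict(list)
    for entry in entries:
        groups[comparison_key(entry)].append(entry)
    return dict(groups)


plain = "\u0627\u0644\u0633\u0644\u0627\u0645"
stretched = "\u0627\u0644" + TATWEEL * 3 + "\u0633" + TATWEEL * 3 + "\u0644\u0627\u0645"

groups = group_by_key([stretched, plain])
print(len(groups))  # 1
```

Both spellings land under a single key, and the stored list still holds the stretched original in case the visual justification is needed later.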