Arabic Alef Normalizer

Normalize all alef variants (أ إ آ ٱ) to bare alef ا for search/matching

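Converts every Arabic alef variant, including alef-hamza, alef-madda and alef-wasla, to plain alef ا so that search and matching are insensitive to how the alef was spelled, with optional ya/ta-marbuta folding and tashkeel stripping. (Arabic has no letter case, so this is spelling folding rather than case folding.)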

Why normalise alef forms?

Arabic has several alef variants (أ إ آ ٱ) that users type inconsistently, often omitting the hamza entirely, which breaks naive text matching. Folding them all to the bare alef ا lets a search for احمد match أحمد, إحمد, and آحمد regardless of which alef was typed.

How it works

The tool replaces each alef variant with the plain alef, then optionally applies the wider folding that search engines commonly use:

أ إ آ ٱ ٲ ٳ ٵ   ->  ا   (all alef variants to bare alef)
ى  ->  ي           (optional: alef-maksura to ya)
ة  ->  ه           (optional: ta-marbuta to ha)
+ optionally strip tashkeel (harakat / combining marks)

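The first rule is the core: regardless of hamza, madda, or wasla, the result is one canonical alef. The extra ya/ta-marbuta folds and the tashkeel strip mirror the folding commonly applied in Arabic search indexes. Enabling them increases recall, at the cost of treating some distinct words as equal (for example, على and علي share a key once ى is folded to ي).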
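The mapping above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the function name normalize_arabic, its option names, and the exact tashkeel range (U+064B to U+0652 plus the superscript alef U+0670) are assumptions made for the example.

```python
import re

BARE_ALEF = "\u0627"  # ا

# Alef variants listed in the mapping: أ إ آ ٱ ٲ ٳ ٵ
ALEF_MAP = {v: BARE_ALEF for v in "\u0623\u0625\u0622\u0671\u0672\u0673\u0675"}

# Assumed tashkeel set: fathatan through sukun, plus superscript alef.
TASHKEEL = re.compile("[\u064B-\u0652\u0670]")


def normalize_arabic(text, fold_ya=False, fold_ta_marbuta=False,
                     strip_tashkeel=False):
    """Fold alef variants to bare alef; the wider folds are opt-in."""
    table = dict(ALEF_MAP)
    if fold_ya:
        table["\u0649"] = "\u064A"  # ى -> ي
    if fold_ta_marbuta:
        table["\u0629"] = "\u0647"  # ة -> ه
    text = text.translate(str.maketrans(table))
    if strip_tashkeel:
        text = TASHKEEL.sub("", text)
    return text


print(normalize_arabic("أحمد"))                          # احمد
print(normalize_arabic("مدرسة", fold_ta_marbuta=True))   # مدرسه
```

Keeping the wider folds off by default means the core alef rule can be used on its own, with each extra fold switched on only where the added recall is wanted.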

Tips and example

أحمد and احمد both normalise to احمد, so a user who omits the hamza still finds the name. Apply the exact same options when building your index and when handling each query: if the stored term is folded but the query is not, they will not compare equal. For display text keep the original; only the search key needs to be normalised.
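The index/query symmetry can be illustrated with a small in-memory lookup. This is a sketch under assumptions: search_key, build_index and lookup are hypothetical helper names, and only the core alef fold is applied to keep the example short.

```python
from collections import defaultdict

# Core alef fold only: أ إ آ ٱ ٲ ٳ ٵ -> ا
ALEF_TABLE = str.maketrans(
    {v: "\u0627" for v in "\u0623\u0625\u0622\u0671\u0672\u0673\u0675"}
)


def search_key(text):
    """Hypothetical key function shared by indexing and querying."""
    return text.translate(ALEF_TABLE)


def build_index(names):
    """Map each normalised key to the original spellings, kept for display."""
    index = defaultdict(list)
    for name in names:
        index[search_key(name)].append(name)
    return index


def lookup(index, query):
    """Normalise the query with the same function used at index time."""
    return index.get(search_key(query), [])


index = build_index(["أحمد", "احمد", "إبراهيم"])
print(lookup(index, "احمد"))   # both stored spellings of the name
print(index.get("أحمد", []))   # raw, unnormalised query: no match
```

The second call shows the failure mode described above: the stored keys are folded, so a query that skips normalisation finds nothing even though the original spelling is in the data.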