Arabic ↔ Buckwalter Transliterator

Convert Arabic script to Buckwalter ASCII encoding and back

The Arabic ↔ Buckwalter transliterator converts between Arabic script and the Buckwalter ASCII encoding, a strict one-to-one mapping widely used in Arabic computational linguistics. Because every Arabic letter and diacritic corresponds to exactly one ASCII character, the encoding is fully reversible, which is why it is a common choice for storing Arabic in plain-text NLP corpora.

How it works

Each Arabic letter and diacritic in the table is a single Unicode code point paired with one ASCII character. Consonants map to intuitive letters where possible: ب→b, ت→t, د→d, ر→r, س→s, ع→E, ق→q. Letters with no obvious Latin match take capitals, repurposed letters, or punctuation: ء (hamza)→', أ→>, إ→<, ذ→*, ث→v, ح→H, خ→x, ص→S, ض→D, ط→T, ظ→Z, غ→g. Diacritics map too: fatha→a, damma→u, kasra→i, sukun→o, shadda→~, and the three tanwin marks (fathatan, dammatan, kasratan)→F, N, K.

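Since the mapping is a bijection, conversion in either direction is a simple character-by-character substitution. Anything not in the table for the direction being converted passes through unchanged. Going from Arabic to Buckwalter, that covers Latin letters, digits, spaces, and ASCII punctuation. Going back, Latin letters and symbols that have a Buckwalter meaning are converted to Arabic, so the round trip is exact for Arabic text but not for text that mixes Arabic with Latin words.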
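
As a rough illustration of the substitution described above, here is a minimal Python sketch. It is not this tool's actual code: the function names to_buckwalter and to_arabic are hypothetical, and the table follows the commonly published Buckwalter chart for U+0621 to U+063A and U+0640 to U+0652, plus U+0670 and U+0671, which is assumed to match the mapping used here.

```python
# Assumption: standard Buckwalter chart, listed in Unicode code-point order.
# Each tuple is (first code point of a contiguous run, ASCII symbols for that run).
_RUNS = (
    (0x0621, "'|>&<}AbptvjHxd*rzs$SDTZEg"),  # hamza .. ghain
    (0x0640, "_fqklmnhwYyFNKaui~o"),         # tatweel .. sukun (incl. diacritics)
    (0x0670, "`{"),                          # dagger alef, alef wasla
)

# Arabic character -> Buckwalter character
AR_TO_BW = {
    chr(start + offset): symbol
    for start, symbols in _RUNS
    for offset, symbol in enumerate(symbols)
}

# Inverse table; equal sizes confirm that no ASCII symbol is used twice.
BW_TO_AR = {bw: ar for ar, bw in AR_TO_BW.items()}
assert len(BW_TO_AR) == len(AR_TO_BW), "mapping is not one-to-one"

_ENCODE = str.maketrans(AR_TO_BW)
_DECODE = str.maketrans(BW_TO_AR)


def to_buckwalter(text: str) -> str:
    """Replace each mapped Arabic character; leave everything else as-is."""
    return text.translate(_ENCODE)


def to_arabic(text: str) -> str:
    """Replace each mapped Buckwalter character; leave everything else as-is."""
    return text.translate(_DECODE)


if __name__ == "__main__":
    print(to_buckwalter("كتاب"))   # ktAb
    print(to_arabic("ktAb"))        # كتاب
```

Using str.translate keeps the conversion to a single pass over the text, and characters missing from the table fall through untouched without any special handling.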

Example

The word “كتاب” (book) becomes ktAb. The phrase “العربية” (Arabic) becomes AlErbyp. Converting ktAb back returns “كتاب” exactly. Note that A is the plain alef ا, while > and < are alef with hamza above (أ) and alef with hamza below (إ).

Notes

Buckwalter is case-sensitive: the upper- and lowercase forms of a Latin letter stand for entirely different Arabic characters. For example, s is س (seen) while S is ص (sad). It is not meant to be readable as English; it is an exact, machine-friendly representation. Everything runs locally, and your text is never uploaded.
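
One practical consequence of case sensitivity is that case-folding a Buckwalter string changes the Arabic it decodes to. The sketch below shows this with a deliberately tiny decode table; the table name DECODE and the sample word are chosen for illustration only, and the seven entries are taken from the standard Buckwalter chart.

```python
# Hypothetical mini-table: just the characters needed for this demonstration.
DECODE = str.maketrans({
    "S": "\u0635",  # sad
    "s": "\u0633",  # seen
    "b": "\u0628",  # beh
    "A": "\u0627",  # alef
    "a": "\u064e",  # fatha (a diacritic, not a letter)
    "H": "\u062d",  # hah
    "h": "\u0647",  # heh
})

original = "SbAH"              # Buckwalter for the word meaning "morning"
folded = original.lower()      # "sbah": what a careless normalisation step produces

intact = original.translate(DECODE)
damaged = folded.translate(DECODE)

print(intact)                  # sad, beh, alef, hah
print(damaged)                 # seen, beh, fatha, heh: a different string
print(intact == damaged)       # False
```

Any pipeline that stores Buckwalter text should therefore skip lowercasing, title-casing, and similar normalisation steps.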
