Urdu RTL Direction Fixer

Fix Urdu Nastaliq direction in mixed LTR/RTL digital contexts

Insert Unicode bidi control characters around Urdu Nastaliq runs so they render right-to-left when embedded inside left-to-right HTML, using isolate, embed, or mark methods. Keyless browser tool.

Why does mixed Urdu and English text render wrong?

The Unicode bidirectional algorithm orders text using each character's own bidi class and, for characters with no strong direction, the characters around them. Inside a left-to-right context, neutral characters such as punctuation and spaces, and weak ones such as digits, can attach to the wrong side of a neighboring Urdu run, scrambling the visual order. Explicit bidi controls fix this.

Urdu mixed into left-to-right contexts often renders with its words or punctuation out of order. The cause is the Unicode bidirectional algorithm guessing wrong about neutral characters. This tool inserts the explicit control characters that force correct direction.
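To see which characters carry a strong direction and which do not, the bidi class of each character can be inspected directly. The snippet below is a small illustration using Python's standard `unicodedata` module; the helper name `bidi_classes` and the sample string are made up for this example and are not part of the tool.

```python
import unicodedata


def bidi_classes(text: str) -> list[tuple[str, str]]:
    """Pair each character with its Unicode bidi class.

    Strong classes are 'L' (left-to-right) and 'R' / 'AL' (right-to-left).
    Classes such as 'WS' and 'ON' are neutral, and 'EN' (European digit)
    is weak, so their final direction depends on the neighbors.
    """
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# Hypothetical sample: a Latin letter, a space, an Urdu letter,
# an exclamation mark, and a digit.
sample = "a \u06A9!1"
for ch, cls in bidi_classes(sample):
    print(f"U+{ord(ch):04X}  {cls}")
```

Only the Latin letter and the Urdu letter have a strong class here; the space, the exclamation mark, and the digit are the kind of characters that end up on the wrong side when the context is left-to-right.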

How it works

The text is scanned for maximal runs of strong right-to-left characters — the Arabic-script blocks that Urdu uses, including the presentation-forms ranges. Each such run, together with the neutral punctuation and spaces inside it, is wrapped in a bidi control pair chosen by the selected method:

isolate:  ⟨RLI⟩ run ⟨PDI⟩   (U+2067 … U+2069)  recommended
embed:    ⟨RLE⟩ run ⟨PDF⟩   (U+202B … U+202C)  legacy
mark:     ⟨RLM⟩ run ⟨RLM⟩   (U+200F)           lightweight

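The controls are invisible at render time. The isolate and embed pairs instruct the layout engine to treat the wrapped run as right-to-left; the marks act as invisible strong right-to-left characters that pull adjacent neutral characters into the run. A second box visualizes the inserted characters with bracketed labels so you can verify the result.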
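The scan-and-wrap step described above can be sketched roughly as follows. This is not the tool's actual code: the function names `wrap_rtl_runs` and `visualize` are hypothetical, and as a simplifying assumption the sketch detects strong right-to-left characters by bidi class ('R' or 'AL') rather than by the specific Arabic-script block ranges the tool scans.

```python
import unicodedata

# Control pairs for each method, as listed in the table above.
CONTROLS = {
    "isolate": ("\u2067", "\u2069"),  # RLI ... PDI
    "embed": ("\u202B", "\u202C"),    # RLE ... PDF
    "mark": ("\u200F", "\u200F"),     # RLM ... RLM
}

# Bracketed labels used to make the invisible controls visible.
LABELS = {
    "\u2067": "⟨RLI⟩", "\u2069": "⟨PDI⟩",
    "\u202B": "⟨RLE⟩", "\u202C": "⟨PDF⟩",
    "\u200F": "⟨RLM⟩",
}


def is_strong_rtl(ch: str) -> bool:
    # Assumption: any 'R' or 'AL' character counts, not only Arabic-script blocks.
    return unicodedata.bidirectional(ch) in ("R", "AL")


def wrap_rtl_runs(text: str, method: str = "isolate") -> str:
    """Wrap each maximal right-to-left run in the chosen control pair."""
    opener, closer = CONTROLS[method]
    out = []
    i, n = 0, len(text)
    while i < n:
        if not is_strong_rtl(text[i]):
            out.append(text[i])
            i += 1
            continue
        # Extend the run past neutrals, stopping at the first strong LTR character.
        end = i
        j = i + 1
        while j < n and unicodedata.bidirectional(text[j]) != "L":
            cls = unicodedata.bidirectional(text[j])
            if cls in ("R", "AL") or (cls == "NSM" and j == end + 1):
                end = j  # last strong RTL character (or its combining mark)
            j += 1
        # Trailing neutrals after the last strong RTL character stay outside.
        out.append(opener + text[i:end + 1] + closer)
        i = end + 1
    return "".join(out)


def visualize(text: str) -> str:
    """Replace each inserted control character with its bracketed label."""
    return "".join(LABELS.get(ch, ch) for ch in text)


print(visualize(wrap_rtl_runs("The book کتاب is on the میز table.")))
```

Neutrals between two Urdu words stay inside one wrapped run, while spaces and punctuation that trail the last Urdu letter are left outside it. The sketch does not handle input that already contains bidi controls, and it treats digits between Urdu words as part of the run, which may differ from the tool's behavior.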

Example and notes

In the sentence "The book کتاب is on the میز table.", each of the two Urdu words is isolated so that it displays correctly without dragging the surrounding English words around. The isolate method is preferred for new content because it prevents one wrapped run from affecting the direction of text after it, a known failure mode of the older embed controls. Use the lightweight mark method only when a platform strips the heavier controls.