This tool diagnoses and fixes Unicode encoding problems in Vietnamese text. Vietnamese diacritics can be stored two ways, and a mismatch causes accents to float over letters, search to miss results, and duplicate-looking strings to compare as unequal.
How it works
The tool normalizes your input both ways and compares it against the original to
detect its current form. NFC (precomposed) packs each accented letter into a
single code point, such as ế at U+1EBF. NFD (decomposed) stores a plain e
followed by combining circumflex and acute marks. By counting code points and
combining marks, the tool distinguishes NFC, NFD, and mixed text, then offers
both converted forms to copy.
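
The comparison described above can be sketched in Python with the standard unicodedata module. This is a minimal illustration and not the tool's actual source: the function name detect_form and the "unaffected" label (for text such as plain ASCII that is identical in both forms) are assumptions made for the example.

```python
import unicodedata


def detect_form(text: str) -> str:
    """Classify text as 'NFC', 'NFD', 'mixed', or 'unaffected'.

    Hypothetical helper: normalizes the input both ways and compares
    each result against the original string.
    """
    nfc = unicodedata.normalize("NFC", text)
    nfd = unicodedata.normalize("NFD", text)

    if text == nfc and text == nfd:
        # Nothing to compose or decompose, e.g. plain ASCII.
        return "unaffected"
    if text == nfc:
        return "NFC"
    if text == nfd:
        return "NFD"
    # Matches neither form, so it contains both precomposed letters
    # and combining sequences (or marks in non-canonical order).
    return "mixed"


precomposed = "Ti\u1ebfng"                        # ế as one code point
decomposed = unicodedata.normalize("NFD", precomposed)
print(detect_form(precomposed))                   # NFC
print(detect_form(decomposed))                    # NFD
print(detect_form(precomposed + " " + decomposed))  # mixed
```

Comparing against both normalized strings avoids hand-written rules about which characters decompose; the Unicode data shipped with Python decides that.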
Example and notes
The string Tiếng Việt looks identical in both forms but takes more code points
in NFD because each accented vowel splits into a base letter plus marks. If you
paste text and the detected form is NFD or mixed, convert it to NFC for storage
and display, because that is what databases, URLs, and most fonts expect. Reach for NFD
only when a specific system demands decomposed input.
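
To make the difference concrete, the sketch below measures Tiếng Việt in both forms using Python's unicodedata module. The helper name count_marks is hypothetical, and the string is written with escapes so its starting form is unambiguous.

```python
import unicodedata


def count_marks(text: str) -> int:
    """Count combining marks (code points with a nonzero combining class)."""
    return sum(1 for ch in text if unicodedata.combining(ch) != 0)


# "Tiếng Việt" with ế (U+1EBF) and ệ (U+1EC7) precomposed.
nfc = unicodedata.normalize("NFC", "Ti\u1ebfng Vi\u1ec7t")
nfd = unicodedata.normalize("NFD", nfc)

print(len(nfc), count_marks(nfc))  # 10 0
print(len(nfd), count_marks(nfd))  # 14 4
print(nfc == nfd)                  # False, although both render the same
```

Each of the two accented vowels expands to a base letter plus two marks in NFD, which accounts for the four extra code points.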