What does NFKD normalisation do?

NFKD stands for Normalization Form KD, or Compatibility Decomposition. It is the most aggressive form: it folds compatibility variants such as ligatures and full-width letters and also splits precomposed characters into a base plus combining marks, with no recomposition.

How is NFKD different from NFD?

NFD performs only canonical decomposition, so compatibility characters such as ligatures and full-width letters pass through unchanged. NFKD additionally folds those variants: the ligature ﬁ becomes f and i, the full-width Ａ becomes a plain A, and every precomposed character is fully decomposed.

Why would I choose NFKD over NFKC?

NFKD leaves combining marks separate instead of recomposing them, which is convenient when you want to inspect or strip individual diacritics. NFKC gives the same folding but recombines marks into precomposed characters.

Is NFKD lossy?

Yes, it can be. Compatibility folding discards styling such as superscripts and ligature glyphs, and that cannot be undone. Use NFKD for matching, search, and analysis rather than for preserving formatted text.

Does my text leave the browser?

No. The tool uses the native String.prototype.normalize function, so NFKD decomposition runs entirely in your browser and nothing is uploaded.

What is the Unicode NFKD Decomposer?

It normalises text to Unicode NFKD (Compatibility Decomposition), the most aggressive form, folding ligatures and full-width forms while splitting precomposed characters into base plus combining marks. It runs free in your browser on Gera Tools, with nothing uploaded.

Unicode NFKD Decomposer

Name: Unicode NFKD Decomposer
Creator: Gera Tools
License: https://creativecommons.org/licenses/by/4.0/

NFKD (Normalization Form KD) is the most thorough Unicode normalisation: it folds compatibility variants like ligatures and full-width letters and then fully decomposes precomposed characters into base letters plus combining marks. The result exposes every individual element, which is ideal for analysis, matching, and accent stripping. This tool applies NFKD and lists the code points before and after.

How it works

NFKD is a compatibility decomposition with no recomposition step:

ﬁ  ->  f i         (fi ligature → two separate letters)
Ａ  ->  A           (full-width A → ASCII A)
é   ->  e + ◌́       (precomposed U+00E9 → U+0065 U+0301)
²   ->  2           (superscript two → digit 2)
ﬃ  ->  f f i       (ffi ligature → three letters)

Combining marks are placed in canonical order so equivalent inputs always produce the same output. Because compatibility variants are folded and precomposed characters are split, the code point count usually grows. The tool uses the engine’s native String.prototype.normalize("NFKD").
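
As a rough illustration of the before-and-after listing described above, the following sketch normalises a short string and prints its code points. The helper name codePoints and the sample string are assumptions made for this example, not the tool's actual source.

```javascript
// Hypothetical helper: format each code point of a string as U+XXXX.
function codePoints(text) {
  // Array.from iterates by code point, not by UTF-16 code unit.
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

// "\uFB01" is the fi ligature, "\u00E9" is precomposed e-acute.
const input = "\uFB01anc\u00E9";
const output = input.normalize("NFKD");

console.log(codePoints(input).join(" "));
// U+FB01 U+0061 U+006E U+0063 U+00E9
console.log(codePoints(output).join(" "));
// U+0066 U+0069 U+0061 U+006E U+0063 U+0065 U+0301
```

Five code points go in and seven come out: the ligature contributes one extra letter and the accent becomes a separate combining mark.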

The four Unicode normalization forms compared

Unicode defines four forms; understanding when to choose NFKD requires knowing what the others do:

Form	Canonical decomposition	Recomposition	Compatibility folding
NFC	Yes	Yes	No
NFD	Yes	No	No
NFKC	Yes	Yes	Yes
NFKD	Yes	No	Yes

NFD decomposes é into e + combining accent but leaves the fi ligature intact. NFKD does both: it decomposes é and also breaks the ligature into f i. That makes NFKD the right choice when you need to catch all possible representations of “fi” in a search, but the wrong choice when you need to preserve the ligature for display purposes.
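
To make the table concrete, here is a small sketch that runs one two-character sample through all four forms and counts the resulting code points. The function name normalizedLengths is an assumption for illustration only.

```javascript
// Hypothetical helper: code point count of a string under each form.
function normalizedLengths(text) {
  const lengths = {};
  for (const form of ["NFC", "NFD", "NFKC", "NFKD"]) {
    lengths[form] = Array.from(text.normalize(form)).length;
  }
  return lengths;
}

// Precomposed e-acute (U+00E9) followed by the fi ligature (U+FB01).
console.log(normalizedLengths("\u00E9\uFB01"));
// { NFC: 2, NFD: 3, NFKC: 3, NFKD: 4 }
```

NFD grows by one because the accent is split off, NFKC grows by one because the ligature is folded, and NFKD does both.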

The canonical accent-stripping pipeline

NFKD is the standard first step for accent-insensitive text matching and normalization for databases. The typical pipeline is:

Apply NFKD — decompose all precomposed characters and fold compatibility variants
Filter — remove every code point in the combining diacritical marks block (U+0300–U+036F)
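The result is unaccented base letters: plain ASCII for most Western European text, although letters with no decomposition, such as ø and ß, pass through unchanged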

For example:

café → NFKD → café (e + U+0301) → strip marks → cafe
ﬁancé → NFKD → fiancé (f, i, e + U+0301) → strip marks → fiance
Ångström → NFKD → Ångström (A + U+030A, o + U+0308) → strip marks → Angstrom

This is widely used in search engines, username normalization, and database deduplication to prevent the same word in different Unicode representations from appearing as separate entries.
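
The pipeline above can be sketched in a few lines. The function name stripAccents is an assumption for this example, and the filter covers only the U+0300–U+036F block named in the steps.

```javascript
// Hypothetical helper: NFKD, then drop code points U+0300..U+036F.
function stripAccents(text) {
  return text.normalize("NFKD").replace(/[\u0300-\u036F]/g, "");
}

console.log(stripAccents("caf\u00E9"));            // cafe
console.log(stripAccents("\uFB01anc\u00E9"));      // fiance
console.log(stripAccents("\u00C5ngstr\u00F6m"));   // Angstrom
```

Combining marks that live outside that block are not removed by this range. If broader coverage is wanted, matching the Unicode Mark category with /\p{M}/gu is one possible alternative.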

When NFKD is lossy — and when that matters

NFKD discards style information that cannot be recovered:

The fi ligature becomes two letters — if a printed document relied on the ligature for typographic style, that is lost
Superscript ² becomes 2 — mathematical notation collapses
Full-width Latin characters (common in CJK documents) become ASCII equivalents

This is intentional for matching and search, but it means NFKD is not appropriate for storing or displaying text where those distinctions carry meaning. Use it for comparison and analysis; store the original form.
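
A short sketch of why the folding cannot be undone, and of one way to keep the original alongside a folded key. The record shape with display and key fields is purely an assumption for illustration.

```javascript
// "x" followed by superscript two (U+00B2).
const original = "x\u00B2";

const folded = original.normalize("NFKD");   // superscript folded to a digit
const recomposed = folded.normalize("NFC");  // recomposition cannot restore it

console.log(folded === "x2");          // true
console.log(recomposed === original);  // false

// Hypothetical record: keep the original for display, the folded form for matching.
const record = { display: original, key: folded };
console.log(record.key === "x2".normalize("NFKD")); // true
```

Once the superscript has become a plain digit there is nothing left in the string to say it was ever raised, which is why the original is kept in its own field here.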

For the same folding but with marks recombined afterward, use NFKC: it gives you café with a single precomposed é (U+00E9) rather than leaving the base letter plus combining mark pair.
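
As a small check of that relationship, the sketch below compares the two forms on one word. The variable names are assumptions for this example.

```javascript
const word = "caf\u00E9";                // precomposed e-acute

const nfkc = word.normalize("NFKC");     // c a f U+00E9
const nfkd = word.normalize("NFKD");     // c a f e U+0301

console.log(Array.from(nfkc).length);    // 4
console.log(Array.from(nfkd).length);    // 5

// Decomposing the NFKC result gives the NFKD result.
console.log(nfkc.normalize("NFKD") === nfkd); // true
```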