From text to code points
This tool breaks a string into the individual Unicode code points behind it. Where a byte-counter cares about storage and an editor cares about glyphs, this view shows the logical characters Unicode actually assigns numbers to — useful when debugging encoding issues, building escape sequences, or inspecting mysterious whitespace.
How it works
JavaScript strings are stored as UTF-16, so a naive character loop would split astral characters (anything above U+FFFF, like most emoji) into two surrogate halves. To avoid that, the tool iterates the string with a code-point-aware loop and reads each character’s codePointAt(0). Each value is then formatted in the notation you choose:
U+XXXX -> "U+" + hex, upper-case, zero-padded to 4 digits
0x form -> "0x" + hex
decimal -> the plain integer
For example, the earth emoji is the single code point U+1F30D, not the surrogate pair U+D83C U+DF0D that UTF-16 stores it as.
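The loop and the three notations described above can be sketched roughly as follows. This is an illustration, not the tool's actual source: the function names toCodePoints and formatCodePoint and the notation keys "u+", "0x" and "decimal" are hypothetical, and the upper-case hex digits in the 0x form are an assumption, since only the U+XXXX form is stated to be upper-case.

```javascript
// Hypothetical helpers; names and notation keys are illustrative only.

// Collect the code points of a string. for...of iterates a string by
// code point, so a surrogate pair is delivered as one two-unit chunk.
function toCodePoints(text) {
  const points = [];
  for (const ch of text) {
    points.push(ch.codePointAt(0));
  }
  return points;
}

// Format one code point in the chosen notation.
function formatCodePoint(cp, notation) {
  const hex = cp.toString(16).toUpperCase();
  switch (notation) {
    case "u+":
      return "U+" + hex.padStart(4, "0"); // zero-padded to at least 4 digits
    case "0x":
      return "0x" + hex; // assumption: upper-case digits, no padding
    case "decimal":
      return String(cp);
    default:
      throw new RangeError("unknown notation: " + notation);
  }
}

const earth = "\u{1F30D}"; // the earth emoji from the example above

console.log(toCodePoints(earth).map((cp) => formatCodePoint(cp, "u+")));
// ["U+1F30D"]

// A naive UTF-16 view of the same string sees two code units instead.
console.log(earth.length); // 2
console.log(earth.charCodeAt(0).toString(16), earth.charCodeAt(1).toString(16));
// d83c df0d
```

Indexing the string with charCodeAt, as in the last lines, is what produces the two surrogate halves; the for...of loop is one of several code-point-aware ways to avoid that.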
Tips and notes
If a single visible symbol reports as several code points, it is almost certainly a base letter plus combining diacritical marks, or an emoji built from a zero-width-joiner sequence — both are legitimate multi-code-point clusters. Use the U+XXXX notation when writing documentation or looking characters up in the Unicode charts, and decimal when feeding an API that wants integers. To go the other way, use the companion code-points-to-UTF-8 tool.
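To see why one visible symbol can report several code points, here is a small sketch using two hand-built strings: an "e" followed by a combining acute accent, and a family emoji joined with zero-width joiners. The helper name listCodePoints is hypothetical and the sample strings were chosen purely for illustration.

```javascript
// Hypothetical helper: list a string's code points in U+XXXX notation.
// Array.from uses the string iterator, which walks by code point.
function listCodePoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

// "e" + COMBINING ACUTE ACCENT: renders as a single accented letter.
const accented = "e\u0301";

// MAN, ZERO WIDTH JOINER, WOMAN, ZERO WIDTH JOINER, GIRL:
// renders as one family emoji where the font supports it.
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}";

console.log(listCodePoints(accented));
// ["U+0065", "U+0301"]

console.log(listCodePoints(family));
// ["U+1F468", "U+200D", "U+1F469", "U+200D", "U+1F467"]
```

In both cases the output is correct: the extra entries are the combining mark and the U+200D joiners, not a decoding error.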