UTF-32 Hex Viewer

Show each Unicode code point as a 4-byte UTF-32 hex value

UTF-32 is the simplest Unicode encoding: every code point is stored as a fixed four-byte value. This viewer expands any text into that fixed-width byte stream in hexadecimal, in either little-endian or big-endian order.

How it works

The tool iterates over the string by Unicode code point (using the string iterator, which correctly joins surrogate pairs into a single value above U+FFFF). Each code point cp is split into four bytes:

b0 =  cp        & 0xFF   (least significant)
b1 = (cp >> 8)  & 0xFF
b2 = (cp >> 16) & 0xFF
b3 = (cp >> 24) & 0xFF
LE = b0 b1 b2 b3
BE = b3 b2 b1 b0

Because no valid code point exceeds U+10FFFF, which fits in 21 bits, the most significant byte b3 is always 00. UTF-32 still reserves all four bytes so that every code point has the same width and can be found by fixed indexing.
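The steps above can be sketched in TypeScript roughly as follows. This is an illustration, not the viewer's actual source: the function name toUtf32Hex and its littleEndian parameter are hypothetical, and the sketch assumes a runtime where for...of iterates a string by code point.

```typescript
// Hypothetical helper for illustration; not the viewer's actual source.
function toUtf32Hex(text: string, littleEndian: boolean = true): string {
  const out: string[] = [];
  // for...of walks the string by code point, joining surrogate pairs.
  for (const ch of text) {
    const cp = ch.codePointAt(0) as number;
    const bytes = [
      cp & 0xff,         // b0, least significant
      (cp >> 8) & 0xff,  // b1
      (cp >> 16) & 0xff, // b2
      (cp >> 24) & 0xff, // b3, always 0 for valid code points
    ];
    // The list above is already in little-endian order; reverse it for big-endian.
    if (!littleEndian) bytes.reverse();
    for (const b of bytes) {
      out.push(b.toString(16).toUpperCase().padStart(2, "0"));
    }
  }
  return out.join(" ");
}
```

Calling toUtf32Hex("A") gives 41 00 00 00, and toUtf32Hex("A", false) gives 00 00 00 41.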

Example and notes

The letter A (U+0041) becomes 41 00 00 00 in little-endian or 00 00 00 41 in big-endian. A rocket emoji (U+1F680) becomes the single value 80 F6 01 00 in little-endian: one code point, four bytes. In UTF-16 the same character needs a two-unit surrogate pair. UTF-32 is convenient for indexing because the number of code points always equals the byte count divided by four. Note that this counts code points, not user-perceived characters: an emoji sequence or a letter with combining marks can span several code points.
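The contrast with UTF-16 can be checked with a short TypeScript sketch. The variable names and the codePointsInUtf32 helper are hypothetical, introduced only for this example.

```typescript
const rocket = "\u{1F680}";

// UTF-16 view: JavaScript strings expose code units, and the rocket needs two.
const utf16Units: string[] = [];
for (let i = 0; i < rocket.length; i++) {
  utf16Units.push(rocket.charCodeAt(i).toString(16).toUpperCase());
}
// utf16Units is ["D83D", "DE80"], a surrogate pair

// UTF-32 view: one code point, so four bytes.
const codePointCount = Array.from(rocket).length; // 1
const utf32ByteLength = codePointCount * 4;       // 4

// Hypothetical helper: recover the code point count from a UTF-32 byte length.
function codePointsInUtf32(byteLength: number): number {
  if (byteLength % 4 !== 0) {
    throw new RangeError("UTF-32 byte length must be a multiple of 4");
  }
  return byteLength / 4;
}
```

The helper assumes the byte stream carries no byte order mark; a leading BOM would add four bytes that are not part of the text.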
