Which input formats are accepted?

You can mix U+1F30D, 0x41, backslash-u escapes and bare decimal numbers in one list, separated by spaces or commas. Tokens that contain hex letters or a hex prefix are read as hexadecimal; all-digit tokens are read as decimal.

How is a code point turned into characters?

Each numeric code point is passed to the standard String.fromCodePoint, which returns the matching character as a JavaScript string. This works for emoji and other characters above U+FFFF, which JavaScript stores internally as a surrogate pair.

Why are some values rejected?

Values above U+10FFFF are outside the Unicode range, and values in U+D800 to U+DFFF are surrogate halves that are not valid standalone characters. Both are flagged as errors rather than producing broken output.
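As a rough illustration of why an explicit check is useful, the snippet below shows how the builtin behaves on its own for these two cases. It only demonstrates standard String.fromCodePoint behavior and is not taken from the tool's source; the variable names are made up for the example.

```javascript
// Over-range input: the builtin throws a RangeError.
let overRangeThrows = false;
try {
  String.fromCodePoint(0x110000);
} catch (err) {
  overRangeThrows = err instanceof RangeError;
}

// Surrogate input: no exception, just a lone surrogate code unit,
// which is not valid text on its own.
const lone = String.fromCodePoint(0xd800);

console.log(overRangeThrows); // true
console.log(lone.length, lone.charCodeAt(0).toString(16)); // 1 d800
```

Because the builtin accepts surrogate values silently, a converter that wants to flag them has to test for the U+D800 to U+DFFF range itself.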

Can I paste a bare hex list without prefixes?

Yes. If the values contain hex letters (a-f), they are detected as hex automatically. Ambiguous all-digit tokens are treated as decimal, so prefix them with 0x or U+ if you mean hex.

Is this the reverse of the code-point exploder?

Exactly. The UTF-8-to-code-points tool decomposes text into numbers; this tool recomposes those numbers back into text, making the two a matched encode/decode pair.

What is the Unicode Code Points to UTF-8 tool?

It turns a list of Unicode code points (written as U+XXXX, 0x, backslash-u or plain decimal) back into readable UTF-8 text. It mixes notations freely and rejects invalid surrogate or out-of-range values. It runs free and instantly in your browser on Gera Tools, with nothing uploaded.

Unicode Code Points to UTF-8

Name: Unicode Code Points to UTF-8
Creator: Gera Tools
License: https://creativecommons.org/licenses/by/4.0/

From code points back to text

This tool is the inverse of a code-point exploder: give it the numeric Unicode code points and it reassembles the original characters. It is handy when a log, API response or escape sequence gives you raw code-point numbers and you want to see what they actually spell.
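The round trip can be sketched in a few lines of JavaScript. The function names toCodePoints and fromCodePoints are hypothetical stand-ins for the two directions, not the tools' real code.

```javascript
// Hypothetical "exploder" direction: text -> array of code points.
// Array.from iterates by code point, so an emoji yields one number.
function toCodePoints(text) {
  return Array.from(text, (ch) => ch.codePointAt(0));
}

// Hypothetical inverse direction: code points -> text.
function fromCodePoints(points) {
  return String.fromCodePoint(...points);
}

const points = toCodePoints("Hello \u{1F30D}");
console.log(points); // [ 72, 101, 108, 108, 111, 32, 127757 ]
console.log(fromCodePoints(points) === "Hello \u{1F30D}"); // true
```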

When you need this tool

Several situations produce raw code-point lists instead of readable text:

Debug logs and crash dumps — some runtimes print strings as decimal or hex code points when the encoding is not clear.
Protocol buffers / binary diffs — tools often annotate bytes with decimal code-point values alongside the raw bytes.
Language specs and reference tables — Unicode character charts list entries as U+XXXX; paste a column of them here to see what they look like.
Regex engine traces — some engines report matching positions in code points, not bytes.
Data migrations — older databases may export text as CSV columns of decimal integers when the column type was misconfigured.

How it works

The input is split on spaces and commas into tokens, and each token is classified:

U+, 0x, \u, \U prefix      -> hexadecimal
contains a-f letters       -> hexadecimal
all decimal digits         -> decimal

The token is parsed to an integer in the matching base and validated against the Unicode range (U+0000 to U+10FFFF) and against the surrogate gap (U+D800 to U+DFFF, which is reserved for UTF-16 surrogate halves and does not encode characters on its own). Valid values are then passed to String.fromCodePoint. That builtin correctly emits a single character even for astral code points that JavaScript must store internally as a UTF-16 surrogate pair.
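The steps above can be sketched roughly as follows. This is a minimal approximation under stated assumptions, not the tool's actual source: the names parseToken and decodeCodePoints, the error messages, and the choice to skip invalid tokens are all invented for illustration, and braced escapes such as \u{1F30D} are not handled.

```javascript
const MAX_CODE_POINT = 0x10ffff;

// Classify one token and return { value } or { error } (hypothetical helper).
function parseToken(token) {
  // U+, 0x, \u and \U prefixes always mean hexadecimal.
  const prefixed = /^(?:U\+|0x|\\u)([0-9a-f]+)$/i.exec(token);
  let value;
  if (prefixed) {
    value = parseInt(prefixed[1], 16);
  } else if (/^[0-9]+$/.test(token)) {
    value = parseInt(token, 10); // all digits: decimal
  } else if (/^[0-9a-f]+$/i.test(token)) {
    value = parseInt(token, 16); // bare token with hex letters
  } else {
    return { error: `unrecognized token: ${token}` };
  }
  if (value > MAX_CODE_POINT) {
    return { error: `out of range: ${token}` };
  }
  if (value >= 0xd800 && value <= 0xdfff) {
    return { error: `surrogate half: ${token}` };
  }
  return { value };
}

// Split on spaces and commas, convert valid tokens, collect errors.
function decodeCodePoints(input) {
  const tokens = input.split(/[\s,]+/).filter((t) => t !== "");
  let text = "";
  const errors = [];
  for (const token of tokens) {
    const result = parseToken(token);
    if (result.error) {
      errors.push(result.error);
    } else {
      text += String.fromCodePoint(result.value);
    }
  }
  return { text, errors };
}

console.log(decodeCodePoints("U+0048 0x65 108 U+006C 6F").text); // Hello
console.log(decodeCodePoints("D800, 0x110000").errors.length); // 2
```

Checking the all-digit case before the hex-letter case keeps ambiguous tokens such as 41 in decimal, which matches the classification table.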

Worked example

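Suppose a log entry prints: 72 101 108 108 111 32 127757

These are the decimal code points for the ASCII letters H-e-l-l-o, a space, and then the Unicode character U+1F30D (Earth Globe Europe-Africa emoji, code point 127757 in decimal). Paste the seven numbers in and the tool produces Hello 🌍. The emoji is a single number here; a dump that shows 240 159 140 141 instead is listing its four UTF-8 bytes, not its code point, and would not decode to the emoji.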

Notice that the emoji uses a code point above U+FFFF — an “astral” code point. The tool handles this correctly via String.fromCodePoint rather than the older String.fromCharCode, which only covers the Basic Multilingual Plane.
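The difference between the two builtins is easy to see with U+1F30D. The snippet below relies only on standard JavaScript behavior; the variable names are illustrative.

```javascript
const viaCodePoint = String.fromCodePoint(0x1f30d);
const viaCharCode = String.fromCharCode(0x1f30d);

// fromCodePoint yields one character stored as two UTF-16 code units.
console.log(viaCodePoint.length); // 2
console.log([...viaCodePoint].length); // 1
console.log(viaCodePoint.charCodeAt(0).toString(16)); // d83c
console.log(viaCodePoint.charCodeAt(1).toString(16)); // df0d

// fromCharCode truncates its argument to 16 bits, giving U+F30D instead.
console.log(viaCharCode.charCodeAt(0).toString(16)); // f30d
```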

Tips and notes

You can mix notations in a single list — U+0048 0x65 108 U+006C 6F is valid and resolves to “Hello”. If you have an ambiguous all-digit value that is meant to be hex, add a 0x or U+ prefix so it is not misread as decimal. Surrogate or over-range values are rejected with a clear message rather than silently producing the replacement character.
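To see why the prefix matters, consider the token 41, which is a valid number in both bases. The sketch below uses plain parseInt with an explicit radix to show the two readings; it illustrates the ambiguity rather than the tool's own parsing code.

```javascript
const token = "41";

// Read as decimal (the default for all-digit tokens): U+0029.
const asDecimal = String.fromCodePoint(parseInt(token, 10));

// Read as hexadecimal (what 0x41 or U+0041 would mean): U+0041.
const asHex = String.fromCodePoint(parseInt(token, 16));

console.log(asDecimal); // )
console.log(asHex); // A
```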