GB2312 and its backward-compatible superset GBK are the standard legacy encodings for Simplified Chinese in mainland China. ASCII stays single-byte, while each Chinese character is stored as a two-byte sequence. This tool encodes Simplified Chinese text into GBK hex bytes and decodes GBK bytes back into text, using the wider GBK table so that GB2312 data round-trips too.
How it works
ASCII characters 0x00–0x7F are one byte. Each Chinese character is two bytes:
a lead byte in 0x81–0xFE followed by a trail byte in 0x40–0xFE (skipping
0x7F). GB2312 occupies a subset of this space, and GBK fills in the rest, which
is why a single GBK decoder handles both.
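As a rough illustration of the byte layout described above, the following Python sketch splits a byte string into one-byte ASCII units and two-byte lead/trail pairs. The function name `split_gbk_units` is hypothetical and not part of this tool; it checks only the byte ranges, not whether a pair is actually assigned in the GBK table.

```python
from typing import List


def split_gbk_units(data: bytes) -> List[bytes]:
    """Split bytes into 1-byte ASCII units and 2-byte lead/trail pairs.

    Hypothetical helper: validates ranges only, not table membership.
    """
    units = []
    i = 0
    while i < len(data):
        b = data[i]
        if b <= 0x7F:
            # ASCII: a single byte
            units.append(data[i:i + 1])
            i += 1
        elif 0x81 <= b <= 0xFE:
            # Lead byte: a trail byte must follow
            if i + 1 >= len(data):
                raise ValueError(f"truncated pair at offset {i}")
            t = data[i + 1]
            # Trail byte range is 0x40-0xFE, skipping 0x7F
            if not (0x40 <= t <= 0xFE) or t == 0x7F:
                raise ValueError(f"invalid trail byte 0x{t:02X} at offset {i + 1}")
            units.append(data[i:i + 2])
            i += 2
        else:
            # 0x80 and 0xFF are not lead bytes in the layout described above
            raise ValueError(f"invalid lead byte 0x{b:02X} at offset {i}")
    return units


units = split_gbk_units(bytes.fromhex("41 d6 d0 ce c4"))
print([u.hex() for u in units])
```

Here the input mixes one ASCII byte (`41`, the letter A) with two two-byte pairs, so the result has three units.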
To stay faithful to the real table, the tool enumerates the single-byte range and every valid lead/trail pair, decodes each with the browser’s native GBK decoder, and builds a character-to-bytes map. Encoding looks each character up in that map; decoding runs the hex bytes through the native decoder.
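The enumeration approach can be sketched in Python as follows. This is only an approximation of the idea, not the tool's own code: it assumes Python's built-in `gbk` codec in place of the browser's native decoder (the two tables may differ in a few code points), and the names `build_gbk_map`, `encode_to_hex`, and `decode_hex` are hypothetical.

```python
def build_gbk_map() -> dict:
    """Build a character-to-bytes map by decoding every candidate sequence."""
    table = {}
    # Single-byte range: 0x00-0x7F maps to itself
    for b in range(0x00, 0x80):
        table[chr(b)] = bytes([b])
    # Every lead/trail pair: lead 0x81-0xFE, trail 0x40-0xFE except 0x7F
    for lead in range(0x81, 0xFF):
        for trail in range(0x40, 0xFF):
            if trail == 0x7F:
                continue
            pair = bytes([lead, trail])
            try:
                ch = pair.decode("gbk")  # assumption: stdlib codec, not the browser's
            except UnicodeDecodeError:
                continue  # pair is unassigned in this codec's table
            if len(ch) == 1:
                # Keep the first byte sequence seen for each character
                table.setdefault(ch, pair)
    return table


def encode_to_hex(text: str, table: dict):
    """Return (space-separated hex string, list of unmapped characters)."""
    hex_bytes = []
    unmapped = []
    for ch in text:
        seq = table.get(ch)
        if seq is None:
            unmapped.append(ch)  # flag characters that have no GBK mapping
        else:
            hex_bytes.extend(f"{b:02x}" for b in seq)
    return " ".join(hex_bytes), unmapped


def decode_hex(hex_string: str) -> str:
    """Decode space-separated hex bytes with the codec's own decoder."""
    return bytes.fromhex(hex_string).decode("gbk")


table = build_gbk_map()
print(encode_to_hex("中文", table))
print(decode_hex("d6 d0 ce c4"))
```

Building the map once up front keeps encoding to a dictionary lookup per character, and any character missing from the map can be reported rather than silently replaced.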
Example and notes
- "中文" encodes to d6 d0 ce c4: two characters, each a two-byte pair, with 中 as D6 D0 and 文 as CE C4.
- GBK targets Simplified Chinese; characters outside its table, including some Traditional-only characters and symbols, are flagged as unmapped.
- For text that mixes scripts or needs emoji and rare characters, UTF-8 is the modern choice; GBK remains useful for interoperating with legacy Chinese files and systems.