Character count and byte size are not the same thing, and confusing them causes real bugs — truncated database fields, over-length API payloads and split SMS messages. The byte counter measures the exact UTF-8 byte size of your text using the browser’s own encoder, alongside UTF-16 size and character, code-point, word and line counts.
Characters, code points and bytes
Three different numbers describe “how big” a piece of text is. Characters (JavaScript string length) count UTF-16 code units, so an emoji outside the Basic Multilingual Plane registers as two. Code points count each Unicode code point once. Bytes count the encoded size; this tool’s headline figure uses UTF-8, the standard encoding for web pages and JSON. For pure ASCII text these numbers coincide. For anything with accents, typographic quotes, non-Latin scripts or emoji they diverge: “é” is one character and one code point but two UTF-8 bytes. The byte figure is the one that matters for storage and transmission limits.
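To make the three numbers concrete, here is a minimal sketch that measures the same string three ways. The helper name `measure` is hypothetical and not part of the tool; the sample strings use escapes so the exact code points are unambiguous.

```javascript
// Hypothetical helper: report one string's size three ways.
function measure(text) {
  return {
    codeUnits: text.length,                            // UTF-16 code units (String.length)
    codePoints: Array.from(text).length,               // string iteration walks code points
    utf8Bytes: new TextEncoder().encode(text).length,  // encoded size in UTF-8
  };
}

console.log(measure("hello"));          // { codeUnits: 5, codePoints: 5, utf8Bytes: 5 }
console.log(measure("caf\u00e9"));      // { codeUnits: 4, codePoints: 4, utf8Bytes: 5 }
console.log(measure("\u65e5\u672c\u8a9e")); // { codeUnits: 3, codePoints: 3, utf8Bytes: 9 }
console.log(measure("\u{1F600}"));      // { codeUnits: 2, codePoints: 1, utf8Bytes: 4 }
```

Only the ASCII string gives the same answer on all three lines; every other sample disagrees somewhere.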
Where byte size bites
- Database columns declared in bytes truncate multi-byte text early.
- API and cookie size caps are enforced in bytes, not characters.
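- QR codes and SMS segments have fixed capacities that depend on the encoding, not the character count: an SMS segment holds 160 GSM-7 characters, but only 70 once any character forces UCS-2 encoding.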
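When a limit is expressed in bytes, cutting a string by character count can either overshoot the limit or slice a multi-byte character in half. The sketch below shows one way to trim text to a UTF-8 byte budget on code-point boundaries. The function name `truncateToBytes` and the budgets used are illustrative assumptions, not something the tool itself provides.

```javascript
// Hypothetical helper: keep as many whole code points as fit in maxBytes of UTF-8.
function truncateToBytes(text, maxBytes) {
  const encoder = new TextEncoder();
  let used = 0;
  let out = "";
  for (const ch of text) {                 // for...of yields one code point at a time
    const size = encoder.encode(ch).length; // 1 to 4 bytes for this code point
    if (used + size > maxBytes) break;      // stop before exceeding the budget
    used += size;
    out += ch;
  }
  return out;
}

console.log(truncateToBytes("na\u00efve caf\u00e9", 6));      // "naïve" (exactly 6 bytes)
console.log(truncateToBytes("\u65e5\u672c\u8a9e", 4));        // "日" (3 bytes; the next character would not fit)
```

This never emits a partial code point, but it can still separate a base letter from a following combining mark or break a multi-part emoji sequence; handling those would need grapheme-aware segmentation.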
How it is measured
The UTF-8 figure comes from the browser’s TextEncoder, which produces the same byte sequence your text has when it is sent or saved as UTF-8, so the count is exact rather than an estimate. Everything is computed locally; your text never leaves the page.
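As a rough cross-check of why the encoder's count is exact, the sketch below compares it with the UTF-8 length rule applied by hand: one byte for code points below U+0080, two below U+0800, three below U+10000, and four otherwise. Both function names are hypothetical, and this is an illustration rather than the tool's actual source.

```javascript
// Hypothetical helper: byte length as reported by the platform encoder.
function utf8LengthByEncoder(text) {
  return new TextEncoder().encode(text).length;
}

// Hypothetical helper: byte length derived from each code point's range.
function utf8LengthByRule(text) {
  let bytes = 0;
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp < 0x80) bytes += 1;          // ASCII
    else if (cp < 0x800) bytes += 2;    // e.g. accented Latin, Greek, Cyrillic
    else if (cp < 0x10000) bytes += 3;  // rest of the Basic Multilingual Plane
    else bytes += 4;                    // supplementary planes, e.g. most emoji
  }
  return bytes;
}

const sample = "A\u00e9\u20ac\u{1F600}"; // one character from each size class
console.log(utf8LengthByEncoder(sample)); // 10
console.log(utf8LengthByRule(sample));    // 10
```

The two agree because the encoder applies the same fixed rule; using the built-in simply avoids reimplementing it.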