Why is the byte count larger than the character count?

In UTF-8, only basic ASCII characters take one byte each. Accented Latin, Greek and Cyrillic letters take two bytes, most other common characters take three, and emoji and other astral-plane characters take four. So any non-ASCII text has more bytes than characters.
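The byte widths above follow directly from the code point ranges UTF-8 defines. A minimal sketch of that rule, cross-checked against the browser's TextEncoder; the function names `utf8BytesForCodePoint` and `utf8Length` are illustrative and not part of the tool:

```javascript
// Bytes UTF-8 needs for a single code point, decided by its range.
function utf8BytesForCodePoint(codePoint) {
  if (codePoint <= 0x7f) return 1;    // basic ASCII
  if (codePoint <= 0x7ff) return 2;   // accented Latin, Greek, Cyrillic and similar
  if (codePoint <= 0xffff) return 3;  // rest of the Basic Multilingual Plane
  return 4;                           // astral planes, including most emoji
}

// Sum the per-code-point widths; for...of walks a string by code point.
function utf8Length(text) {
  let total = 0;
  for (const ch of text) {
    total += utf8BytesForCodePoint(ch.codePointAt(0));
  }
  return total;
}

// Print each sample with the computed width and TextEncoder's answer.
const samples = ["A", "\u00e9", "\u03a9", "\u0436", "\u20ac", "\u6f22", "\u{1F600}"];
for (const s of samples) {
  console.log(s, utf8Length(s), new TextEncoder().encode(s).length);
}
```

The two numbers printed on each line should agree; in practice the TextEncoder call alone is enough, and the range table is shown only to make the rule visible.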

What is the difference between characters, code points and bytes?

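Characters here means JavaScript string length (UTF-16 code units), where an emoji counts as at least two. Code points count each Unicode scalar value as one, so a single-code-point emoji such as 😀 counts as one, while emoji built from sequences (flags, skin-tone variants, family emoji) count as several. Bytes is the encoded size on disk or on the wire, with UTF-8 used for the main figure. The byte count differs from the character count for anything outside basic ASCII; characters and code points differ only for text outside the Basic Multilingual Plane, such as emoji.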
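The three numbers can be read off a string with standard JavaScript. This is a sketch rather than the tool's actual source; `textSizes` is a hypothetical helper name:

```javascript
// Return the three "sizes" of a string side by side.
function textSizes(text) {
  return {
    characters: text.length,                          // UTF-16 code units
    codePoints: Array.from(text).length,              // Array.from iterates by code point
    utf8Bytes: new TextEncoder().encode(text).length  // encoded UTF-8 size
  };
}

console.log(textSizes("hello"));          // characters 5, codePoints 5, utf8Bytes 5
console.log(textSizes("h\u00e9llo"));     // "héllo": characters 5, codePoints 5, utf8Bytes 6
console.log(textSizes("hi \u{1F600}"));   // "hi 😀": characters 5, codePoints 4, utf8Bytes 7
```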

Byte size matters whenever a limit is measured in bytes rather than characters: database columns whose length is declared in bytes (some databases count VARCHAR(n) in bytes, others in characters), API request-body caps, cookie size limits, QR-code capacity, and SMS segments. Checking bytes avoids surprise truncation when your text contains multi-byte characters.

What is the Byte Counter?

Byte Counter is a free tool that measures the exact UTF-8 byte size of any text using the browser's TextEncoder, alongside UTF-16 size and character, code-point, word and line counts. It is ideal for checking database limits, API payloads and SMS. It runs entirely in your browser on Gera Tools, with nothing uploaded.

Byte Counter — Gera Tools

Name: Byte Counter
Creator: Gera Tools
License: https://creativecommons.org/licenses/by/4.0/

Character count and byte size are not the same thing, and confusing them causes real bugs — truncated database fields, over-length API payloads and split SMS messages. The byte counter measures the exact UTF-8 byte size of your text using the browser’s own encoder, alongside UTF-16 size and character, code-point, word and line counts.

Characters, code points and bytes

Three different numbers describe “how big” a piece of text is. Characters (JavaScript string length) count UTF-16 code units, so an emoji registers as at least two. Code points count each Unicode scalar value once. Bytes count the encoded size; this tool’s headline figure uses UTF-8, the dominant encoding on the web and the standard encoding for JSON. For plain ASCII English these numbers coincide. For accented letters and non-Latin scripts the byte count pulls ahead of the other two, and for emoji all three diverge. The byte figure is the one that matters for storage and transmission limits.

Where byte size bites

Database columns declared in bytes truncate multi-byte text early.
API and cookie size caps are enforced in bytes, not characters.
QR codes and SMS segments have byte-based capacity limits.
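When a limit like the ones above is expressed in bytes, text can be trimmed to fit without cutting a multi-byte character in half. A minimal sketch, assuming the limit applies to the UTF-8 encoding; `truncateToBytes` is a hypothetical helper, not something the tool exposes:

```javascript
// Keep as many whole code points as fit within maxBytes of UTF-8.
function truncateToBytes(text, maxBytes) {
  const encoder = new TextEncoder();
  let used = 0;
  let result = "";
  for (const ch of text) {                    // one code point per iteration
    const size = encoder.encode(ch).length;   // 1 to 4 bytes
    if (used + size > maxBytes) break;        // next code point would overflow
    used += size;
    result += ch;
  }
  return result;
}

console.log(truncateToBytes("h\u00e9llo", 3));  // "hé" (1 + 2 bytes)
console.log(truncateToBytes("h\u00e9llo", 2));  // "h"  (the two-byte é no longer fits)
```

This works at code point granularity, so it never emits a broken byte sequence, but it can still split a visible symbol made of several code points, such as a flag or a letter followed by a combining accent.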

How it is measured

The UTF-8 figure comes from the browser’s TextEncoder, which produces the same bytes your text becomes when it is sent or saved as UTF-8, so it is exact, not an approximation. Everything is computed locally; your text never leaves the page.
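A rough sketch of how such a local measurement could look. Only the use of TextEncoder is taken from the description above. The UTF-16 size (two bytes per code unit, no byte-order mark) and the word and line rules are assumptions made for illustration, and `measure` is a hypothetical name:

```javascript
// Compute byte sizes and simple counts without any network access.
function measure(text) {
  const trimmed = text.trim();
  return {
    utf8Bytes: new TextEncoder().encode(text).length,
    utf16Bytes: text.length * 2,  // assumption: 2 bytes per code unit, no BOM
    // Assumption: words are runs of non-whitespace characters.
    words: trimmed === "" ? 0 : trimmed.split(/\s+/).length,
    // Assumption: empty text has 0 lines; otherwise line breaks + 1.
    lines: text === "" ? 0 : text.split(/\r\n|\r|\n/).length
  };
}

console.log(measure("one two\nthree"));
// utf8Bytes 13, utf16Bytes 26, words 3, lines 2
```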