Chinese uses full-width punctuation that differs from Latin marks: the sentence
full stop is the small circle 。 (U+3002), the comma is , (U+FF0C) or the
enumeration comma 、 (U+3001), and the ellipsis is a doubled ……. A counter built
for English would miss the 。 and would find no word boundaries to work with.
This tool splits on Chinese sentence-ending marks and counts Han characters
instead of words.
How it works
The text is split on runs of sentence-ending marks: the full stop 。, the
full-width question mark ?, the full-width exclamation mark !, the ASCII
equivalents ! ? ., and the ellipsis …. Consecutive marks, including the
doubled Chinese ellipsis ……, collapse into a single break.
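The splitting step can be sketched in Python as follows. This is an illustration built from the description above, not the tool's own code; the names SENTENCE_END and split_segments are hypothetical.

```python
import re

# Assumed set of sentence-ending marks, taken from the description:
#   \u3002 = ideographic full stop, \uff1f = full-width question mark,
#   \uff01 = full-width exclamation mark, then ASCII ! ? . and
#   \u2026 = horizontal ellipsis.
# The trailing + makes a run of marks (e.g. a doubled ellipsis) one break.
SENTENCE_END = re.compile("[\u3002\uff1f\uff01!?.\u2026]+")


def split_segments(text: str) -> list[str]:
    """Split text on runs of sentence-ending marks, dropping empty pieces."""
    pieces = SENTENCE_END.split(text)
    return [piece.strip() for piece in pieces if piece.strip()]
```

Because the ASCII dot is in the set, this sketch would also break inside a decimal such as 3.14; whether the tool guards against that is not stated.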
Each segment that contains Han characters or alphanumerics counts as one
sentence. Because Chinese is unspaced, the tool reports Han characters and
characters-per-sentence rather than words. The enumeration comma 、 and the
full-width comma , join clauses, not sentences, so they are never split on.
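The counting rule can be sketched the same way. The code below takes already-split segments as input and rests on three labeled assumptions: "Han characters" is taken to mean the CJK Unified Ideographs block plus Extension A, "alphanumerics" is taken to mean ASCII letters and digits, and characters-per-sentence is taken to mean Han characters divided by sentences. The names HAN, is_sentence, and summarize are hypothetical.

```python
import re

# Assumption: Han = CJK Unified Ideographs (U+4E00-U+9FFF) plus
# Extension A (U+3400-U+4DBF); rarer extension blocks are left out.
HAN = re.compile("[\u3400-\u4dbf\u4e00-\u9fff]")


def is_sentence(segment: str) -> bool:
    """True if the segment has a Han character or an ASCII letter or digit."""
    if HAN.search(segment):
        return True
    return any(ch.isascii() and ch.isalnum() for ch in segment)


def summarize(segments: list[str]) -> dict:
    """Count sentences and Han characters, and average Han per sentence."""
    sentences = [seg for seg in segments if is_sentence(seg)]
    han_count = sum(len(HAN.findall(seg)) for seg in sentences)
    average = han_count / len(sentences) if sentences else 0.0
    return {
        "sentences": len(sentences),
        "han_characters": han_count,
        "han_per_sentence": average,
    }
```

A segment made only of commas or whitespace fails is_sentence, so stray punctuation between breaks does not inflate the count.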
Example
The passage:
中文很美。你会说吗?
is two sentences: a statement closed by 。 and a question closed by ?. A
comma between clauses would not raise the count.
Notes
- The full stop 。 is the primary sentence ending, not the dot.
- The commas 、 and , are clause separators and are ignored.
- Han-character counts are used instead of word counts because Chinese is unspaced.
- Everything runs locally; your text never leaves the browser.