Overlong sentences are one of the most common readability problems in Chinese writing. Because Chinese has no spaces between words and packs meaning densely into each character, a sentence that runs past 40–50 characters becomes hard to parse at a glance. This tool splits your Simplified Chinese text into sentences, measures each one in Han characters, and shows you the distribution so you can find and fix the run-ons.
How it works
The tool segments text on the Chinese sentence-terminating punctuation marks (the full stop 。, exclamation mark ! and question mark ?) plus their ASCII equivalents . ! ?. Each chunk between these marks is one sentence.
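The segmentation step can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual source: the names TERMINATORS and split_sentences are made up for the example, and keeping each terminator attached to its sentence is an assumption.

```python
import re

# Sentence terminators: Chinese 。!? plus their ASCII equivalents . ! ?
TERMINATORS = "。!?.!?"

# A run of non-terminators followed by any terminators that close it.
_SENTENCE_RE = re.compile(
    f"[^{re.escape(TERMINATORS)}]+[{re.escape(TERMINATORS)}]*"
)


def split_sentences(text: str) -> list[str]:
    """Split text into sentences, keeping each terminator with its sentence."""
    pieces = (m.group().strip() for m in _SENTENCE_RE.finditer(text))
    return [p for p in pieces if p]  # drop whitespace-only chunks


if __name__ == "__main__":
    for sentence in split_sentences("今天天气很好。你吃饭了吗?太好了!"):
        print(sentence)
```

One consequence of treating the ASCII full stop as a terminator is that a decimal such as 3.14 is split in two under this sketch; a stricter splitter would need an extra rule for digits.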
For each sentence it counts only CJK ideographs (the Unicode range U+4E00–U+9FFF and common extensions), ignoring Latin letters, spaces and punctuation. Sentences are then sorted into buckets:
- Short: 1–15 characters
- Medium: 16–40 characters
- Long: more than 40 characters (flagged)
You can change the long threshold to match your house style.
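As a rough sketch of the counting and bucketing rules above, assuming Python: the helper names han_length and bucket are hypothetical, "common extensions" is interpreted here as CJK Extension A and Extension B (the exact set the tool uses is not stated), and the "empty" label for sentences with no Han characters is an addition for completeness.

```python
import re

# Base CJK block U+4E00–U+9FFF, plus (as an assumption) Extension A
# U+3400–U+4DBF and Extension B U+20000–U+2A6DF.
HAN_RE = re.compile("[\u4e00-\u9fff\u3400-\u4dbf\U00020000-\U0002a6df]")


def han_length(sentence: str) -> int:
    """Count only Han characters; Latin letters, spaces and punctuation are ignored."""
    return len(HAN_RE.findall(sentence))


def bucket(length: int, long_threshold: int = 40) -> str:
    """Classify a sentence length; long_threshold is the adjustable cutoff."""
    if length > long_threshold:
        return "long"
    if length >= 16:
        return "medium"
    if length >= 1:
        return "short"
    return "empty"  # assumption: no Han characters at all


if __name__ == "__main__":
    n = han_length("我们用 Python 3 写了一个小工具。")
    print(n, bucket(n))
```

Lowering long_threshold (for example to 30 for a stricter house style) moves more sentences from medium into long without changing the short bucket.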
Tips and example
Paste a paragraph such as 今天天气很好。我们去公园散步,看到很多花,还遇到了老朋友,聊了很久才回家。 and the tool reports two sentences: one short (6 characters) and one medium (26 characters), with an average of 16 characters and 0% over the default threshold of 40. Lower the long threshold to 20 and the second sentence is flagged as long.
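To check a report like this by hand, the summary figures can be reproduced with a short script. This is a sketch under stated assumptions: summarize is a hypothetical name, only the base CJK block is counted for brevity, and sentences without any Han characters are skipped.

```python
import re

SENTENCE_RE = re.compile("[^。!?.!?]+[。!?.!?]*")
HAN_RE = re.compile("[\u4e00-\u9fff]")  # base CJK block only, for brevity


def summarize(text: str, long_threshold: int = 40) -> dict:
    """Return per-sentence Han lengths, their average, and the share flagged long."""
    lengths = [len(HAN_RE.findall(s)) for s in SENTENCE_RE.findall(text)]
    lengths = [n for n in lengths if n > 0]  # skip chunks with no Han characters
    if not lengths:
        return {"lengths": [], "average": 0.0, "percent_long": 0.0}
    flagged = sum(1 for n in lengths if n > long_threshold)
    return {
        "lengths": lengths,
        "average": sum(lengths) / len(lengths),
        "percent_long": 100 * flagged / len(lengths),
    }


if __name__ == "__main__":
    sample = "今天天气很好。我们去公园散步,看到很多花,还遇到了老朋友,聊了很久才回家。"
    print(summarize(sample))
    print(summarize(sample, long_threshold=20))
```

Running it with two different thresholds shows how the flagged percentage responds to the house-style setting while the lengths themselves stay fixed.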
A healthy distribution is mostly short and medium sentences. If more than roughly a quarter of your sentences are flagged as long, break them at natural clause boundaries: Chinese commas , and semicolons ; are good places to start a new sentence.
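If you want a starting point for where to break a flagged sentence, a greedy grouping of clauses is one option. The sketch below is not a feature of the tool: suggest_breaks is a hypothetical helper that packs consecutive clauses together until adding the next one would exceed the threshold. Each proposed group still ends in its original comma or semicolon, which you would then replace with 。 and reword as needed.

```python
import re

HAN_RE = re.compile("[\u4e00-\u9fff]")
CLAUSE_RE = re.compile("[^,;]+[,;]?")  # a clause plus its trailing , or ;


def suggest_breaks(sentence: str, long_threshold: int = 40) -> list[str]:
    """Group clauses greedily so each group stays within long_threshold Han characters.

    A single clause longer than the threshold is kept whole in its own group.
    """
    groups: list[str] = []
    current, current_len = "", 0
    for clause in CLAUSE_RE.findall(sentence):
        n = len(HAN_RE.findall(clause))
        if current and current_len + n > long_threshold:
            groups.append(current)  # close the group before it overflows
            current, current_len = "", 0
        current += clause
        current_len += n
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    long_sentence = "我们去公园散步,看到很多花,还遇到了老朋友,聊了很久才回家。"
    for part in suggest_breaks(long_sentence, long_threshold=15):
        print(part)
```

The greedy choice keeps the sketch short; it does not judge whether a boundary is a natural place to stop, so treat its output as candidates to review rather than finished edits.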