Counting sentences in Arabic needs care because Arabic punctuation is not the
same as Latin punctuation. The question mark is mirrored (؟, U+061F), the comma
is the Arabic comma (،, U+060C), and some texts use the Arabic full stop (۔,
U+06D4) instead of the ordinary period. This counter knows the difference and
splits only on genuine sentence terminators.
How it works
The text is split on runs of sentence-ending marks: the period ., the Arabic
question mark ؟, the exclamation mark !, the Arabic full stop ۔, and the
ellipsis …. Consecutive terminators are collapsed into a single break, so a
sequence like ؟! ends just one sentence rather than creating an empty one.
Each resulting segment that still contains Arabic or alphanumeric content counts
as one sentence. Crucially, the Arabic comma ، and the Arabic semicolon ؛ are
never treated as break points, because they separate clauses inside a
sentence, not whole sentences.
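The splitting rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the names `SPLIT_RE`, `CONTENT_RE`, and `count_sentences` are hypothetical, and "Arabic or alphanumeric content" is approximated here as "at least one letter or digit in any script".

```python
import re

# Terminators named in the text: period, Arabic question mark (U+061F),
# exclamation mark, Arabic full stop (U+06D4), and ellipsis (U+2026).
# The trailing "+" collapses a run such as "؟!" into a single break.
SPLIT_RE = re.compile(r"[.\u061F!\u06D4\u2026]+")

# Assumption: a segment has "content" if it holds at least one letter or
# digit. [^\W_] matches any Unicode word character except the underscore.
CONTENT_RE = re.compile(r"[^\W_]")


def count_sentences(text: str) -> int:
    """Count segments between terminator runs that contain real content."""
    segments = SPLIT_RE.split(text)
    return sum(1 for segment in segments if CONTENT_RE.search(segment))


if __name__ == "__main__":
    # The Arabic comma is not in the terminator set, so this is one sentence.
    print(count_sentences("ذهبت إلى السوق، واشتريت خبزا."))
    # A run of two terminators still ends only one sentence.
    print(count_sentences("ماذا؟!"))
```

Because the Arabic comma and semicolon are simply absent from the terminator set, they need no special handling. One limitation of this sketch is that a period inside a number such as 3.14 would also be treated as a break.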
Example
The passage:
اللغة العربية جميلة. هل تتحدث العربية؟ نعم!
contains three sentences: a statement ending in a period, a question ending in the Arabic question mark, and an exclamation. An Arabic comma inside any of these sentences would not add to the count.
Notes
- Both the ASCII period and the Arabic full stop ۔ are accepted as endings.
- Clause separators ، and ؛ are ignored by design.
- The average-words-per-sentence figure is a quick readability check.
- Everything runs locally; your text never leaves the browser.
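
As a rough illustration of the average-words-per-sentence figure, the sketch below divides the total word count by the number of non-empty sentences. The page does not say how the tool counts words, so the whitespace-based tokenization and the name `average_words_per_sentence` are assumptions made for this example.

```python
import re

# Same terminator set as described under "How it works".
SPLIT_RE = re.compile(r"[.\u061F!\u06D4\u2026]+")
# Assumption: a token counts as a word if it has at least one letter or digit.
CONTENT_RE = re.compile(r"[^\W_]")


def average_words_per_sentence(text: str) -> float:
    """Mean number of whitespace-separated words per non-empty sentence."""
    counts = []
    for segment in SPLIT_RE.split(text):
        words = [w for w in segment.split() if CONTENT_RE.search(w)]
        if words:  # skip empty segments, e.g. after the final terminator
            counts.append(len(words))
    if not counts:
        return 0.0
    return sum(counts) / len(counts)


if __name__ == "__main__":
    passage = "اللغة العربية جميلة. هل تتحدث العربية؟ نعم!"
    # Sentences of 3, 3, and 1 words.
    print(round(average_words_per_sentence(passage), 2))
```

Splitting on whitespace keeps a word with attached diacritics or a trailing Arabic comma as a single token, which a letter-only pattern would not.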