Counting words in Bengali looks simple — split on spaces — but the danda full stop, mixed punctuation, and combining marks that appear mid-word all trip up naive counters. This tool normalises those cases so the word count matches what you would tally by hand.
How it works
The counter prepares the text before splitting:
- The danda (।) and double danda (॥) are converted to spaces, since they are sentence terminators rather than letters.
- Common Western punctuation (commas, dashes, brackets, quotes) is also converted to spaces.
- The text is split on runs of whitespace, and only tokens containing at least one Bengali letter or alphanumeric character are counted, so stray symbols or lone combining marks do not inflate the total.
Crucially, combining marks (chandrabindu ঁ, anusvara ং, hasanta ্, vowel
signs) are never treated as boundaries — they stay attached to the word that
hosts them.
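The preparation and counting steps can be sketched in Python roughly as follows. This is a minimal illustration, not the tool's actual code: the names `count_words` and `is_countable` are hypothetical, and the punctuation set is an assumed, deliberately small sample of the categories listed above.

```python
import unicodedata

# Danda and double danda.
DANDAS = "\u0964\u0965"
# Assumption: an illustrative sample of Western punctuation (comma, brackets,
# double quotes, en dash, em dash). The real tool's list may differ.
WESTERN_PUNCT = ",()[]{}\"\u201c\u201d\u2013\u2014"

# Translation table mapping every punctuation character to a space.
_TO_SPACE = {ord(ch): " " for ch in DANDAS + WESTERN_PUNCT}


def is_countable(token: str) -> bool:
    """True if the token holds at least one letter (L*) or number (N*).

    Combining marks fall in the mark categories (Mn/Mc), so a token made
    only of such marks, or only of symbols, is not counted.
    """
    return any(unicodedata.category(ch)[0] in ("L", "N") for ch in token)


def count_words(text: str) -> int:
    """Replace punctuation with spaces, split on whitespace, count real tokens."""
    cleaned = text.translate(_TO_SPACE)
    return sum(1 for token in cleaned.split() if is_countable(token))


print(count_words("আমি ভাত খাই।"))  # 3
```

Because combining marks are never in the replacement table, they are never turned into spaces, so a word such as সূর্য stays a single token.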
Example
The sentence সূর্য ওঠে — পাখি ডাকে॥ contains four words. The em dash and the
double danda are punctuation, so they are stripped before counting, leaving
সূর্য · ওঠে · পাখি · ডাকে.
Notes
- The sentence count is an estimate based on danda, double danda, and the Western terminators full stop (.), question mark (?), and exclamation mark (!).
- Bilingual text works fine: Latin words and numbers are counted alongside the Bengali words.
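One way such a sentence estimate could be computed is sketched below in Python. The function name `estimate_sentences` is hypothetical, and two choices are assumptions rather than documented behaviour: a run of consecutive terminators ends a single sentence, and a piece of text counts only if it contains a letter or digit.

```python
import re

# Danda, double danda, and the Western terminators . ? !
# Assumption: consecutive terminators (e.g. "?!") close one sentence.
TERMINATOR_RUN = re.compile(r"[\u0964\u0965.?!]+")


def estimate_sentences(text: str) -> int:
    """Split on terminator runs and count pieces that contain real text."""
    pieces = TERMINATOR_RUN.split(text)
    return sum(1 for piece in pieces if any(ch.isalnum() for ch in piece))


print(estimate_sentences("সূর্য ওঠে। পাখি ডাকে॥"))       # 2
print(estimate_sentences("Hello world! আমি ভাত খাই।"))  # 2
```

In this sketch a decimal such as 3.14 is split at the full stop and counted as a sentence break, which illustrates why a count based on terminators alone is only an estimate.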