Tamil is one of the world’s oldest living languages and is highly agglutinative: a single written word can carry a root plus several suffixes for case, tense, and politeness, all with no intervening space. This free tool counts words the conventional way — by whitespace and punctuation boundaries — and adds sentence, character, and average-length statistics so you can size essays, captions, and subtitles accurately.
How it works
The counter trims the text, then splits it on Unicode whitespace and on common punctuation, including the comma, full stop, semicolon, colon, brackets, quotes, and dashes. Each non-empty token is one word. A token is additionally classified as a Tamil word when it contains at least one character from the Tamil Unicode block (U+0B80 to U+0BFF), which keeps embedded numbers and Latin text out of the Tamil-word figure.
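The splitting and classification steps above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual source: the exact punctuation set, and the names `tokenize` and `is_tamil_word`, are assumptions made for the example.

```python
import re

# Assumed punctuation set: the marks listed in the description, plus "!" and "?"
# (an assumption, since the description says "including").
PUNCTUATION = ",.;:!?()[]{}\"'“”‘’-–—"

# A run of whitespace and/or punctuation counts as a single boundary.
BOUNDARY = re.compile(r"[\s" + re.escape(PUNCTUATION) + r"]+")

# Any character from the Tamil Unicode block, U+0B80 to U+0BFF.
TAMIL = re.compile(r"[\u0B80-\u0BFF]")


def tokenize(text: str) -> list[str]:
    """Trim the text, split on boundaries, and keep only non-empty tokens."""
    return [token for token in BOUNDARY.split(text.strip()) if token]


def is_tamil_word(token: str) -> bool:
    """A token is a Tamil word if it has at least one Tamil-block character."""
    return TAMIL.search(token) is not None


tokens = tokenize("நான் 2 books வாங்கினேன், நன்றி.")
tamil_words = [t for t in tokens if is_tamil_word(t)]
print(len(tokens), len(tamil_words))
```

In this sample, the digit and the Latin word count toward the total but not toward the Tamil-word figure.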
Sentences are found by splitting on ., !, ?, and line breaks. Characters are reported both with and without spaces, and the average word length divides the total character length of all tokens by the number of tokens.
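The sentence, character, and average-length figures can be sketched the same way. The function names below are hypothetical, and the sketch assumes that a "character" is one Unicode code point, which the description does not state; under that assumption a Tamil letter with a vowel sign or pulli counts as more than one character.

```python
import re


def count_sentences(text: str) -> int:
    """Split on ., !, ? and line breaks; skip empty pieces such as the one
    left after a final full stop."""
    parts = re.split(r"[.!?\r\n]+", text)
    return sum(1 for part in parts if part.strip())


def char_counts(text: str) -> tuple[int, int]:
    """Return (characters with spaces, characters without spaces).
    Assumption: one character is one Unicode code point."""
    with_spaces = len(text)
    without_spaces = sum(1 for ch in text if not ch.isspace())
    return with_spaces, without_spaces


def average_word_length(tokens: list[str]) -> float:
    """Total length of all tokens divided by the number of tokens."""
    if not tokens:
        return 0.0  # avoid dividing by zero on empty input
    return sum(len(token) for token in tokens) / len(tokens)


print(count_sentences("Hello there. How are you?\nFine!"))
print(char_counts("ab c"))
print(average_word_length(["ab", "cdef"]))
```

A counter that reports visible letters instead would need grapheme-cluster segmentation, which this sketch does not attempt.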
Tips and notes
Because suffixes attach without a space, a Tamil text usually has fewer words than its English translation: one Tamil word can express what English needs three or four words for. When you have a strict word limit, count the Tamil text itself rather than estimating from a translation.
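If your text was pasted from a PDF, watch for zero-width joiners and non-breaking spaces. The splitter treats standard whitespace as a boundary, but a zero-width joiner is not whitespace, so it stays inside its token and is counted as a character. The Tamil-word figure ignores tokens with no Tamil letters, so a gap between it and the total word count helps surface accidental gibberish tokens.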
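If you want to clean pasted text before counting, a small pre-processing step along these lines may help. It is only a sketch: the function name `clean_pasted_text` and the exact set of characters removed are assumptions, and the tool itself is not known to do this.

```python
# Invisible characters that are not whitespace, so a whitespace-based
# splitter leaves them inside tokens (assumed list, not exhaustive).
ZERO_WIDTH = {
    "\u200b",  # zero-width space
    "\u200c",  # zero-width non-joiner
    "\u200d",  # zero-width joiner
    "\ufeff",  # byte order mark / zero-width no-break space
}


def clean_pasted_text(text: str) -> str:
    """Turn non-breaking spaces into ordinary spaces and drop zero-width
    characters, so word and character counts reflect visible text."""
    text = text.replace("\u00a0", " ").replace("\u202f", " ")
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)


sample = "நான்\u200d\u00a0வருகிறேன்"
cleaned = clean_pasted_text(sample)
print(len(sample), len(cleaned))
```

Use the cleaned copy for counting only. Joiners can affect how some text renders, so keep the original for publishing.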