Turkish Word Counter

Word count for Turkish agglutinative text with punctuation handling

Splits Turkish text on whitespace and punctuation, counting each agglutinative token as one word however many suffixes it carries, with live sentence and character counts.

Does word count depend on word length in Turkish?

No. Turkish is agglutinative, so a single word can carry many suffixes and become very long, like evlerimizden (“from our houses”). It still counts as exactly one word, however long it is.

This counter is tuned for Turkish, an agglutinative language where suffixes stack onto a root to express meaning that other languages spread across several words. Because a long token like evlerimizden is still one word, the tool counts by token rather than guessing word boundaries from length.

How it works

The counter matches runs of letters and digits, allowing an internal apostrophe or hyphen between word characters. This keeps suffixed proper nouns like Türkiye'de together as one word. Turkish letters such as ç, ğ, ı, İ, ö, ş, and ü are treated as word characters. Sentences are detected from terminal punctuation. Unique words are counted after Turkish-aware lowercasing, which maps dotted İ to i and dotless I to ı, so capitalized and lowercase forms of the same word are not counted twice.
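The matching rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual source: the names WORD_RE, tokenize, turkish_lower, and unique_words are hypothetical, and the NFC normalization step is an added assumption so that letters typed with combining marks still match as word characters.

```python
import re
import unicodedata

# Assumed pattern: runs of letters/digits, optionally joined by a single
# internal apostrophe (straight or curly) or hyphen between word characters.
# [^\W_] means "a word character that is not an underscore".
WORD_RE = re.compile(r"[^\W_]+(?:['’-][^\W_]+)*")


def tokenize(text: str) -> list[str]:
    # Assumption: normalize to NFC so that e.g. "u" + combining diaeresis
    # becomes the single letter "ü" before matching.
    return WORD_RE.findall(unicodedata.normalize("NFC", text))


def turkish_lower(word: str) -> str:
    # Python's str.lower() is not locale-aware: it maps "I" to "i" and
    # "İ" to "i" plus a combining dot. Handle both Turkish pairs first.
    return word.replace("İ", "i").replace("I", "ı").lower()


def unique_words(text: str) -> set[str]:
    return {turkish_lower(w) for w in tokenize(text)}


print(tokenize("Türkiye'de evlerimizden çıktık."))
print(turkish_lower("IŞIK"), turkish_lower("İstanbul"))
```

Replacing İ and I before calling lower() is the key design choice here: a generic lowercasing would fold dotless I into i and count ışık and IŞIK as different words.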

Tips and example

Take “Evlerimizden çıktık.” It contains two words: the long agglutinative form evlerimizden and the verb çıktık. The longest-word statistic highlights how much grammatical information a single Turkish word can carry, which is useful when estimating reading time or comparing Turkish text length against translations. Note that Turkish sometimes uses a semicolon mid-sentence. The counter does not treat it as a sentence boundary: only periods, question marks, exclamation marks, and ellipses end a sentence.
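The example above can be reproduced with a short Python sketch. It is illustrative only: text_stats and both patterns are hypothetical names, and two behaviors are assumptions rather than documented rules of the tool. A run of terminal marks such as "..." or "?!" counts as a single boundary, and trailing words with no terminator count as one more sentence.

```python
import re

# Assumed word pattern: letters/digits with optional internal apostrophe or hyphen.
WORD_RE = re.compile(r"[^\W_]+(?:['’-][^\W_]+)*")
# Only periods, question marks, exclamation marks, and ellipses end a sentence.
# A semicolon is deliberately absent from this class.
SENTENCE_END_RE = re.compile(r"[.!?…]+")


def text_stats(text: str) -> dict:
    words = WORD_RE.findall(text)
    sentences = len(SENTENCE_END_RE.findall(text))
    # Assumption: words left over after the last terminator form one more sentence.
    tail = SENTENCE_END_RE.split(text)[-1]
    if WORD_RE.search(tail):
        sentences += 1
    # On a tie, max() keeps the first word of the greatest length.
    longest = max(words, key=len, default="")
    return {"words": len(words), "sentences": sentences, "longest": longest}


print(text_stats("Evlerimizden çıktık."))
# {'words': 2, 'sentences': 1, 'longest': 'Evlerimizden'}
print(text_stats("Evlerimizden çıktık; eve döndük. Yorgun muyduk? Evet..."))
# {'words': 7, 'sentences': 3, 'longest': 'Evlerimizden'}
```

In the second call the semicolon does not add a sentence, and the three dots after Evet count once. A sketch this simple would also treat the period in a decimal such as 3.5 as a boundary, which a real counter may need to handle separately.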