Thai is written without spaces between words; a space marks a phrase or sentence break, not a word boundary. That makes counting words a real segmentation problem. This free tool splits continuous Thai text into words with a dictionary-based longest-match algorithm that runs entirely in your browser, then counts them.
How it works
The tool uses greedy maximal matching. Starting at the beginning of the text, it searches its built-in Thai dictionary for the longest word that matches at the current position, emits that word as one token, advances past it, and repeats from the new position. When no dictionary word matches, it consumes a single grapheme cluster (a base consonant plus any stacked vowels and tone marks) as an unknown token, so the scan never stalls.
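The scan described above can be sketched in a few lines of Python. This is an illustration only, not the tool's actual code: the dictionary is a hypothetical five-word toy, the names `segment`, `cluster_end`, and `is_thai_mark` are made up for this example, and the grapheme-cluster fallback is approximated as one character plus any Thai combining marks that follow it.

```python
# Hypothetical toy dictionary; a real word list would be far larger.
DICTIONARY = {"ฉัน", "รัก", "ภาษา", "ไทย", "ภาษาไทย"}
MAX_WORD_LEN = max(len(word) for word in DICTIONARY)


def is_thai_mark(ch):
    """True for Thai combining marks (above/below vowels, tone marks, signs)."""
    return ch == "\u0e31" or "\u0e34" <= ch <= "\u0e3a" or "\u0e47" <= ch <= "\u0e4e"


def cluster_end(text, start):
    """End index of a rough grapheme cluster: one character plus trailing marks."""
    end = start + 1
    while end < len(text) and is_thai_mark(text[end]):
        end += 1
    return end


def segment(text):
    """Greedy longest-match segmentation; returns (token, in_dictionary) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match_end = None
        # Try the longest candidate first and shrink until a word matches.
        for end in range(min(len(text), pos + MAX_WORD_LEN), pos, -1):
            if text[pos:end] in DICTIONARY:
                match_end = end
                break
        known = match_end is not None
        if not known:
            # No dictionary word starts here: consume one cluster instead.
            match_end = cluster_end(text, pos)
        tokens.append((text[pos:match_end], known))
        pos = match_end
    return tokens


print(segment("ฉันรักภาษาไทย"))
# [('ฉัน', True), ('รัก', True), ('ภาษาไทย', True)]
```

Note that `ภาษาไทย` wins over the shorter `ภาษา` because longer candidates are tried first. A production implementation would more likely store the dictionary in a trie so that each position is scanned once rather than sliced repeatedly.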
Latin runs, digit runs, and punctuation are each tokenised separately, and explicit spaces act as hard boundaries. The word count is the number of Thai tokens (dictionary words and unknown tokens) plus the number of Latin and numeric tokens; punctuation tokens are not counted.
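One way to picture the run splitting and the counting rule is the sketch below. It rests on several assumptions that are not taken from the tool itself: "Latin" is narrowed to ASCII letters, Thai is detected by code point range, the function names are invented, and the Thai segmenter is passed in as a parameter (here replaced by a hypothetical lookup table standing in for the dictionary scan).

```python
import itertools


def char_class(ch):
    """Classify one character; the categories are simplified for illustration."""
    if ch.isdigit():
        return "digit"
    if "\u0e01" <= ch <= "\u0e3a" or "\u0e40" <= ch <= "\u0e4e":
        return "thai"  # Thai letters, vowels, and tone marks
    if ch.isascii() and ch.isalpha():
        return "latin"
    if ch.isspace():
        return "space"
    return "punct"


def split_runs(text):
    """Group consecutive characters of the same class into (kind, run) pairs."""
    return [
        (kind, "".join(chars))
        for kind, chars in itertools.groupby(text, key=char_class)
    ]


def count_words(text, segment_thai):
    """Count Thai tokens plus Latin and digit runs; skip spaces and punctuation."""
    total = 0
    for kind, run in split_runs(text):
        if kind == "thai":
            total += len(segment_thai(run))
        elif kind in ("latin", "digit"):
            total += 1
    return total


# Hypothetical stand-in for the dictionary segmenter, hard-coded for this demo.
PRESEGMENTED = {"ฉันใช้": ["ฉัน", "ใช้"], "ทุกวัน": ["ทุก", "วัน"]}

print(count_words("ฉันใช้ Python 3 ทุกวัน!", PRESEGMENTED.__getitem__))  # 6
```

The count of 6 comes from four Thai tokens plus `Python` and `3`; the spaces only separate runs and the exclamation mark is ignored.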
Tips and notes
Longest-match is fast and works well for everyday vocabulary, but it is a heuristic: proper names, technical terms, and compounds that aren’t in the dictionary may be mis-split, and a single greedy choice can occasionally pick the wrong boundary. For important documents, scan the segmented word list shown below the count to confirm the split looks right. Everything runs locally in your browser, so your text is never uploaded.
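A well-known example of a greedy choice going wrong is the string ตากลม, which can be read as ตา กลม ("round eyes") or ตาก ลม ("exposed to the wind"). The sketch below uses a hypothetical four-word dictionary and a deliberately simplified matcher (one-character fallback, no cluster handling) to show that longest-match always commits to the second reading, even though both are valid splits.

```python
# Hypothetical mini-dictionary chosen to expose the ambiguity.
WORDS = {"ตา", "ตาก", "กลม", "ลม"}


def greedy(text):
    """Simplified longest-match scan with a one-character fallback."""
    tokens, pos = [], 0
    while pos < len(text):
        end = next(
            (e for e in range(len(text), pos, -1) if text[pos:e] in WORDS),
            pos + 1,
        )
        tokens.append(text[pos:end])
        pos = end
    return tokens


def all_splits(text):
    """Every way to cover the text using dictionary words only."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        if text[:end] in WORDS:
            results += [[text[:end]] + rest for rest in all_splits(text[end:])]
    return results


print(greedy("ตากลม"))      # ['ตาก', 'ลม']
print(all_splits("ตากลม"))  # [['ตา', 'กลม'], ['ตาก', 'ลม']]
```

Both splits contain two words, so the count is unaffected here, but the meaning differs; in longer sentences an early wrong boundary can also change the number of tokens that follow.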