Tagalog Word Counter

Count words in Filipino/Tagalog text with enclitic particle handling

Count Tagalog words two ways: the standard orthographic count, and a lexical count that merges bound enclitic particles (such as na, nga, ba, pa, raw, daw) into their host word. Also reports sentences and characters. Runs in your browser.

What are enclitic particles in Tagalog?

They are short particles like na, nga, ba, pa, raw, daw, din, rin and po that are written as separate words but are grammatically bound to the word before them. They add meaning such as completion, emphasis, questions, or reported speech without standing alone semantically.

Counting Tagalog words, particles included

A plain word counter splits on spaces and calls it done. That works for English, but Tagalog sprinkles in short enclitic particles (na, nga, ba, pa, raw, daw) that are written separately yet attach to the word before them. This tool gives you both the standard orthographic count and a particle-aware lexical count so the number means what you intend.

How it works

The counter tokenizes the text on whitespace and keeps only tokens that contain at least one letter or digit. From there it produces two figures:

  • Orthographic words is simply the number of word tokens — the count generic tools report, and the one to use for length limits.
  • Lexical words walks the tokens and, whenever a token (other than the first) is a known enclitic particle, attaches it to the previous word instead of counting it separately.
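
The two counts described above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the tool's own code: the names ENCLITICS, word_tokens, normalize and count_words are hypothetical, the particle set holds only the forms named on this page, and stripping punctuation and lowercasing before the particle lookup is an assumption.

```python
import re

# Assumed particle set: only the forms named on this page.
# The tool's actual list is described as longer.
ENCLITICS = {
    "na", "nga", "ba", "pa", "raw", "daw", "din", "rin",
    "po", "ho", "lang", "naman", "kasi",
}


def word_tokens(text):
    """Split on whitespace and keep tokens with at least one letter or digit."""
    return [tok for tok in text.split() if any(ch.isalnum() for ch in tok)]


def normalize(token):
    """Strip leading/trailing punctuation and lowercase (assumed lookup form)."""
    return re.sub(r"^\W+|\W+$", "", token).lower()


def count_words(text):
    """Return the orthographic count, lexical count, and particles merged."""
    tokens = word_tokens(text)
    lexical = 0
    particles = 0
    for index, token in enumerate(tokens):
        # A particle in first position has no host word, so it counts normally.
        if index > 0 and normalize(token) in ENCLITICS:
            particles += 1  # attached to the previous word, not counted alone
        else:
            lexical += 1
    return {"orthographic": len(tokens), "lexical": lexical, "particles": particles}


print(count_words("Kumain na ba kayo?"))
```

Under these assumptions, the sample sentence yields 4 orthographic words, 2 lexical words and 2 particles. A stray token such as a lone dash is dropped by word_tokens because it contains no letter or digit.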

It also reports the number of enclitic particles detected, an approximate sentence count from terminal punctuation, and the character total without spaces.
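
The sentence and character figures could be approximated along the following lines. This is a rough sketch under stated assumptions rather than the tool's actual logic: the function names are hypothetical, the set of terminal punctuation (. ! ? …) is a guess, and counting a trailing fragment without punctuation as one sentence is a design choice made here for illustration.

```python
import re

# Assumed terminal punctuation; a run such as "?!" counts as one terminator.
TERMINATORS = r"[.!?…]+"


def count_sentences(text):
    """Approximate the sentence count from terminal punctuation."""
    count = len(re.findall(TERMINATORS, text))
    # Assumption: leftover text after the last terminator is one more sentence.
    tail = re.split(TERMINATORS, text)[-1]
    if any(ch.isalnum() for ch in tail):
        count += 1
    return count


def count_chars_no_spaces(text):
    """Count every character that is not whitespace."""
    return sum(1 for ch in text if not ch.isspace())


sample = "Kumain na ba kayo? Oo, kumain na po."
print(count_sentences(sample), count_chars_no_spaces(sample))
```

The count is approximate because abbreviations that end in a period (for example Gng.) register as sentence endings under this rule.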

Tips and example

For example, Kumain na ba kayo is four orthographic words but only two lexical words once na and ba merge onto Kumain: the unit Kumain na ba, followed by kayo. The particle list covers the common bound forms (na, nga, ba, pa, raw, daw, din, rin, po, ho, lang, naman, kasi, and others). For character-level counting that respects the ng digraph, use the companion Tagalog Character Counter.