Counting Tagalog the way the alphabet works
Counting characters in Tagalog has a twist that trips up generic tools: the ng digraph is one letter in the Abakada alphabet, not two. A word like ngayong is five letters in Abakada (ng-a-y-o-ng) but seven code points to a computer. This counter reports both numbers so you get the figure you actually need.
How it works
The tool measures the text in several ways:
- Raw characters counts Unicode code points using Array.from, which is what a textarea, database column, or character-limit field sees.
- Abakada letters first collapses every ng (case-insensitive) into a single unit, then counts letters only. So each ng contributes one letter instead of two.
- ng digraphs found is the exact difference between the letter totals before and after collapsing.
- UTF-8 bytes is computed with the browser’s TextEncoder, reflecting real storage and transmission size.
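The four measurements above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the tool's actual source: the function name countTagalog, the field names, and the use of ŋ (U+014B) as a stand-in for a collapsed ng are all assumptions made for the example.

```javascript
// Hypothetical helper: names and placeholder choice are illustrative only.
function countTagalog(text) {
  // Count Unicode letters (\p{L}); spaces, digits, and punctuation are ignored.
  const countLetters = (s) => (s.match(/\p{L}/gu) || []).length;

  // Raw characters: Array.from iterates by code point, not UTF-16 code unit.
  const rawCharacters = Array.from(text).length;

  // Letter total before collapsing: every n and g counts separately.
  const plainLetters = countLetters(text);

  // Collapse every "ng" (any case) into one placeholder letter, then count.
  const collapsed = text.replace(/ng/gi, "\u014B");
  const abakadaLetters = countLetters(collapsed);

  // Each digraph removed exactly one letter, so the difference is the count.
  const ngDigraphs = plainLetters - abakadaLetters;

  // UTF-8 bytes as stored or transmitted.
  const utf8Bytes = new TextEncoder().encode(text).length;

  return { rawCharacters, plainLetters, abakadaLetters, ngDigraphs, utf8Bytes };
}

console.log(countTagalog("ngayong"));
// { rawCharacters: 7, plainLetters: 7, abakadaLetters: 5, ngDigraphs: 2, utf8Bytes: 7 }
```

Like the rule described above, this sketch treats every n followed by g as a digraph, including inside loanwords where the two letters may be pronounced separately.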
Tips and notes
For example, the sentence Ang mga bata has one ng digraph (in Ang; mga contains no ng), so its Abakada count is one lower than its plain letter count. Each additional ng lowers the Abakada count by one more. The ñ used in Spanish loanwords (e.g. Mañana) is treated as a single letter and counts as two UTF-8 bytes when stored as the single code point U+00F1. If you need word counts or reading time instead, use the companion Tagalog Word Counter and Reading Time tools.
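The two-byte figure for ñ holds when the letter is stored as one precomposed code point. The sketch below shows how the numbers shift if the same letter arrives as n plus a combining tilde, and how String.prototype.normalize("NFC") brings it back to the precomposed form. The helper name measure is hypothetical, and whether the tool itself normalizes input is not stated here.

```javascript
// Hypothetical helper for comparing two encodings of the same visible text.
const encoder = new TextEncoder();

function measure(text) {
  return {
    codePoints: Array.from(text).length,     // Unicode code points
    utf8Bytes: encoder.encode(text).length,  // bytes after UTF-8 encoding
  };
}

const precomposed = "Ma\u00F1ana"; // ñ as the single code point U+00F1
const decomposed = "Man\u0303ana"; // n followed by combining tilde U+0303

console.log(measure(precomposed));                 // { codePoints: 6, utf8Bytes: 7 }
console.log(measure(decomposed));                  // { codePoints: 7, utf8Bytes: 8 }
console.log(measure(decomposed.normalize("NFC"))); // { codePoints: 6, utf8Bytes: 7 }
```

If pasted text gives a byte count one higher than expected per ñ, a decomposed form like this is a likely cause.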