Portuguese Word Counter

Word count for Portuguese that handles clitics and hyphenation

Counts Portuguese words in your browser, treating verb forms with attached pronouns, such as dá-lo and fazê-la, as single words. Sentence, paragraph, and character counts update live as you type.

How are enclitic pronouns counted?

Forms where a pronoun attaches to the verb with a hyphen, such as the enclitic dá-lo and fazê-la or the mesoclitic dir-se-ia, are treated as one word. The counter keeps any run of letters with internal hyphens together as a single token, so ordinary hyphenated compounds are counted as one word as well.

This word counter is built for Portuguese, where pronouns frequently attach to the verb with a hyphen. A naive counter that splits on every hyphen would over-count those forms, so this tool keeps clitic chains together as single words to match how Portuguese grammar treats them.

How it works

The counter scans the text for runs of Latin letters, allowing hyphens and apostrophes only when they sit between letters. This means a form like fazê-la is matched as one token rather than two, and the mesoclitic dir-se-ia as one token rather than three. Accented characters such as á, ã, ê, and ç are treated as letters, so a word like coração is counted as a single word. Sentence counts are based on terminal punctuation, and unique words are compared case-insensitively.
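The matching rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual source: the names WORD_RE, SENTENCE_END_RE, tokenize, count_sentences, and unique_words are hypothetical, "Latin letters" is approximated by ASCII plus the Latin-1 accented range, and the set of terminal punctuation marks is an assumption.

```python
import re

# Assumption: ASCII letters plus the Latin-1 accented range stand in for
# "Latin letters". The ranges skip the multiplication and division signs.
LETTER = r"A-Za-zÀ-ÖØ-öø-ÿ"

# One or more letters, then any number of (hyphen or apostrophe + letters)
# groups, so a hyphen or apostrophe only counts when it sits between letters.
WORD_RE = re.compile(rf"[{LETTER}]+(?:[-'’][{LETTER}]+)*")

# Assumption: a run of ., !, ? or … closes one sentence.
SENTENCE_END_RE = re.compile(r"[.!?…]+")


def tokenize(text):
    """Return the word tokens, keeping clitic chains in one piece."""
    return WORD_RE.findall(text)


def count_sentences(text):
    """Count runs of terminal punctuation (abbreviations are not handled)."""
    return len(SENTENCE_END_RE.findall(text))


def unique_words(text):
    """Return the distinct words, compared case-insensitively."""
    return {word.casefold() for word in tokenize(text)}


print(tokenize("Dir-se-ia que o coração é forte."))
print(count_sentences("Olá! Tudo bem? Sim."))
print(unique_words("Casa casa CASA"))
```

Because the hyphen must be followed by at least one letter, a trailing hyphen or a standalone dash is never absorbed into a word.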

Tips and example

Consider the sentence “Não consigo dar-lhe a resposta.” Here dar-lhe is one word, giving a count of five words rather than six. The same applies to enclitic forms like vê-los and dá-lo, and to contractions written with an apostrophe such as d'água. The clitic/compound counter shows how many tokens contain a hyphen, which is a quick way to confirm that the grammar-aware behaviour is working on your text.
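The example sentence can be checked with a short Python sketch that compares a hyphen-aware pattern against a naive one that splits on every hyphen. The patterns and variable names here are illustrative assumptions, not the tool's own code.

```python
import re

# Assumption: ASCII plus Latin-1 accented letters approximate "Latin letters".
LETTER = r"A-Za-zÀ-ÖØ-öø-ÿ"

# Hyphen-aware: letters joined by internal hyphens or apostrophes stay together.
AWARE_RE = re.compile(rf"[{LETTER}]+(?:[-'’][{LETTER}]+)*")
# Naive: every letter run is its own word, so hyphens split tokens.
NAIVE_RE = re.compile(rf"[{LETTER}]+")

sentence = "Não consigo dar-lhe a resposta."

aware = AWARE_RE.findall(sentence)
naive = NAIVE_RE.findall(sentence)
# Tokens that contain a hyphen, as a clitic/compound counter would report them.
hyphenated = [token for token in aware if "-" in token]

print(len(aware), len(naive), hyphenated)  # prints: 5 6 ['dar-lhe']
```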