Counting words in Spanish needs care because Spanish opens questions and
exclamations with the inverted marks ¿ and ¡. A naive counter can glue those
marks to the next word or miscount sentences. This tool tokenises Spanish text
correctly and reports words, unique words, sentences, paragraphs, and character
counts.
How it works
Words are matched as runs of letters and digits, including the Spanish letters
á é í ó ú ü ñ in both cases, with internal apostrophes and hyphens kept inside
a single token. Because the inverted marks are not letters, ¿Cómo estás?
yields the words Cómo and estás, with both the ¿ and the ? dropped. Sentences are
split on the closing terminators (period, exclamation mark, question mark, and
ellipsis), while the opening ¿ and ¡ are treated as sentence starters, not
endings. Unique words are counted after lower-casing, so Casa and casa count as
the same word.
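The rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool's actual code: the names WORD_RE, SENTENCE_END_RE, words, unique_word_count and sentence_count are hypothetical, the word pattern uses Python's Unicode-aware character classes (a superset of the Spanish letters listed above), and a run of terminators such as ... or ?! is assumed to close a single sentence.

```python
import re
import unicodedata

# Hypothetical pattern: runs of letters and digits, with an apostrophe or
# hyphen allowed only between two such runs. [^\W_] means "any Unicode word
# character except underscore", which covers á é í ó ú ü ñ in both cases.
WORD_RE = re.compile(r"[^\W_]+(?:['’\-][^\W_]+)*")

# Closing terminators only. The opening marks ¿ and ¡ are deliberately absent,
# so they never end a sentence. A run of terminators counts as one ending.
SENTENCE_END_RE = re.compile(r"[.!?…]+")


def words(text: str) -> list[str]:
    """Return word tokens; punctuation such as ¿ ¡ ? ! is left out."""
    # NFC-normalise so an accent typed as a combining mark stays in its word.
    return WORD_RE.findall(unicodedata.normalize("NFC", text))


def unique_word_count(text: str) -> int:
    """Count distinct words after lower-casing."""
    return len({w.lower() for w in words(text)})


def sentence_count(text: str) -> int:
    """Count segments between closing terminators that contain a word."""
    segments = SENTENCE_END_RE.split(unicodedata.normalize("NFC", text))
    return sum(1 for s in segments if WORD_RE.search(s))
```

With these definitions, words("¿Cómo estás?") returns ['Cómo', 'estás'] and unique_word_count("Casa casa") returns 1. The sketch treats every period as a sentence end, so abbreviations such as Sr. and decimals such as 3.14 would be over-split; a production counter would need extra rules for those cases.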
Tips and example
For the text ¡Hola! ¿Cómo estás? the counter reports three words (Hola,
Cómo, estás) and two sentences, with the inverted marks never attaching to a
word. Use the unique-word count to spot repeated
vocabulary in essays, and the character counts (with and without spaces) for
fields with strict length limits, such as meta descriptions or SMS messages.
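The two character counts can be illustrated with a short Python sketch. The function names char_counts and fits_limit, and the limit value passed in, are hypothetical and chosen only for illustration. The sketch counts Unicode code points after NFC normalisation, which may differ from how a given field measures length (bytes or encoded units, for example).

```python
import re
import unicodedata


def char_counts(text: str) -> tuple[int, int]:
    """Return (with_spaces, without_spaces) counts in Unicode code points."""
    # NFC-normalise so an accented letter counts as one character.
    normalised = unicodedata.normalize("NFC", text)
    with_spaces = len(normalised)
    # Remove every whitespace character: spaces, tabs and line breaks.
    without_spaces = len(re.sub(r"\s", "", normalised))
    return with_spaces, without_spaces


def fits_limit(text: str, limit: int) -> bool:
    """Check a text against a caller-supplied length limit (hypothetical)."""
    return char_counts(text)[0] <= limit


sample = "¡Hola! ¿Cómo estás?"
print(char_counts(sample))  # (19, 17)
```

The inverted marks count as characters even though they never count as words, so ¡Hola! ¿Cómo estás? is 19 characters with spaces and 17 without.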