How does it decide what counts as a word?

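A word is a maximal run of letters and digits, with any hyphen or apostrophe between two such characters kept inside the word. Every Unicode letter qualifies, so Romanian diacritics count as word characters. Whitespace and other punctuation separate words, so și, țară, and într-un are each counted as one word.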

Does it treat ș/ț and ş/ţ the same way?

Yes. The correct comma-below letters (ș U+0219, ț U+021B) and the legacy cedilla variants (ş U+015F, ţ U+0163) are all Unicode letters, so the counter treats every variant as a word character. A document that mixes both forms still counts correctly.

How are elided forms like într-un counted?

As one word. A hyphen with no surrounding spaces keeps the elided compound together, so într-un, s-a, and dintr-o each count as a single word, matching Romanian orthography.

Does it count characters too?

Yes. The tool reports total characters, characters excluding spaces, words, and sentences so you can check both word and character limits.

How are sentences counted?

Sentences are counted by runs of terminal punctuation (. ! ? and the ellipsis …). Consecutive terminators collapse to one boundary, so an ellipsis or ?! does not inflate the count.

Romanian Word Counter

Romanian uses several diacritic letters: ă, â, î, and the comma-below ș and ț. A frequent pitfall is that older documents write ș/ț with a cedilla (ş/ţ) instead. Both forms are valid Unicode letters, so a robust counter must treat every variant as part of a word rather than as a separator. This tool does that and reports word, character, and sentence totals.

How it works

The algorithm treats a word as a maximal run of letters and digits with internal hyphens and apostrophes allowed:

It matches [\p{L}\p{N}] runs, permitting an internal - or ' between two such characters.
All Romanian diacritics are Unicode letters: comma-below ș/ț (U+0219/U+021B) and cedilla ş/ţ (U+015F/U+0163) alike, plus ă, â, î. Every variant counts as a word character.
A hyphen inside a word, as in într-un or s-a, keeps the elided form as one word; a spaced dash used as punctuation separates words.
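
The matching rule above can be sketched in Python. This is a minimal illustration under stated assumptions, not the tool's actual source: the names WORD_RE and find_words are hypothetical, and because Python's standard re module does not support \p{L} or \p{N}, the sketch uses [^\W_] as a close approximation of "any Unicode letter or digit".

```python
import re

# [^\W_] means "a word character that is not an underscore", which in
# Python 3 str patterns covers Unicode letters and digits. It stands in
# for [\p{L}\p{N}] here (an approximation, see the note above).
WORD_CHAR = r"[^\W_]"

# One run of word characters, optionally followed by groups made of a
# single internal hyphen or apostrophe plus another run of word characters.
WORD_RE = re.compile(rf"{WORD_CHAR}+(?:[-']{WORD_CHAR}+)*")


def find_words(text):
    """Return the list of words found in text (hypothetical helper)."""
    return WORD_RE.findall(text)


print(find_words("într-un s-a dintr-o"))  # hyphenated forms stay whole
print(find_words("da - nu"))              # a spaced hyphen separates words
```

Because the hyphen or apostrophe must sit between two word characters, a trailing or free-standing hyphen is never absorbed into a word in this sketch.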

Characters are counted two ways: every character including spaces, and the length with whitespace removed. Sentences are counted by collapsing runs of terminal punctuation (., !, ?, …) so an ellipsis or a ?! combo counts as one boundary.
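
As a rough sketch of the two character counts and the sentence rule, the following Python may help. The function names are hypothetical, and it assumes characters are counted as Unicode code points; the tool itself may count differently (for example, by UTF-16 units or grapheme clusters).

```python
import re

# A run of one or more terminal punctuation marks: . ! ? or the ellipsis
# character (U+2026). Each run counts as a single sentence boundary.
TERMINATOR_RUN = re.compile(r"[.!?…]+")


def count_characters(text):
    """Return (total, without_whitespace), counted in code points."""
    total = len(text)
    without_whitespace = len(re.sub(r"\s+", "", text))
    return total, without_whitespace


def count_sentences(text):
    """Count runs of terminal punctuation (hypothetical helper)."""
    return len(TERMINATOR_RUN.findall(text))


print(count_characters("a b\tc\n"))
print(count_sentences("Ce?! Nu..."))
```

Under this sketch, text with no terminal punctuation yields zero sentences; how the tool treats that case is not stated here.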

Example

The text:

Într-o țară frumoasă, copiii s-au jucat… Nu-i așa?

contains eight words: Într-o, țară, frumoasă, copiii, s-au, jucat, Nu-i, and așa. The hyphenated forms Într-o, s-au, and Nu-i each count as one word because the hyphen has no surrounding spaces. By the sentence rule above, the text has two sentences, with … and ? each marking one boundary.

Notes

Because the comma-below and cedilla letters are both classified as word characters, a document that mixes both forms still counts correctly.
Numbers like 2026 count as one word; a word joined to a number by a hyphen, such as anii-1990, stays one compound word.
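
To check the claim that every variant is a Unicode letter, one option is to inspect general categories with Python's unicodedata module. This is only an illustrative check; is_word_char is a hypothetical helper, not part of the tool.

```python
import unicodedata

# Comma-below ș/ț, cedilla ş/ţ, then ă, â, î, written as escapes so the
# exact code points are unambiguous.
LETTERS = "\u0219\u021b\u015f\u0163\u0103\u00e2\u00ee"


def is_word_char(ch):
    """True if ch is in a Unicode letter (L*) or number (N*) category."""
    return unicodedata.category(ch)[0] in ("L", "N")


for ch in LETTERS:
    # Prints the code point, its general category, and its Unicode name.
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))

# The two forms are different code points, yet both are word characters,
# so word counts are unaffected even though the strings are not equal.
print("\u0219" == "\u015f", is_word_char("\u0219"), is_word_char("\u015f"))
```

All seven letters report category Ll (lowercase letter), and digits such as those in 2026 fall under Nd, so both pass the same test.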
