How does it decide what counts as a word?

A word is a maximal run of letters and digits, with internal hyphens and apostrophes kept inside. Whitespace and punctuation act as separators and Czech diacritic letters do not, so words like příliš, žluťoučký, and řeřicha are each counted as a single word.

Are háček and accented letters handled correctly?

Yes. Every Czech diacritic letter — č, š, ž, ř, ě, ň, ď, ť, ů, and the acute-accented á, é, í, ó, ú, ý — is a Unicode letter, so the counter treats all of them as word characters, never as separators.

How are hyphenated forms handled?

A hyphen with no surrounding spaces, as in česko-slovenský, keeps the compound as one word. A spaced dash used as punctuation separates the words on either side.

Does it count characters too?

Yes. The tool reports total characters, characters excluding spaces, words, and sentences so you can check both word and character limits.

How are sentences counted?

Sentences are counted by runs of terminal punctuation (. ! ? and the ellipsis …). Consecutive terminators collapse to one boundary, so an ellipsis or ?! does not inflate the count.

Czech Word Counter — Gera Tools

Czech orthography is rich in diacritics: the háček (caron) gives č, š, ž, ř, ě, ň, ď, ť, the kroužek gives ů, and acute accents give á, é, í, ó, ú, ý. A counter that treats only unaccented ASCII letters as word characters can split a word at a diacritic, breaking příliš into fragments. This tool relies on full Unicode letter classification so every Czech letter stays inside its word, and it reports accurate word, character, and sentence totals.
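
To illustrate the failure mode, here is a minimal Python sketch that contrasts an ASCII-only character class with a Unicode-aware one. Both patterns and their names are illustrative assumptions, not the tool's actual code. Python's built-in re module has no \p{L} escape, so [^\W_] (any word character except the underscore) stands in for "letter or digit".

```python
import re

# Hypothetical ASCII-only pattern: only unaccented letters and digits are word characters.
ASCII_WORD = re.compile(r"[A-Za-z0-9]+")

# Unicode-aware pattern: on Python 3 strings, \w covers Unicode letters and digits
# plus the underscore, so [^\W_] approximates "any letter or digit".
UNICODE_WORD = re.compile(r"[^\W_]+")

text = "příliš žluťoučký"

ascii_tokens = ASCII_WORD.findall(text)
unicode_tokens = UNICODE_WORD.findall(text)

print(ascii_tokens)    # ['p', 'li', 'lu', 'ou', 'k']
print(unicode_tokens)  # ['příliš', 'žluťoučký']
```

The ASCII-only pattern reports five fragments for two words, because every diacritic letter acts as a separator.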

How it works

The algorithm treats a word as a maximal run of letters and digits with internal hyphens and apostrophes allowed:

It matches [\p{L}\p{N}] runs, permitting an internal - or ' between two such characters.
Every Czech diacritic letter is a Unicode letter and counts as a word character: č š ž ř ě ň ď ť ů á é í ó ú ý and their uppercase forms.
A hyphen inside a word, as in česko-slovenský, keeps the compound as one word; a spaced dash used as punctuation separates words.
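
The word rule above can be sketched in Python as follows. This is an approximation under stated assumptions rather than the tool's own implementation: [^\W_] stands in for [\p{L}\p{N}], which Python's built-in re module does not support, only the ASCII hyphen and apostrophe are treated as internal joiners, and the names WORD_RE and find_words are hypothetical.

```python
import re

# One or more letters/digits, optionally followed by groups of
# "a single - or ' and then more letters/digits".
# A joiner therefore only counts when it sits between two word characters.
WORD_RE = re.compile(r"[^\W_]+(?:[-'][^\W_]+)*")

def find_words(text: str) -> list[str]:
    """Return every word in text, in order of appearance."""
    return WORD_RE.findall(text)

print(find_words("česko-slovenský"))    # ['česko-slovenský']
print(find_words("česko - slovenský"))  # ['česko', 'slovenský']
```

Because a joiner must be followed immediately by another letter or digit, a hyphen with spaces around it never becomes part of a word.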

Characters are counted two ways: every character including spaces, and the length with whitespace removed. Sentences are counted by collapsing runs of terminal punctuation (., !, ?, …) so an ellipsis or a ?! combo counts as one boundary.
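
A minimal Python sketch of the two character counts and the sentence count is shown below. It rests on several assumptions: characters are counted as Unicode code points, the terminator set is exactly . ! ? and the single-character ellipsis, and text with no closing terminator adds no extra sentence. The function names are hypothetical, and the tool itself may differ on these points.

```python
import re

# A run of one or more terminators counts as a single sentence boundary.
TERMINATOR_RUN = re.compile(r"[.!?…]+")

def count_characters(text: str) -> tuple[int, int]:
    """Return (all characters, characters with whitespace removed)."""
    total = len(text)  # counts code points, spaces included
    without_whitespace = len("".join(text.split()))  # split() drops all whitespace
    return total, without_whitespace

def count_sentences(text: str) -> int:
    """Count runs of terminal punctuation; each run is one boundary."""
    return len(TERMINATOR_RUN.findall(text))

print(count_characters("Ne-li. Ano?"))  # (11, 10)
print(count_sentences("Cože?! Ano..."))  # 2
```

In the second call, ?! and the three dots each collapse into one boundary, so the text counts as two sentences.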

Example

The text:

Žluťoučký kůň… Příliš ano? Ne-li.

contains five words: Žluťoučký, kůň, Příliš, ano, and Ne-li. Every diacritic stays inside its word, and the hyphen keeps Ne-li whole. The three terminators (…, ?, and .) mark three sentences.

Notes

Mixed Czech-English text and foreign product names are counted sensibly because unaccented Latin letters are word characters too.
Numbers like 2026 count as one word; a number glued to a suffix by a hyphen, such as 90-tých, stays one compound word.
