German Word Counter

Word count for German that keeps hyphenated compounds together and splits on em-dashes and en-dashes

Counts German words with the correct rules: hyphenated compounds like E-Mail-Adresse count as one word, while em-dash and en-dash act as word separators. Also reports sentences, characters, unique words, and your longest compound noun.

Why does E-Mail-Adresse count as one word?

In German a hyphen inside a compound joins parts of a single word, so E-Mail-Adresse is one lexical unit. The counter keeps hyphen-joined tokens together rather than splitting on every hyphen.

German word counting is trickier than English because the language builds long compound nouns and uses both the hyphen (as a joiner) and the dash (as a separator). This counter applies the correct rules so your totals match how a German editor would count.

How it works

A token is recognised as a sequence of German word characters — the Latin letters plus ä ö ü ß ẞ and digits — optionally joined by hyphens or apostrophes. Because the hyphen joins, E-Mail-Adresse and Donau-Dampfschiff each count as a single word.
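The token rule above can be sketched as a regular expression. This is a minimal illustration in Python, not the counter's actual source: the names WORD_CHARS, TOKEN_RE and tokenize are hypothetical, and including the capital umlauts and the typographic apostrophe (’) in the pattern is an assumption.

```python
import re

# Assumed character class: basic Latin letters, German umlauts,
# eszett (ß and ẞ), and digits.
WORD_CHARS = "A-Za-zÄÖÜäöüßẞ0-9"

# A run of word characters, then zero or more groups of
# (hyphen or apostrophe) followed by another run of word characters.
TOKEN_RE = re.compile(rf"[{WORD_CHARS}]+(?:[-'’][{WORD_CHARS}]+)*")


def tokenize(text: str) -> list:
    """Return the tokens in text, keeping hyphen-joined parts together."""
    return TOKEN_RE.findall(text)


print(tokenize("Die E-Mail-Adresse vom Donau-Dampfschiff"))
# ['Die', 'E-Mail-Adresse', 'vom', 'Donau-Dampfschiff']
```

In this sketch a hyphen only joins when word characters follow it directly, so a suspended hyphen as in "Ein- und Ausgang" yields three tokens (Ein, und, Ausgang).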

Before tokenising, every em-dash and en-dash is replaced by a space, so these punctuation dashes act as word boundaries. That means Berlin—München splits into two words, while E-Mail-Adresse stays as one. Sentences are counted from terminal punctuation (. ! ? …), and the longest token is tracked so you can see your biggest compound.
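The pre-processing and sentence count described above might look like the following sketch. The function names are hypothetical, and treating a run of terminal marks such as "?!" as a single sentence end is an assumption rather than documented behaviour.

```python
import re

# En-dash (U+2013) and em-dash (U+2014); the ASCII hyphen is left alone.
DASHES = re.compile("[\u2013\u2014]")

# A run of terminal marks counts as one sentence end (assumption).
SENTENCE_END = re.compile(r"[.!?…]+")


def normalise_dashes(text: str) -> str:
    """Replace punctuation dashes with spaces so they act as word boundaries."""
    return DASHES.sub(" ", text)


def count_sentences(text: str) -> int:
    """Count sentence ends by counting runs of terminal punctuation."""
    return len(SENTENCE_END.findall(text))


print(normalise_dashes("Berlin\u2014München"))  # Berlin München
print(normalise_dashes("E-Mail-Adresse"))       # E-Mail-Adresse
print(count_sentences("Wirklich?! Ja. Na gut…"))  # 3
```

A purely punctuation-based count like this one will also treat the full stops in abbreviations such as "z.B." as sentence ends, so it is best read as an approximation.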

Example

For the text "Die Donaudampfschifffahrtsgesellschaft schickt eine E-Mail-Adresse. Berlin—München ist weit." the counter reports nine words: Donaudampfschifffahrtsgesellschaft is one long word, E-Mail-Adresse is one hyphenated compound, and Berlin—München is split into two, alongside Die, schickt, eine, ist, and weit. Two sentences are detected, and the longest word is the 34-letter Danube-steamship compound.
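The example can be checked with a short script that applies the rules from "How it works" end to end. It is a sketch under assumptions: analyse and its result keys are hypothetical names, unique words are compared case-insensitively, and the longest word's length counts letters only.

```python
import re

TOKEN_RE = re.compile(r"[A-Za-zÄÖÜäöüßẞ0-9]+(?:[-'][A-Za-zÄÖÜäöüßẞ0-9]+)*")


def analyse(text: str) -> dict:
    # Dashes become spaces first, so they act as word boundaries.
    cleaned = text.replace("\u2014", " ").replace("\u2013", " ")
    tokens = TOKEN_RE.findall(cleaned)
    longest = max(tokens, key=len, default="")
    return {
        "words": len(tokens),
        # Case-insensitive comparison is an assumption.
        "unique_words": len({t.lower() for t in tokens}),
        "sentences": len(re.findall(r"[.!?…]+", cleaned)),
        "longest": longest,
        # Letters only, so hyphens inside a compound are not counted.
        "longest_letters": sum(ch.isalpha() for ch in longest),
    }


text = ("Die Donaudampfschifffahrtsgesellschaft schickt eine "
        "E-Mail-Adresse. Berlin\u2014München ist weit.")
result = analyse(text)
print(result["words"], result["sentences"], result["longest_letters"])
# 9 2 34
```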

Notes

Use this when you localise English copy into German and need accurate counts for layout, subtitles, or character limits — German runs roughly 10-30% longer than English, and compound handling materially changes the totals.