Chinese Pinyin Annotator

Add pinyin ruby annotations above Simplified Chinese characters.

Free pinyin annotator: paste Simplified Chinese text and get tone-marked pinyin rendered as HTML ruby text above each matched character. Ideal for learners, teachers and subtitle work. Runs entirely in your browser.

What is pinyin and what do the tone marks mean?

Pinyin is the official romanisation of Mandarin Chinese. The diacritics show the four tones: flat (mā), rising (má), dipping (mǎ) and falling (mà). A syllable with no mark is read in the neutral tone. Tones change meaning, so they are essential for learners.

The Chinese Pinyin Annotator places the romanised reading of each Simplified Chinese character directly above it, using the same ruby layout found in learner textbooks and graded readers. Paste a sentence and instantly see how it should be pronounced, tone marks and all — a fast way to bridge the gap between recognising characters and saying them aloud.

How it works

Each character in your text is looked up in a built-in dictionary that maps it to its Hanyu Pinyin reading, complete with the diacritic that encodes the tone. The annotator wraps every matched character in an HTML <ruby> element with the pinyin inside an <rt> (ruby text) tag, which the browser positions above the base character. Characters that are not in the dictionary — and any non-Chinese text such as punctuation — pass through unchanged so the original layout is preserved.
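
The lookup-and-wrap step described above can be sketched in a few lines. This is a minimal illustration in Python, not the tool's own code: the three-entry PINYIN dictionary and the annotate function are hypothetical names, and the HTML escaping is an added safety assumption.

```python
from html import escape

# Hypothetical three-entry dictionary for illustration only;
# the tool's real built-in dictionary is much larger.
PINYIN = {"你": "nǐ", "好": "hǎo", "吗": "ma"}


def annotate(text: str, readings: dict[str, str] = PINYIN) -> str:
    """Wrap each character found in `readings` in <ruby>/<rt>; keep the rest."""
    parts = []
    for ch in text:
        reading = readings.get(ch)
        if reading is None:
            # Unmatched characters and punctuation pass through
            # (HTML-escaped so that "<" or "&" cannot break the markup).
            parts.append(escape(ch))
        else:
            parts.append(f"<ruby>{escape(ch)}<rt>{escape(reading)}</rt></ruby>")
    return "".join(parts)


print(annotate("你好吗？"))
```

Run on 你好吗？, the sketch wraps the three known characters in their own <ruby> elements and leaves the full-width question mark as plain text.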

The four Mandarin tones are written as ā á ǎ à over the main vowel of each syllable, following the standard tone-placement rules: a or e takes the mark if present, the o takes it in ou, and otherwise the last vowel does. A syllable with no diacritic is read in the neutral (light) tone.
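
As a rough illustration of the standard placement rules, the sketch below converts a plain syllable plus a tone number into tone-marked pinyin. The function name mark_tone and the tone-number input format are assumptions made for this example; the tool's dictionary already stores readings with their diacritics.

```python
# Tone-marked forms of each vowel, indexed by tone number minus one.
TONE_MARKS = {
    "a": "āáǎà", "e": "ēéěè", "i": "īíǐì",
    "o": "ōóǒò", "u": "ūúǔù", "ü": "ǖǘǚǜ",
}


def mark_tone(syllable: str, tone: int) -> str:
    """Return `syllable` with the mark for tones 1-4; other values mean neutral."""
    if tone not in (1, 2, 3, 4):
        return syllable  # neutral tone: no diacritic
    if "a" in syllable:
        target = syllable.index("a")  # rule 1: a always takes the mark
    elif "e" in syllable:
        target = syllable.index("e")  # rule 2: otherwise e takes it
    elif "ou" in syllable:
        target = syllable.index("o")  # rule 3: in "ou" the o takes it
    else:
        vowels = [i for i, c in enumerate(syllable) if c in TONE_MARKS]
        if not vowels:
            return syllable  # nothing to mark
        target = vowels[-1]  # rule 4: otherwise the last vowel
    marked = TONE_MARKS[syllable[target]][tone - 1]
    return syllable[:target] + marked + syllable[target + 1:]


for syllable, tone in [("ma", 3), ("gou", 3), ("gui", 4), ("liu", 2), ("ma", 5)]:
    print(syllable, tone, mark_tone(syllable, tone))
```

The last case, tone 5, stands for the neutral tone and comes back unmarked.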

Tips and notes

Pinyin annotation is most useful for short passages, vocabulary lists and subtitle lines where you want a pronunciation prompt without switching to a dictionary. Some characters are heteronyms with more than one reading: 长 can be cháng (long) or zhǎng (to grow). The tool always shows the most frequent reading, so double-check context-sensitive words. The dictionary covers the highest-frequency characters that make up the bulk of everyday text, and the coverage counter under the output tells you how many characters were matched, so you know when a manual reading is still needed. Everything runs locally in your browser, so private text never leaves your device.
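
A coverage count of this kind could be computed roughly as follows. This is a sketch under stated assumptions, not the tool's actual logic: the coverage function, the tiny READINGS dictionary and the choice to count only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) are all illustrative.

```python
# Hypothetical mini-dictionary holding one (most frequent) reading per character.
READINGS = {"我": "wǒ", "长": "cháng", "大": "dà"}


def coverage(text: str, readings: dict[str, str]) -> tuple[int, int]:
    """Return (matched, total) over the Chinese characters in `text`."""
    # Assumption: only the basic CJK Unified Ideographs block is counted.
    han = [ch for ch in text if "\u4e00" <= ch <= "\u9fff"]
    matched = sum(1 for ch in han if ch in readings)
    return matched, len(han)


matched, total = coverage("我长大了。", READINGS)
print(f"{matched} of {total} characters matched")  # prints "3 of 4 characters matched"

# A per-character lookup cannot see context: 长 comes back as cháng,
# although in 长大 (to grow up) it is read zhǎng.
print(READINGS["长"])
```

Here 了 is missing from the sample dictionary, so the count is 3 of 4, and the full stop is ignored because it is not a Chinese character. The second print shows why heteronyms still need a manual check.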