The C string literal escaper and unescaper converts arbitrary text into the exact content you can place between the quotation marks of a C or C++ string literal, and turns escaped literal content back into readable text. It is essential when embedding paths, JSON, multi-line messages or binary data in C source.
How it works
Escaping follows the C standard. The backslash and double quote are escaped to \\ and \" so they do not terminate or corrupt the literal. The familiar control characters map to named escapes: newline is \n, tab is \t, the alert/bell byte is \a, and so on. Any remaining byte below 32, plus DEL, is written as a hex escape \xHH. Characters outside ASCII are first encoded to their UTF-8 bytes, and each byte becomes its own \xHH, keeping the literal seven-bit clean.
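The escaping rules above can be sketched in C roughly as follows. This is a minimal illustration, not the tool's actual source: the names c_escape and c_escape_str, the caller-supplied buffer, and the -1 return on overflow are all assumptions made for the example. The input is assumed to be UTF-8 bytes already, and the sketch does not guard against a \xHH escape being followed by a literal hex digit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: escape `len` bytes of `src` (assumed to be UTF-8)
 * into `dst` as C string literal content, without surrounding quotes.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if `dst` is too small. */
static int c_escape(const char *src, size_t len, char *dst, size_t cap)
{
    static const char hex[] = "0123456789abcdef";
    size_t o = 0;

    assert(src != NULL && dst != NULL);
    if (cap == 0)
        return -1;

    for (size_t i = 0; i < len; i++) {
        unsigned char b = (unsigned char)src[i];
        char named = 0;

        switch (b) {                 /* backslash, quote, named controls */
        case '\\': named = '\\'; break;
        case '"':  named = '"';  break;
        case '\n': named = 'n';  break;
        case '\t': named = 't';  break;
        case '\r': named = 'r';  break;
        case '\a': named = 'a';  break;
        case '\b': named = 'b';  break;
        case '\f': named = 'f';  break;
        case '\v': named = 'v';  break;
        default:   break;
        }

        /* Other control bytes, DEL, and every non-ASCII byte use \xHH. */
        int as_hex = !named && (b < 0x20 || b >= 0x7f);
        size_t need = named ? 2 : (as_hex ? 4 : 1);
        if (o + need + 1 > cap)      /* keep room for the final NUL */
            return -1;

        if (named) {
            dst[o++] = '\\';
            dst[o++] = named;
        } else if (as_hex) {
            dst[o++] = '\\';
            dst[o++] = 'x';
            dst[o++] = hex[b >> 4];
            dst[o++] = hex[b & 0x0f];
        } else {
            dst[o++] = (char)b;
        }
    }
    dst[o] = '\0';
    return (int)o;
}

/* Convenience wrapper (also hypothetical) for NUL-terminated text. */
static int c_escape_str(const char *text, char *dst, size_t cap)
{
    return c_escape(text, strlen(text), dst, cap);
}
```

Sizing the output buffer at four times the input length plus one always suffices, since no byte expands to more than four characters.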
Unescaping is the reverse and is byte oriented so it can reconstruct UTF-8 correctly. It recognises the named escapes, hex escapes \xH... (greedy on hex digits, as the C standard specifies), and octal escapes of up to three digits such as \101. Unknown escapes drop the backslash and keep the following character. The collected bytes are decoded as UTF-8 at the end.
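A byte-oriented unescaper along these lines might look like the sketch below. The function name c_unescape, the buffer interface, and the choice to keep only the low eight bits of an oversized hex or octal value are assumptions for illustration; the description above does not say how the tool treats overflow. The result is raw bytes, so any UTF-8 decoding is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if `c` is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: decode escaped literal content in `src` into raw
 * bytes in `dst`. Returns the byte count (the output may contain NULs and
 * is not NUL-terminated), or -1 if `dst` is too small. */
static int c_unescape(const char *src, unsigned char *dst, size_t cap)
{
    static const char names[]  = "ntrabfv";
    static const char values[] = "\n\t\r\a\b\f\v";
    size_t o = 0;

    assert(src != NULL && dst != NULL);
    for (const char *p = src; *p != '\0'; ) {
        unsigned value;

        if (*p != '\\' || p[1] == '\0') {
            value = (unsigned char)*p++;       /* ordinary byte, or a lone trailing backslash */
        } else {
            char e = p[1];                     /* character after the backslash */
            p += 2;
            if (e == 'x' && hex_value(*p) >= 0) {
                value = 0;                     /* greedy: consume every hex digit */
                while (hex_value(*p) >= 0)
                    value = (value << 4) | (unsigned)hex_value(*p++);
            } else if (e >= '0' && e <= '7') {
                value = (unsigned)(e - '0');   /* octal: at most three digits */
                for (int n = 1; n < 3 && *p >= '0' && *p <= '7'; n++)
                    value = (value << 3) | (unsigned)(*p++ - '0');
            } else {
                const char *hit = strchr(names, e);
                /* Named escape, otherwise drop the backslash and keep `e`. */
                value = hit ? (unsigned char)values[hit - names]
                            : (unsigned char)e;
            }
        }

        if (o >= cap)
            return -1;
        dst[o++] = (unsigned char)(value & 0xffu);  /* assumption: keep the low byte */
    }
    return (int)o;
}
```

Because \\ and \" fall through the "unknown escape" branch, they decode to a backslash and a quote without needing their own cases.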
Example
Escaping a Windows path with an embedded newline and quotes produces:
Path: C:\\dir\nLine \"two\"
Note that the single backslash in C:\dir becomes \\, the newline becomes \n, and the inner quotes become \". Unescaping that content restores the original text exactly.
Notes
- Hex escapes in C are greedy: \x41 is the letter A, but \x4142 would be one (overflowing) value, so prefer octal or splitting strings when ambiguity matters.
- The output deliberately omits the surrounding quotes so you control the literal in your own code.
- A non-ASCII character such as é is emitted as its two UTF-8 bytes \xc3\xa9.
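The two workarounds for greedy hex escapes can be illustrated with a short C fragment. The identifiers split_hex, octal_form and check_literals are made up for this example. It relies on two standard behaviours: adjacent string literals are concatenated only after each piece's escapes have been processed, and an octal escape never consumes more than three digits.

```c
#include <assert.h>
#include <string.h>

/* Writing "\x4142" would ask for a single out-of-range character.
 * Splitting the literal ends the hex escape at the first closing quote. */
static const char split_hex[] = "\x41" "42";   /* bytes: 'A', '4', '2' */

/* An octal escape stops after three digits, so no split is needed. */
static const char octal_form[] = "\10142";     /* bytes: 'A', '4', '2' */

/* Hypothetical self-check confirming both spellings give the same text. */
static void check_literals(void)
{
    assert(sizeof split_hex == 4);             /* three bytes plus the NUL */
    assert(strcmp(split_hex, "A42") == 0);
    assert(strcmp(octal_form, split_hex) == 0);
}
```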