POSIX defines two regular-expression dialects — Basic (BRE) and Extended (ERE) — and the difference between them trips up almost everyone who moves between grep, sed and awk. The same pattern can mean completely different things depending on which flavor a tool uses by default. This reference puts the two dialects side by side, lists the standard bracket character classes, and tells you which tool speaks which dialect.
How it works
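In a Basic Regular Expression, grouping and counting are written with a
leading backslash: a group is \( \) and a bound is \{m,n\}. Without the
backslash those characters are literal. Strict POSIX BRE defines no
alternation operator; \| works only as an extension (in GNU tools, for
example). In an Extended Regular Expression the constructs are written
bare, as ( ), | and {m,n}, and a backslash makes them literal instead. ERE
also adds the + and ? repetition operators, which are ordinary characters
in BRE. The dot ., the star * and bracket expressions [...] behave the same
in both dialects, except that a * at the very start of a BRE is literal.
The anchors ^ and $ are guaranteed to be special only at the start and end
of a BRE, but are special anywhere in an ERE.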
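As a minimal sketch of the contrast described above, the following shell
snippet feeds the same two sample lines to grep in each dialect. It assumes
a POSIX sh and a standard grep; the function names sample_lines, match_bre
and match_ere are made up for this illustration.

```shell
#!/bin/sh
# Two sample lines: one that the repetition matches, and one that contains
# the pattern's own characters literally.
sample_lines() {
    printf '%s\n' 'abab' '(ab){2}'
}

# Print the sample lines matched by a pattern read as a BRE (grep's default).
match_bre() {
    sample_lines | grep -e "$1"
}

# Print the sample lines matched by the same pattern read as an ERE.
match_ere() {
    sample_lines | grep -E -e "$1"
}

match_bre '\(ab\)\{2\}'   # group and bound in BRE; prints: abab
match_ere '(ab){2}'       # group and bound in ERE; prints: abab
match_bre '(ab){2}'       # all literal in BRE;     prints: (ab){2}
match_ere '\(ab\)'        # literal parens in ERE;  prints: (ab){2}
```

The last two calls show the trap: a pattern copied between dialects without
adjusting the backslashes still runs, but it matches different text.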
POSIX defines no backslash shorthand classes such as \d, \w or \s. Instead
it provides named character classes like [:alpha:], [:digit:] and
[:space:], which are valid only inside a bracket expression, so a digit is
[[:digit:]] and a letter or underscore is [[:alpha:]_]. The named classes
follow the current locale, so they stay correct in locales and character
sets where a range such as [a-z] would match the wrong characters.
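The bracket classes can be tried out with a short sketch like the one
below. It assumes a POSIX sh with grep and sed; the helper names
is_all_digits, is_identifier and squeeze_space are hypothetical and exist
only for this example. Every pattern is a BRE, so "one or more" is spelled
by repeating the bracket expression and adding a star.

```shell
#!/bin/sh
# Succeed when the argument consists of one or more digits and nothing else.
is_all_digits() {
    printf '%s\n' "$1" | grep -q -e '^[[:digit:]][[:digit:]]*$'
}

# Succeed when the argument looks like an identifier: a letter or
# underscore, followed by any number of letters, digits or underscores.
# A named class can share its bracket expression with other characters.
is_identifier() {
    printf '%s\n' "$1" | grep -q -e '^[[:alpha:]_][[:alnum:]_]*$'
}

# Replace every run of whitespace (spaces, tabs, ...) with a single space.
squeeze_space() {
    printf '%s\n' "$1" | sed -e 's/[[:space:]][[:space:]]*/ /g'
}

is_all_digits '2024' && echo 'digits only'    # prints: digits only
is_identifier '_tmp1' && echo 'identifier'    # prints: identifier
squeeze_space 'a    b'                        # prints: a b
```

Note the doubled brackets throughout: the outer pair opens the bracket
expression and the inner [:name:] selects the class.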
Tips and notes
Remember the tool defaults: grep and sed use BRE unless you pass -E, while
awk uses ERE; egrep is the legacy spelling of grep -E. GNU adds convenience
extensions (such as \+, \? and \| in BRE) that are not portable to other
Unixes, so a script that must run anywhere should stick to the strict POSIX
forms shown in the table. For instance, \{1,\} and \{0,1\} are the portable
BRE equivalents of \+ and \?.
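To make the defaults concrete, here is a small sketch that runs one
"optional u" match through each tool in its default dialect, using only
portable forms. It assumes a POSIX sh with grep, sed and awk; the function
names words, via_grep, via_grep_E, via_sed and via_awk are invented for
this illustration, and via_sed assumes the pattern contains no slash.

```shell
#!/bin/sh
# Sample input, one word per line.
words() {
    printf '%s\n' 'color' 'colour' 'colouur' 'colou?r'
}

via_grep()   { words | grep -e "$1"; }              # BRE by default
via_grep_E() { words | grep -E -e "$1"; }           # ERE with -E
via_sed()    { words | sed -n -e "/$1/p"; }         # BRE by default
via_awk()    { words | awk -v re="$1" '$0 ~ re'; }  # always ERE

# BRE tools need the bound \{0,1\}; each call prints color and colour.
via_grep '^colou\{0,1\}r$'
via_sed  '^colou\{0,1\}r$'

# ERE tools take the bare ?; each call prints color and colour.
via_grep_E '^colou?r$'
via_awk    '^colou?r$'

# The ERE spelling given to a BRE tool: ? is literal, so this prints colou?r.
via_grep '^colou?r$'
```

The final call is the failure mode a portable script has to watch for: no
error is reported, the pattern simply matches something else.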