Lookaround assertions
Lookahead and lookbehind are zero-width assertions: they check whether a pattern matches just after (lookahead) or just before (lookbehind) the current position without consuming any characters. They let you match a position defined by its context — for example, a digit that is followed by a currency symbol, or a word not preceded by a backslash. This reference gives the syntax for each of the four forms across PCRE, JavaScript, Python and .NET.
How it works
There are four assertions, built from two axes — direction and polarity:
(?=...) positive lookahead — next text DOES match
(?!...) negative lookahead — next text does NOT match
(?<=...) positive lookbehind — preceding text DOES match
(?<!...) negative lookbehind — preceding text does NOT match
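As a rough illustration of the four forms, the sketch below uses Python's built-in re module on a made-up sample string. The sample text and variable names are assumptions chosen for this example only.

```python
import re

# Hypothetical sample: a digit before "$", a bare digit,
# a backslash-escaped word and a plain word.
text = r"5$ 7 \alpha beta"

# (?=...)  positive lookahead: a digit that IS followed by "$"
digit_before_dollar = re.findall(r"\d(?=\$)", text)

# (?!...)  negative lookahead: a digit that is NOT followed by "$"
digit_not_before_dollar = re.findall(r"\d(?!\$)", text)

# (?<=...) positive lookbehind: a word that IS preceded by a backslash
escaped_word = re.findall(r"(?<=\\)[a-z]+", text)

# (?<!...) negative lookbehind: a word that is NOT preceded by a backslash
unescaped_word = re.findall(r"(?<!\\)\b[a-z]+", text)

print(digit_before_dollar, digit_not_before_dollar, escaped_word, unescaped_word)
```

In each case the "$" or backslash is inspected but never becomes part of the returned match.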
Because they are zero-width, the engine evaluates the inner pattern, records
success or failure, then resumes at the original position. A classic use is
matching an empty position purely by its context, e.g. inserting thousands
separators by replacing each match of (?<=\d)(?=(\d{3})+$) with a comma.
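The zero-width behaviour and the thousands-separator pattern can be sketched in Python as follows. The helper name add_thousands is hypothetical, and the function assumes its input is a plain string of digits.

```python
import re

# A lookahead is checked but not consumed: the match stops before "bar".
m = re.search(r"foo(?=bar)", "foobar")
matched_text = m.group()   # only "foo" is part of the match
match_end = m.end()        # the position just after "foo"


def add_thousands(digits: str) -> str:
    """Insert commas at every empty position that has a digit behind it
    and a whole number of three-digit groups ahead of it, up to the end."""
    return re.sub(r"(?<=\d)(?=(\d{3})+$)", ",", digits)


print(add_thousands("1234567"))
```

The pattern itself matches no characters, so the substitution only inserts text and never removes any digits.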
Tips and notes
- The four tokens are written identically in PCRE, JavaScript, Python and .NET; note that JavaScript only gained lookbehind in ES2018.
- Lookbehind width is the main portability trap: keep it fixed-width unless you
know the engine allows variable width. .NET, JavaScript and Python's third-party
regex module do; Python's built-in re does not, and PCRE2 support is restricted
and version-dependent.
- Lookarounds do not capture by default; add inner groups only if you need the inspected text.
- Chain assertions at one position: (?=.*\d)(?=.*[a-z]) is the standard password-policy idiom.
- A negative lookahead at the end, like foo(?!bar), is the readable way to say “foo not immediately followed by bar”.
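The tips above can be tried out with a short sketch in Python's built-in re module. All sample strings and variable names here are assumptions for illustration, and other engines may behave differently, particularly for the lookbehind check.

```python
import re

# Fixed-width rule: the built-in re module rejects a variable-width
# lookbehind when the pattern is compiled.
try:
    re.compile(r"(?<=a+)b")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

# Lookarounds do not capture unless you add an inner group.
no_groups = re.search(r"\d+(?=\$)", "42$").groups()
with_group = re.search(r"\d+(?=(\$|€))", "42€").group(1)

# Chained lookaheads at one position: both conditions must hold.
policy = re.compile(r"(?=.*\d)(?=.*[a-z])")
accepts_mixed = bool(policy.match("abc123"))
accepts_letters_only = bool(policy.match("abcdef"))

# Negative lookahead: "foo" not immediately followed by "bar".
starts = [m.start() for m in re.finditer(r"foo(?!bar)", "foobar foobaz foo")]

print(variable_width_ok, no_groups, with_group,
      accepts_mixed, accepts_letters_only, starts)
```

A real password policy would normally add a length requirement and further character classes; the two-lookahead pattern is kept as written in the tip.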