XPath is the expression language for addressing parts of an XML (or HTML) document. It underpins XSLT and XQuery and is one of the locator languages accepted by browser automation tools such as Selenium. A location path is a sequence of steps, each of which navigates along an axis from a context node and may be filtered by predicates. This reference covers the axes, syntax and functions you reach for most often.
How it works
A location path is a series of steps separated by /. Each step has the form
axis::node-test[predicate]. Starting from a context node, the engine moves along
the named axis (e.g. child, descendant, following-sibling), keeps nodes
matching the node test (an element name, *, text(), node()), then filters
with any predicates. Each step's result set is the context for the next step, so
steps compose to walk the tree precisely. Abbreviations make common paths short:
// is /descendant-or-self::node()/, @ is attribute::, . is the context node,
.. is the parent, and a step with no axis defaults to child::.
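The step-by-step evaluation described above can be sketched in a few lines of Python. This is an illustrative toy, not how any real engine is implemented: the sample document and the helpers apply_step and evaluate are hypothetical, only the child and descendant-or-self axes are handled, and only element nodes are modelled (ElementTree has no separate text or attribute nodes). The result is compared with ElementTree's built-in findall, which implements a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the reference text.
UL = ET.fromstring(
    "<ul>"
    "<li class='a'>one</li>"
    "<li class='b'>two</li>"
    "<li class='a'>three</li>"
    "</ul>"
)


def apply_step(context_nodes, axis, name_test, predicate=None):
    """Apply one step (axis::name-test[predicate]) to each context node."""
    selected = []
    seen = set()
    for node in context_nodes:
        if axis == "child":
            candidates = list(node)
        elif axis == "descendant-or-self":
            candidates = list(node.iter())  # the node itself, then its descendants
        else:
            raise ValueError(f"axis not covered by this sketch: {axis}")
        for cand in candidates:
            if name_test != "*" and cand.tag != name_test:
                continue  # fails the node test
            if predicate is not None and not predicate(cand):
                continue  # fails the predicate
            if id(cand) not in seen:  # a node set holds each node only once
                seen.add(id(cand))
                selected.append(cand)
    return selected


def evaluate(context_node, steps):
    """Feed the result of each step in as the context of the next one."""
    nodes = [context_node]
    for axis, name_test, predicate in steps:
        nodes = apply_step(nodes, axis, name_test, predicate)
    return nodes


# .//li[@class='a'] written out as explicit steps (element nodes only).
steps = [
    ("descendant-or-self", "*", None),
    ("child", "li", lambda el: el.get("class") == "a"),
]
manual = evaluate(UL, steps)
builtin = UL.findall(".//li[@class='a']")
```

Here manual and builtin select the same two li elements. Positional predicates such as [1] are deliberately left out of the sketch, since they depend on each candidate's position within its own context.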
Worked examples
//div[@class='item']          every div whose class is exactly "item", at any depth
//ul/li[1]                    the first li child of each ul
//a[contains(@href,'pdf')]    anchors whose href contains "pdf"
//input[@type='text']/..      the parent of each text input
count(//tr)                   number of tr elements (table rows) in the document
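To try these expressions without installing anything, one option is Python's standard xml.etree.ElementTree, with two caveats that are assumptions of this sketch rather than part of the reference: ElementTree implements only a subset of XPath, so contains() and count() are emulated in plain Python here, and paths evaluated on an element must be relative, hence the leading dot. The sample markup is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup used only for illustration.
HTML = """
<html><body>
  <div class="item"><ul><li>a</li><li>b</li></ul></div>
  <div class="other"><ul><li>c</li></ul></div>
  <a href="/files/report.pdf">report</a>
  <a href="/index.html">home</a>
  <form><input type="text" name="q"/><input type="submit"/></form>
  <table><tr><td>1</td></tr><tr><td>2</td></tr></table>
</body></html>
"""
root = ET.fromstring(HTML)

# //div[@class='item']
items = root.findall(".//div[@class='item']")

# //ul/li[1]  (positions are 1-based)
first_lis = root.findall(".//ul/li[1]")

# //a[contains(@href,'pdf')]  (contains() is not supported by ElementTree,
# so the substring test is done in Python on anchors that have an href)
pdf_links = [a for a in root.findall(".//a[@href]") if "pdf" in a.get("href")]

# //input[@type='text']/..
parents = root.findall(".//input[@type='text']/..")

# count(//tr)  (count() is emulated with len())
row_count = len(root.findall(".//tr"))
```

A full XPath 1.0 implementation would accept the expressions exactly as listed, including contains() and count().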
Notes and tips
XPath indexes from 1, not 0: [1] is the first node. A bare text() selects an
element's text-node children (there may be several), while string(.) returns the
concatenated string value of the element, including descendant text. Prefer
normalize-space() when matching text: it trims leading and trailing whitespace
and collapses internal runs to a single space. XPath 1.0 is what browsers and
Selenium use; reach for a processor such as Saxon when you need 2.0+ features
like matches(), tokenize() and sequences.
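The difference between a text node, an element's string value and its normalized form can be approximated in Python. This is a rough emulation under stated assumptions, not XPath itself: ElementTree exposes the first text child as .text, itertext() stands in for string(.), and str.split() treats a slightly wider set of characters as whitespace than normalize-space() does. The sample fragments are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment with mixed content and untidy whitespace.
p = ET.fromstring("<p>  Hello <b>big</b>\n  world  </p>")

# Roughly text()[1]: only the first text-node child of p.
first_text_node = p.text

# Roughly string(.): all descendant text, concatenated in document order.
string_value = "".join(p.itertext())

# Roughly normalize-space(.): trim the ends and collapse internal runs.
normalized = " ".join(string_value.split())

# Positions are 1-based: li[1] is the first li, not the second.
ul = ET.fromstring("<ul><li>x</li><li>y</li></ul>")
first_li = ul.find("li[1]")
```

Comparing against normalized rather than first_text_node is what makes a text match robust to indentation and line breaks in the source markup.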