What is the difference between / and // in XPath?

A single slash / steps to the immediate children of the context node, so /html/body selects body only if it is a direct child of the html root element. A double slash // is shorthand for /descendant-or-self::node()/, so //div selects div elements at any depth in the document, and .//div selects them at any depth below the context node.
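
The difference can be seen in a minimal sketch using Python's standard-library xml.etree.ElementTree. This module implements only a small subset of XPath 1.0 and evaluates paths relative to an element, which is why the paths below start with a dot. The sample markup and the id values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
doc = ET.fromstring(
    "<html><body>"
    "<div id='outer'><div id='inner'/></div>"
    "<p>text</p>"
    "</body></html>"
)

body = doc.find("body")  # one child step down from <html>

# Single slash: only div elements that are direct children of <body>.
direct = [d.get("id") for d in body.findall("./div")]

# Double slash: div elements at any depth below <body>.
any_depth = [d.get("id") for d in body.findall(".//div")]

print(direct)     # ['outer']
print(any_depth)  # ['outer', 'inner']
```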

How do XPath predicates work?

A predicate in square brackets filters the node-set at a step. //li[1] selects the first li within each parent (XPath indexes from 1, not 0), //a[@href] selects anchors that have an href attribute, and //p[contains(text(),'hi')] selects paragraphs whose text contains "hi" (in XPath 1.0, only the first text-node child is tested).
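
As a rough illustration, the positional and attribute predicates can be tried with Python's xml.etree.ElementTree, which supports [n] and [@attr] but not functions such as contains(). The markup below is a made-up sample.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, invented for this illustration.
root = ET.fromstring(
    "<div>"
    "<ul><li>a</li><li>b</li></ul>"
    "<ul><li>c</li></ul>"
    "<a href='/x'>link</a><a>no link</a>"
    "</div>"
)

# [1] is evaluated per parent, so each <ul> contributes its first <li>.
first_items = [li.text for li in root.findall(".//li[1]")]

# [@href] keeps only anchors that carry an href attribute.
linked = [a.text for a in root.findall(".//a[@href]")]

print(first_items)  # ['a', 'c']
print(linked)       # ['link']
```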

What is an XPath axis?

An axis defines the direction of navigation from the context node, such as child, parent, ancestor, descendant, following-sibling or attribute. The full syntax is axis::node-test; for example, ancestor::div selects all ancestor div elements. The common axes have abbreviations, such as @ for attribute:: and // for /descendant-or-self::node()/.

What changed in XPath 2.0?

XPath 2.0 adds a richer type system aligned with XML Schema, replaces node-sets with sequences, and introduces the for/if/some/every expressions, many new functions (string, date, regex via matches/replace/tokenize), and the intersect and except operators alongside union (which XPath 1.0 already offers as |). XPath 1.0 remains common in browsers and tools like Selenium.

Does the browser support XPath?

Yes, browsers expose XPath 1.0 through document.evaluate. Many automation tools (Selenium, Playwright via the xpath= engine) also use XPath 1.0. Full XPath 2.0/3.1 support generally requires a dedicated XSLT/XQuery processor such as Saxon.

What is the XPath Reference?

A searchable XPath reference covering node-selection axes (child, descendant, ancestor, following-sibling), predicates, location-path syntax and the most-used built-in functions across XPath 1.0 and 2.0. It runs for free in your browser on Gera Tools, with nothing uploaded.

XPath Reference — Gera Tools

Name: XPath Reference
Creator: Gera Tools
License: https://creativecommons.org/licenses/by/4.0/

XPath is the expression language for addressing parts of an XML (or HTML) document. It is the backbone of XSLT and XQuery and is widely used for element selection in browsers and automation tools. A location path is a sequence of steps, each navigating an axis from a context node and optionally filtered by predicates. This reference lists the axes, syntax and the functions you reach for most often.

How it works

A location path is a series of steps separated by /. Each step has the form axis::node-test[predicate]. Starting from a context node, the engine moves along the named axis (e.g. child, descendant, following-sibling), keeps nodes matching the node test (an element name, *, text() or node()), then filters with any predicates. Steps compose to walk the tree precisely. Abbreviations make common paths short: // is /descendant-or-self::node()/, @ is attribute::, . is the context node (self::node()) and .. is the parent (parent::node()).

The axes at a glance

Each axis defines a direction of travel from the current context node. The most commonly used ones are:

Axis	What it selects
`child`	Direct child nodes (the default axis; `child::p` and `p` are equivalent)
`descendant`	All descendants, any depth
`descendant-or-self`	Descendants plus the context node itself (`//` is short for `/descendant-or-self::node()/`)
`parent`	The single immediate parent
`ancestor`	All ancestors up to the root
`following-sibling`	Siblings that appear after the context node
`preceding-sibling`	Siblings that appear before it
`attribute`	Attributes of the context element (abbreviated `@`)
`self`	The context node itself (abbreviated `.`)
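
To make the directions concrete, here is a rough sketch that emulates three of these axes in plain Python. ElementTree does not accept the axis:: syntax, so the helper functions (ancestors, following_siblings, preceding_siblings) and the parent_of lookup are hypothetical names written for this example, not part of any XPath API.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample tree, invented for this illustration.
root = ET.fromstring(
    "<html><body><div><p id='a'/><p id='b'/><p id='c'/></div></body></html>"
)

# ElementTree elements keep no parent pointer, so build a lookup table first.
parent_of = {child: parent for parent in root.iter() for child in parent}

def ancestors(node):
    """Rough analogue of ancestor::*, nearest ancestor first."""
    while node in parent_of:
        node = parent_of[node]
        yield node

def following_siblings(node):
    """Rough analogue of following-sibling::*."""
    siblings = list(parent_of[node])
    return siblings[siblings.index(node) + 1:]

def preceding_siblings(node):
    """Rough analogue of preceding-sibling::*, in document order."""
    siblings = list(parent_of[node])
    return siblings[:siblings.index(node)]

context = root.find(".//p[@id='b']")
print([n.tag for n in ancestors(context)])                 # ['div', 'body', 'html']
print([n.get("id") for n in following_siblings(context)])  # ['c']
print([n.get("id") for n in preceding_siblings(context)])  # ['a']
```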

Worked examples

//div[@class='item']           every div whose class attribute is exactly "item"
//ul/li[1]                     the first li child of each ul element
//a[contains(@href,'pdf')]     anchors whose href contains the string "pdf"
//input[@type='text']/..       the parent element of each text input
count(//tr)                    number of tr elements in the whole document
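//p[not(contains(@class,'hidden'))]    paragraphs whose class attribute does not contain the substring "hidden"
//table[2]//td[last()]         last td of each row in every table that is the second table child of its parent; use (//table)[2] for the second table in the document
//*[@id='main']/following-sibling::*   all later sibling elements of the one with id="main"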
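
A few of these expressions can be tried with Python's xml.etree.ElementTree, as a sketch only: the module covers a limited XPath subset, needs a leading dot, and has no count() or contains(), so the count is taken from the result list instead. The page markup is a hypothetical sample.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment, invented for this illustration.
page = ET.fromstring(
    "<body>"
    "<div class='item'>one</div>"
    "<div class='item big'>two</div>"
    "<form><input type='text'/><input type='submit'/></form>"
    "<table><tr><td>1</td></tr><tr><td>2</td></tr></table>"
    "</body>"
)

# //div[@class='item'] is an exact match, so class='item big' is excluded.
items = [d.text for d in page.findall(".//div[@class='item']")]

# //input[@type='text']/.. steps up to the parent of each text input.
parents = [p.tag for p in page.findall(".//input[@type='text']/..")]

# count(//tr): ElementTree has no count(), so measure the result list.
row_count = len(page.findall(".//tr"))

print(items)      # ['one']
print(parents)    # ['form']
print(row_count)  # 2
```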

Predicates in depth

Predicates filter within square brackets. They can test attributes, position, text content, or any XPath expression that evaluates to a boolean or number. A few patterns worth memorising:

[1] and [last()]: positional; XPath counts from 1, not 0.
[@attr]: tests that the attribute exists, regardless of value.
[@attr='value']: exact attribute match.
[contains(text(),'word')]: substring match on the first text-node child in XPath 1.0; use contains(.,'word') to match against the element's full string value.
[normalize-space()='exact phrase']: strips leading and trailing whitespace and collapses internal runs to a single space before matching.
[position() mod 2 = 0]: even-positioned nodes (for striped table rows).

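Predicates can be chained and are applied left to right: //tr[@class='row'][position() > 2] first keeps the tr elements with class "row", then drops the first two of those. The position is counted per parent among the rows that passed the first predicate, not among all tr elements.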
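
The left-to-right evaluation of //tr[@class='row'][position() > 2] can be emulated in Python as a two-stage filter. This is an emulation rather than a direct query, because ElementTree does not support position() comparisons; the table markup is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical table, invented for this illustration.
table = ET.fromstring(
    "<table>"
    "<tr class='head'><td>h</td></tr>"
    "<tr class='row'><td>r1</td></tr>"
    "<tr class='row'><td>r2</td></tr>"
    "<tr class='row'><td>r3</td></tr>"
    "<tr class='row'><td>r4</td></tr>"
    "</table>"
)

# First predicate: keep only the rows with class='row'.
rows = table.findall("tr[@class='row']")

# Second predicate: position() > 2, counted within the filtered list.
# XPath positions are 1-based, so this is a slice from index 2.
kept = rows[2:]

kept_text = [tr.find("td").text for tr in kept]
print(kept_text)  # ['r3', 'r4']
```

The header row never counts toward the position, because it was already removed by the first predicate.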

Notes and tips

XPath indexes from 1, not 0, so [1] is the first node. A bare text() selects the text-node children of the context node, while string(.) returns the concatenated string value of an element. Prefer normalize-space() when matching text, since it strips surrounding whitespace and collapses internal runs. XPath 1.0 is what browsers and Selenium use; reach for Saxon when you need 2.0+ features like matches(), tokenize() and sequences.
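
The distinction between a text node and an element's string value has a close analogue in Python's ElementTree, sketched below. These are Python approximations, not XPath calls: .text stands in for the first text() node, joining itertext() for string(.), and a split/join for normalize-space() (Python's split treats a slightly wider set of characters as whitespace).

```python
import xml.etree.ElementTree as ET

# Hypothetical paragraph with mixed content and stray whitespace.
p = ET.fromstring("<p>  Hello <b>big</b>  world </p>")

# Roughly the first text() node: only the text before the first child.
first_text = p.text

# Roughly string(.): all descendant text concatenated in document order.
string_value = "".join(p.itertext())

# Roughly normalize-space(.): trim the ends and collapse internal runs.
normalized = " ".join(string_value.split())

print(repr(first_text))    # '  Hello '
print(repr(string_value))  # '  Hello big  world '
print(repr(normalized))    # 'Hello big world'
```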

When targeting HTML with browser automation (Playwright, Selenium), // paths starting from the document root can be slow on large pages because the engine may have to visit every node. Anchor your path to a known ID or landmark first (//div[@id='results']//li is typically faster than //li) and prefer CSS selectors for simple class or ID lookups, saving XPath for cases where you genuinely need an axis (like selecting a parent) or a function (like contains).
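
A small sketch of the anchoring idea, again with Python's ElementTree and made-up markup. It shows only how scoping narrows the result set; it does not measure speed, and real browser engines may optimise these queries differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment, invented for this illustration.
page = ET.fromstring(
    "<body>"
    "<ul id='nav'><li>Home</li><li>About</li></ul>"
    "<div id='results'><ul><li>first hit</li><li>second hit</li></ul></div>"
    "</body>"
)

# Unanchored: every li in the document, navigation items included.
all_items = [li.text for li in page.findall(".//li")]

# Anchored: locate the landmark once, then search only inside it.
results = page.find(".//div[@id='results']")
scoped_items = [li.text for li in results.findall(".//li")]

print(all_items)     # ['Home', 'About', 'first hit', 'second hit']
print(scoped_items)  # ['first hit', 'second hit']
```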