As in this Stack Overflow answer, imagine that you need to select a particular table and then all of its rows. Due to the permissiveness of HTML, all three of the following are legal markup:
<table id="foo"><tr>...</tr></table>
<table id="foo"><tbody><tr>...</tr></tbody></table>
<table id="foo"><tr>...</tr><tbody><tr>...</tr></tbody></table>
You are worried about tables nested in tables, and so don't want to use an XPath like this:
table[@id="foo"]//tr
If you could specify your desired XPath as a regex, it might look something like:
table[@id="foo"](/tbody)?/tr
In general, how can you specify an XPath expression that allows an optional element in the hierarchy of a selector?
To be clear, I'm not trying to solve a real-world problem or select a specific element of a specific document. I'm asking for techniques to solve a class of problems.
I don't see why you can't use this:
//table[@id='foo']/tr|//table[@id='foo']/tbody/tr
If you want one expression without node set union:
//tr[(.|parent::tbody)[1]/parent::table[@id='foo']]
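The union approach above can be tried out with a small script. This is only a sketch under some assumptions: the sample document and the helper name rows_with_optional_tbody are made up for illustration, and Python's standard-library ElementTree is used, which supports only a subset of XPath and has no | operator, so the two branches are run separately and concatenated (the combined list is therefore grouped by branch rather than in document order).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the target table has a direct row, a tbody row,
# and a nested table whose row must NOT be selected.
DOC = """
<html><body>
<table id="foo">
  <tr id="direct"><td>
    <table id="inner"><tr id="nested"><td>x</td></tr></table>
  </td></tr>
  <tbody>
    <tr id="in-tbody"><td>y</td></tr>
  </tbody>
</table>
</body></html>
"""


def rows_with_optional_tbody(root, table_id):
    """Return rows that are children of the table or of its tbody."""
    base = f".//table[@id='{table_id}']"
    # ElementTree has no '|' operator, so emulate the union by running
    # both branches and concatenating the results.
    return root.findall(base + "/tr") + root.findall(base + "/tbody/tr")


root = ET.fromstring(DOC)
row_ids = [tr.get("id") for tr in rows_with_optional_tbody(root, "foo")]
print(row_ids)
```

The row inside the nested table is skipped because both branches use only child steps below the matched table, never the descendant axis.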
In XPath 2.0, the optional step can be expressed as (tbody|.):
//table[@id="foo"]/(tbody|.)/tr
XPathTester.com demo
The pipe (|) denotes the union of two node-sets; the dot (.) denotes the identity step, which returns just what the previous step selected.
This can be expanded to include more optional elements at once:
//table[@id="foo"]/(thead|tbody|tfoot|.)/tr
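XPath 1.0 does not allow a parenthesized step in the middle of a path, so on a 1.0-only engine the same selection has to be spelled out as a union of full paths. A minimal sketch of that expansion follows; the helper name expand_optional_step is hypothetical, and it only builds the expression string without evaluating it.

```python
def expand_optional_step(prefix, optional_names, leaf):
    """Build an XPath 1.0 union equivalent to prefix/(a|b|.)/leaf."""
    # The '.' alternative corresponds to skipping the optional element.
    branches = [f"{prefix}/{leaf}"]
    # Each named alternative becomes its own full path.
    branches += [f"{prefix}/{name}/{leaf}" for name in optional_names]
    return " | ".join(branches)


expr = expand_optional_step('//table[@id="foo"]', ["thead", "tbody", "tfoot"], "tr")
print(expr)
```

The number of branches grows with each additional optional level, which is the main reason the XPath 2.0 form is more convenient when it is available.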
Use:
//table[@id="foo"]/*[self::tbody or self::thead or self::tfoot]/tr
|
//table[@id="foo"]/tr
This selects any tr element that is a child of a tbody, thead, or tfoot that is itself a child of the table whose id attribute is "foo", or any tr element that is a direct child of that table.