151

I'm trying to learn XPath. I looked at the other contains() examples around here, but found none that combine it with the and operator. I can't get this to work:

//ul[@class='featureList' and contains(li, 'Model')]

On:

...
<ul class="featureList">

<li><b>Type:</b> Clip Fan</li><li><b>Feature:</b> Air Moved: 65 ft.
    Amps: 1.1
    Clip: Grips any surface up to 1.63"
    Plug: 3 prong grounded plug on heavy duty model
    Usage: Garage, Workshop, Dorm, Work-out room, Deck, Office & more.</li><li><b>Speed Setting:</b> 2 speeds</li><li><b>Color:</b> Black</li><li><b>Power Consumption:</b> 62 W</li><li><b>Height:</b> 14.5"</li><li><b>Width:</b> Grill Diameter: 9.5"</li><li><b>Length:</b> 11.5"</li>

<li><b>Model #: </b>CR1-0081-06</li>
<li><b>Item #: </b>N82E16896817007</li>
<li><b>Return Policy: </b></li>
</ul>
...
Yuri
ryeguy
  • this works for me, I tested it on http://www.whitebeam.org/library/guide/TechNotes/xpathtestbed.rhtm – mihi Jun 30 '09 at 17:58

5 Answers

211

Your query only tests the first li child, rather than checking whether any li child contains the text 'Model'. What you need is a query like the following:

//ul[@class='featureList' and ./li[contains(.,'Model')]]

This query selects the ul elements that have a class of featureList and one or more li children containing the text 'Model'.
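
As a rough check, the same selection can be mimicked in Python. The standard-library ElementTree module does not implement contains(), so this sketch lets ElementTree handle the attribute test and applies the inner li[contains(., 'Model')] test by hand. The trimmed markup, the second ul, and the helper name string_value are assumptions made for illustration, not part of the question.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed stand-in for the markup in the question (an assumption).
DOC = """
<div>
  <ul class="featureList">
    <li><b>Type:</b> Clip Fan</li>
    <li><b>Color:</b> Black</li>
    <li><b>Model #: </b>CR1-0081-06</li>
  </ul>
  <ul class="otherList">
    <li><b>Model #: </b>XYZ</li>
  </ul>
</div>
"""


def string_value(node):
    # XPath string-value of an element: all descendant text, concatenated.
    return "".join(node.itertext())


root = ET.fromstring(DOC)

# Mirrors //ul[@class='featureList' and li[contains(., 'Model')]]:
# the attribute predicate runs in ElementTree, contains() runs in Python.
matches = [
    ul
    for ul in root.findall(".//ul[@class='featureList']")
    if any("Model" in string_value(li) for li in ul.findall("li"))
]

print(len(matches))
```

Only the featureList ul passes both conditions; the other ul has a matching li but the wrong class.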

Jeff Yates
  • +1 -- The "./" is a bit misleading - it suggests that anything other than the current node would be taken into account when you leave it out, but in fact it is redundant: "//ul[@class='featureList' and li[contains(.,'Model')]]" is the same thing. – Tomalak Jun 30 '09 at 18:21
  • Yup, I was just being specific. Quite possibly overly specific. – Jeff Yates Jun 30 '09 at 18:43
  • If there is no `li` with `Model` in `ul`, then the `and` condition will fail. So `and` condition returns `false` on the empty set, is it correct? – Konstantin Milyutin Jan 27 '14 at 14:27
61

I already gave my +1 to Jeff Yates' solution.

Here is a quick explanation of why your approach does not work. This:

//ul[@class='featureList' and contains(li, 'Model')]

runs into a limitation of the contains() function (or any other string function in XPath, for that matter).

The first argument is supposed to be a string. If you feed it a node-set (giving it "li" does that), a conversion to string must take place. In XPath 1.0, this conversion uses only the first node in the set, in document order.

In your case the first node in the set is <li><b>Type:</b> Clip Fan</li> (converted to a string: "Type: Clip Fan") which means that this:

//ul[@class='featureList' and contains(li, 'Type')]

would actually select a node!
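
The conversion rule can be sketched in a few lines of Python. This is only an emulation of the XPath 1.0 behavior using the standard-library ElementTree module, which has no contains() of its own. The helper names node_set_to_string and contains_first, and the two-item markup, are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Two-item stand-in for the list in the question (an assumption).
UL = ET.fromstring(
    '<ul class="featureList">'
    "<li><b>Type:</b> Clip Fan</li>"
    "<li><b>Model #: </b>CR1-0081-06</li>"
    "</ul>"
)


def node_set_to_string(nodes):
    # XPath 1.0 rule: take the string-value of the first node in document
    # order, or the empty string when the node-set is empty.
    return "".join(nodes[0].itertext()) if nodes else ""


def contains_first(ul, needle):
    # Emulates contains(li, needle) evaluated with ul as the context node.
    return needle in node_set_to_string(ul.findall("li"))


print(contains_first(UL, "Type"))   # the first li reads "Type: Clip Fan"
print(contains_first(UL, "Model"))  # "Model" appears only in the second li
```

The second li is never consulted, which is why the 'Model' test fails even though the text is present in the list.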

Tomalak
  • Nice one. I'd been struggling to figure out why queries like ".//td[contains(.//*,'something')]" only work to a depth of 1. I'd figured out how to make it work but wasn't sure how the above was working at all. What I actually needed was ".//td[.//*[contains(.,'something')]]" – JonnyRaa Feb 05 '14 at 13:35
14

This is a new answer to an old question about a common misconception about contains() in XPath...

Summary: contains() means contains a substring, not contains a node.

Detailed Explanation

This XPath is often misinterpreted:

//ul[contains(li, 'Model')]

Wrong interpretation: Select those ul elements that contain an li element with Model in it.

This is wrong because

  1. contains(x,y) expects x to be a string, and
  2. the XPath rule for converting multiple elements to a string is this:

    A node-set is converted to a string by returning the string-value of the node in the node-set that is first in document order. If the node-set is empty, an empty string is returned.

Right interpretation: Select those ul elements whose first li child has a string-value that contains a Model substring.

Examples

XML

<r>
  <ul id="one">
    <li>Model A</li>
    <li>Foo</li>
  </ul>
  <ul id="two">
    <li>Foo</li>
    <li>Model A</li>
  </ul>
</r> 

XPaths

  • //ul[contains(li, 'Model')] selects only the ul element with id "one".

    Note: The ul with id "two" is not selected because the string-value of its first li child is Foo, which does not contain the Model substring.

  • //ul[li[contains(.,'Model')]] selects both ul elements, "one" and "two".

    Note: Both ul elements are selected because contains() is applied to each li individually. (Thus, the tricky multiple-element-to-string conversion rule is avoided.) Both ul elements do have an li child whose string-value contains the Model substring; the position of the li element no longer matters.
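
The two readings can be compared side by side on the XML above. The following is a minimal sketch in Python using only the standard-library ElementTree module. Because ElementTree does not support contains(), each predicate is emulated by a small helper; the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

R = ET.fromstring("""
<r>
  <ul id="one">
    <li>Model A</li>
    <li>Foo</li>
  </ul>
  <ul id="two">
    <li>Foo</li>
    <li>Model A</li>
  </ul>
</r>
""")


def string_value(node):
    # XPath string-value of an element: all descendant text, concatenated.
    return "".join(node.itertext())


def first_li_contains(ul, needle):
    # Emulates //ul[contains(li, needle)]: only the first li is converted
    # to a string; an empty node-set converts to the empty string.
    items = ul.findall("li")
    return needle in (string_value(items[0]) if items else "")


def any_li_contains(ul, needle):
    # Emulates //ul[li[contains(., needle)]]: every li is tested on its own.
    return any(needle in string_value(li) for li in ul.findall("li"))


uls = R.findall(".//ul")
first_only = [ul.get("id") for ul in uls if first_li_contains(ul, "Model")]
any_child = [ul.get("id") for ul in uls if any_li_contains(ul, "Model")]

print(first_only)  # ['one']
print(any_child)   # ['one', 'two']
```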


kjhughes
-2
//ul[@class="featureList" and li//text()[contains(., "Model")]]
runrig
-6

Here is my contains() example:

//table[contains(@class, "EC_result")]/tbody
hahakubile
  • There is no `table` element or `EC_result` class value in OP's code. ***This answer makes no sense here and should be deleted.*** – kjhughes May 10 '18 at 12:22