3

Is it possible for one XPath expression to match all the following <a> elements using the text in the element, in this case "Link"?

Examples:

  1. <a href="blah">Link</a>
  2. <a href="blah"><span>Link</span></a>
  3. <a href="blah"><div>Link</div></a>
  4. <a href="blah"><div><span>Link</span></div></a>
kjhughes
StevieD

2 Answers

6

This simple XPath expression,

//a[contains(., 'Link')]

will select the a elements of all of your examples because . represents the current node (a), and contains() checks the string value of a to see whether it contains 'Link'. The string value of an element is the concatenation of all of its descendant text nodes, so it conveniently abstracts away any descendant elements.

This even simpler XPath expression,

//a[. = 'Link']

will also select the a elements in all of your examples. It's appropriate to use if the string value of a will exactly equal, rather than just contain, "Link".

Note: The above expressions will also select <a href="blah">Li<br/>nk</a>, which may or may not be desirable.
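
To make the string-value behavior concrete, here is a minimal Python sketch that emulates both expressions with the standard library. It is an illustration under stated assumptions, not a real XPath evaluation: the sample document, the `id` attributes, and the `string_value` helper are hypothetical additions, and `"".join(element.itertext())` is used as a stand-in for the XPath string value of an element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: the four examples from the question,
# the <br/> case from the note, and one link that should not match.
# The id attributes are added here only to label the results.
DOC = """<root>
  <a id="1" href="blah">Link</a>
  <a id="2" href="blah"><span>Link</span></a>
  <a id="3" href="blah"><div>Link</div></a>
  <a id="4" href="blah"><div><span>Link</span></div></a>
  <a id="5" href="blah">Li<br/>nk</a>
  <a id="6" href="blah">Other</a>
</root>"""


def string_value(element):
    """Concatenate all descendant text, approximating XPath's string value."""
    return "".join(element.itertext())


root = ET.fromstring(DOC)
anchors = root.findall(".//a")

# Rough equivalent of //a[contains(., 'Link')]
contains_ids = [a.get("id") for a in anchors if "Link" in string_value(a)]

# Rough equivalent of //a[. = 'Link']
equals_ids = [a.get("id") for a in anchors if string_value(a) == "Link"]

print(contains_ids)
print(equals_ids)
```

In this sketch both lists include the `Li<br/>nk` anchor (id 5), because the text on either side of the `<br/>` is joined into the single string "Link".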

kjhughes
  • +1 - I was originally going to post `//a[contains(., 'Link')]`, but it doesn't strictly match text because, as you pointed out, it would match both `Li<br/>nk` and `Link`, so it's safer to use `text()` in my opinion...
    – Josh Crozier Feb 10 '16 at 20:53
  • @JoshCrozier, thanks, and your answer is very useful too (+1), but safest is really for OP to understand how these variations work so he can choose what he intends. Also, note that ***[XPath `text() =` is different than XPath `. =`](http://stackoverflow.com/a/34595441/290085)*** – kjhughes Feb 10 '16 at 20:59
3

You could use the following:

//a[(.//*|.)[contains(text(), "Link")]]

This will select a elements that contain the text "Link" or a elements that have a descendant element that contains the text "Link".

  • //a - Select all a elements
  • ( - Open OR grouping
  • .//* - Select all the descendant elements
  • | - Or..
  • . - Select the current node
  • ) - Close OR grouping
  • [contains(text(), "Link")] - If they contain the text "Link"

Alternatively, you could also use:

//a[(.//*|.)[.="Link"]]
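
As a rough illustration of how the two expressions above differ, the following Python sketch emulates them with the standard library. It is not a real XPath engine: the sample document, the `id` attributes, and the helper functions are hypothetical, and it assumes XPath 1.0 semantics, where `contains(text(), "Link")` tests only the first text-node child of each candidate element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the id attributes only label the results.
DOC = """<root>
  <a id="1" href="blah">Link</a>
  <a id="2" href="blah"><span>Link</span></a>
  <a id="3" href="blah"><div>Link</div></a>
  <a id="4" href="blah"><div><span>Link</span></div></a>
  <a id="5" href="blah">Li<br/>nk</a>
  <a id="6" href="blah">Other</a>
</root>"""


def first_text_node(element):
    """Return the first text-node child of element, or None if there is none."""
    if element.text:
        return element.text
    # In ElementTree, text that follows a child element is stored as its tail.
    for child in element:
        if child.tail:
            return child.tail
    return None


def text_contains(anchor, needle):
    """Emulate (.//*|.)[contains(text(), needle)] for one anchor."""
    # anchor.iter() yields the anchor itself and every descendant element.
    for element in anchor.iter():
        text = first_text_node(element)
        if text is not None and needle in text:
            return True
    return False


def value_equals(anchor, needle):
    """Emulate (.//*|.)[. = needle] using the joined descendant text."""
    return any("".join(el.itertext()) == needle for el in anchor.iter())


root = ET.fromstring(DOC)
anchors = root.findall(".//a")

text_ids = [a.get("id") for a in anchors if text_contains(a, "Link")]
value_ids = [a.get("id") for a in anchors if value_equals(a, "Link")]

print(text_ids)
print(value_ids)
```

Under these assumptions, the `text()`-based variant skips the `Li<br/>nk` anchor (id 5), since no single text node contains "Link", while the `. =` variant still selects it.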
Josh Crozier
  • Hmmm, tried the first suggestion and no match was found. The second suggestion found a different link in the document other than the one I wanted. There is only one hyperlink on the page with the text "Link", so I'm not sure how that happened. – StevieD Feb 10 '16 at 20:16
  • 1
    OK, I took a wild guess and threw parentheses around the "or" expression. That did the trick. Thanks for your help! `//a[(.//*|.)[text() = \"$text\"]]` – StevieD Feb 10 '16 at 20:22