
I am creating a library for analyzing webpages. At the moment I use Selenium to access elements in a webpage with xpath.

I was considering replacing Selenium with an offline XPath tool. However, on reflection I doubt it would work, since JavaScript might be altering the DOM. In that case an XPath tool that doesn't render the webpage would be unusable.

So, does Selenium run XPath queries against the DOM or against the actual HTML file?

Pithikos

2 Answers


Selenium actually delegates the xpath search to the browser itself:

Selenium delegates XPath queries down to the browser’s own XPath engine, so Selenium supports whatever XPath the browser supports.

And, of course, you can always get the source code of the page and use any other tool to parse and search inside it. I don't see the point of it, but you can.
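For illustration, here is a minimal sketch of what searching a saved page source offline could look like. It uses only Python's standard `xml.etree.ElementTree`, which handles just a subset of XPath and requires well-formed markup, so real-world HTML would likely need a more forgiving parser. The `SOURCE` string and the element ids are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical saved page source. It must be well-formed XML for ElementTree;
# real-world HTML usually needs a more forgiving parser.
SOURCE = """<html>
  <body>
    <div id="static">in the file</div>
    <script>
      var d = document.createElement("div");
      d.id = "dynamic";
      document.body.appendChild(d);
    </script>
  </body>
</html>"""

root = ET.fromstring(SOURCE)

# ElementTree supports only a subset of XPath; these simple queries fit in it.
static_hits = root.findall(".//div[@id='static']")
dynamic_hits = root.findall(".//div[@id='dynamic']")

print(len(static_hits))   # element that exists in the markup itself
print(len(dynamic_hits))  # element the script would add in a browser
```

The script is never executed here, so the `dynamic` div is not found. That is exactly the gap between the HTML file and what a browser renders.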

alecxe
  • The point of it is speed. At the moment my library is dependent on network latency. With an offline tool it would be faster to analyse. But at the same time I would not be analysing what the user sees. – Pithikos Sep 17 '14 at 14:57
  • @Pithikos well, I personally haven't ever had to delegate it from Selenium to any other tool. But, speaking about speed, you would need to open up a page and save the source code anyway - so the network part would still be there, right? – alecxe Sep 17 '14 at 15:01
  • True. It's just that I don't like the fact that every time I want to test something I have to wait for Selenium to open the browser and load a page, when the only thing I am testing is the xpath, for example. It slows down development for me. – Pithikos Sep 17 '14 at 15:06
  • @Pithikos well, this is a tradeoff of utilizing a real browser. One option you may consider using is a headless browser, like `PhantomJS` - it is really a much faster way. – alecxe Sep 17 '14 at 15:11
  • So I tried both now and have some interesting and rather weird conclusions. With 72 xpaths against a simple webpage, *PhantomJS* takes **15.3 secs** while *Chrome* takes **3.8 secs**. – Pithikos Sep 17 '14 at 16:00
  • @Pithikos hm, interesting, thank you, is this including the page load time? – alecxe Sep 17 '14 at 16:01
  • @Pithikos FYI, there are things you can do to improve the PhantomJS load speed, see http://stackoverflow.com/a/19946901/771848 and https://code.google.com/p/phantomjs/issues/detail?id=525. – alecxe Sep 17 '14 at 16:07

Selenium runs against the current DOM. This includes XPath expressions; otherwise it would be impossible to automate single-page applications.
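One way to check this yourself is the rough sketch below. It loads a page whose markup has no `<p>` element, adds one via script, and then runs an XPath query. It assumes the Selenium 4 Python bindings with Chrome and a matching driver installed; the function name `count_late_nodes` and the `late` id are made up for the example.

```python
def count_late_nodes():
    """Return how many <p id="late"> nodes XPath finds after a script adds one."""
    # Imported here so the module loads even where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumes Chrome and a matching driver are available
    try:
        # The loaded markup contains no <p> element at all.
        driver.get("data:text/html,<html><body></body></html>")
        # Mutate the DOM after load, as a single-page application would.
        driver.execute_script(
            "var p = document.createElement('p');"
            "p.id = 'late';"
            "document.body.appendChild(p);"
        )
        return len(driver.find_elements(By.XPATH, "//p[@id='late']"))
    finally:
        driver.quit()
```

If the XPath is evaluated against the live DOM as described, calling `count_late_nodes()` should find the node that exists only because the script created it.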

Artjom B.