I am trying to create a basic web crawler that specifically looks for links from adverts.
I have managed to find a script that uses cURL to get the contents of the target webpage.
I also found one that uses DOM. The cURL script is:
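<?php
// Fetch the page and write the raw response body to a local file.
$ch = curl_init("http://www.nbcnews.com");
$fp = fopen("source_code.txt", "w");
if ($ch === false || $fp === false) {
    exit("Could not initialise cURL or open the output file\n");
}
curl_setopt($ch, CURLOPT_FILE, $fp);     // write the body to $fp instead of stdout
curl_setopt($ch, CURLOPT_HEADER, false); // keep response headers out of the file
if (curl_exec($ch) === false) {
    echo "cURL error: " . curl_error($ch) . "\n";
}
curl_close($ch);
fclose($fp);
?>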
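The DOM script is not reproduced here, but the idea is roughly as follows. This is only a minimal sketch, assuming PHP's bundled DOM extension is available; the function name `extract_links` and the sample markup are placeholders for illustration, not taken from the script I found.

```php
<?php
// Hypothetical helper: return the href of every <a> element in an HTML string.
function extract_links(string $html): array
{
    if (trim($html) === '') {
        return []; // nothing to parse
    }

    $dom = new DOMDocument();
    // Real-world pages are rarely valid HTML, so collect parser warnings
    // internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));
        if ($href !== '') {
            $links[] = $href;
        }
    }
    return $links;
}

// Usage with made-up sample markup; in practice the input would be the
// contents of source_code.txt written by the cURL script.
$sample = '<p><a href="http://example.com/ad">Ad</a> <a>no href</a></p>';
print_r(extract_links($sample));
?>
```

A parser like this only sees anchors that are present in the raw HTML returned by the server.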
These are great, and I certainly feel like I'm heading in the right direction. However, quite a few adverts are displayed using JavaScript. Because that runs client-side, cURL never executes it, so I only see the JS code and not the ads.
Basically, is there any way of getting the JS to execute before I start trying to extract the links?
Thanks