The following curl call succeeds every time if, and only if, $data is printed after the call, with curl_getinfo() returning:
[content_type] => text/html; charset=UTF-8
If $data is not printed, the curl call sometimes returns the same result as above and sometimes returns $data as "Loading...", which means the page has not finished loading yet. In that case curl_getinfo() returns:
[content_type] => text/html
Furthermore, when using print_r($data), I can see the output of print_r(curl_getinfo($ch)); on my website being updated several times while the curl call is running. What is going on?
(The curl_setopt list has grown longer as I keep trying to find a solution.)
Oh, and even if I print $data only after it has been returned to the caller and assigned to another variable, the curl call succeeds every time.
Is this normal behaviour? I don't want to print_r($data)!
Is it possible that the URL I'm retrieving contains JavaScript which gets run when I "print" it on my website? And why does it occasionally work without the print_r($data)?
Ref: is-there-a-way-to-let-curl-wait-until-the-pages-dynamic-updates-are-done
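A minimal sketch for checking the JavaScript hypothesis: instead of printing the raw body (which lets the browser interpret any markup and scripts in it), print an escaped copy together with its size. The helper name describe_body() is hypothetical and not part of the code in this question.

```php
<?php
// Hypothetical debugging helper: summarises a fetched body without letting
// the browser interpret (and possibly execute) anything inside it.
function describe_body(string $data): string
{
    // Case-insensitive check for an opening script tag in the raw body.
    $hasScript = stripos($data, '<script') !== false;

    return sprintf(
        "length=%d, has script tag: %s\n%s",
        strlen($data),
        $hasScript ? 'yes' : 'no',
        // Escape the markup so it is shown as text rather than rendered.
        htmlspecialchars($data, ENT_QUOTES, 'UTF-8')
    );
}
```

Used as `echo '<pre>' . describe_body($data) . '</pre>';`, this shows what curl actually returned. If the escaped output is still just "Loading...", then the difference is on the download side and not something the browser does after printing.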
Edit: Until further notice, I've put the curl call in a while loop that checks whether the downloaded size is above a certain threshold. I've capped the loop at 10 iterations, and so far that is enough, i.e. it manages to download the content of interest. The extra time consumed is barely noticeable.
function curl_get_contents($url) {
    global $dbg;
    $ch = curl_init();
    $timeout = 30;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_NOSIGNAL, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    //curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.52 Safari/537.17');
    curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);
    $data = curl_exec($ch);
    if ($dbg) {
        print_r(curl_getinfo($ch)); // This one gets refreshed if print_r($data) is used below
        if (curl_errno($ch)) {
            echo 'Curl error: ' . curl_error($ch);
        } else {
            echo "ALL GOOD <br>";
        }
    }
    curl_close($ch);
    //echo $data;    // If I do this...
    //print_r($data); // ... or this, curl succeeds 100% of the time.
    return $data;
}
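The retry workaround described in the edit can be sketched roughly as follows. This is an illustration under assumptions, not the exact loop in use: the helper name fetch_until_complete(), the pause between attempts, and the way the threshold is passed in are all hypothetical. The fetcher is passed as a callable so that curl_get_contents() above can be plugged in unchanged.

```php
<?php
// Hypothetical helper: calls $fetch($url) until the returned body is larger
// than $minBytes, or until $maxTries attempts have been made.
function fetch_until_complete(
    callable $fetch,
    string $url,
    int $minBytes,
    int $maxTries = 10,
    int $delayMicros = 200000 // assumed pause between attempts
) {
    $data = false;
    for ($try = 1; $try <= $maxTries; $try++) {
        $data = $fetch($url);
        // Accept the response once its size is above the threshold.
        if (is_string($data) && strlen($data) > $minBytes) {
            return $data;
        }
        // Do not sleep after the final attempt.
        if ($try < $maxTries && $delayMicros > 0) {
            usleep($delayMicros);
        }
    }
    // Give back the last result, which may still be the "Loading..." page.
    return $data;
}
```

A call such as `$html = fetch_until_complete('curl_get_contents', $url, 2048);` would retry up to 10 times; the 2048-byte threshold is only an example value and would need to be chosen from the size of the real page.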