I want to read the HTML source of, say, www.google.com with Ajax or jQuery. (I don't just want to display the source; I need to parse it, so having xmlhttp.responseText is nice.)
"Read contents of an external webpage and get specific elements" has a nice way of doing it server-side with PHP. "Can Javascript read the source of any web page?" is nice if you are trying to read a page on the local domain.
YQL + JSON is a possibility, as noted in the questions above, but it seems slow and adds a lot of overhead.
I'd prefer plain Ajax, because then I don't need to load a 90k jQuery library, and as far as I can see, this:
var xmlhttp = null;
var url = 'bot.html?url=http://google.com'; // must redirect in bot.html
//var url = 'http://www.google.com'; // won't work: xmlhttp.status comes back as 0
if (window.XMLHttpRequest) { // IE7+, Firefox, Chrome, Opera, Safari
  xmlhttp = new XMLHttpRequest(); // source says this is buggy in IE7
} else { // IE6, IE5
  xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
}
xmlhttp.open("GET", url, true);
// Attach the handler before sending so no state change is missed.
xmlhttp.onreadystatechange = function () {
  // readyState 4 = request finished, status 200 = OK
  if (xmlhttp.readyState == 4 && xmlhttp.status == 200) {
    document.getElementById("result").innerHTML = xmlhttp.responseText;
  }
};
xmlhttp.send(null);
is much the same as the jQuery version:
$("#result").load(url);
What the Stack Overflow questions mentioned above don't cover is how to handle the ?url= parameter. To keep everything in JavaScript, I did this:
bot.html:
<head>
<script type="text/javascript">
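var query = window.location.search.substring(1); // everything after the "?" (assumed source of query)
var vars = query.split("&");
var pair = vars[0].split("=");
if (pair[0] == 'url') { // e.g. bot.html?url=http://www.google.com
  alert('hi ' + pair[1]);
  window.location = pair[1];
  //top.location.href = pair[1]; // alternative
}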
</script>
... above jquery or ajax ...
<div id="result">Fill Me</div>
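For reference, the ?url= handling in bot.html could be sketched a bit more defensively as below. This is only an illustration, not what I actually ran: getQueryParam is a hypothetical helper name, and it assumes the parameter may be URL-encoded and may itself contain "=" or "&" when encoded.

```javascript
// Hypothetical helper (not part of the original bot.html): pull one named
// parameter out of a query string such as "?url=http%3A%2F%2Fwww.google.com".
function getQueryParam(search, name) {
  // Drop the leading "?" if present.
  var query = search.charAt(0) === '?' ? search.substring(1) : search;
  if (query === '') {
    return null;
  }
  var parts = query.split('&');
  for (var i = 0; i < parts.length; i++) {
    // Split on the first "=" only, so a value containing "=" stays intact.
    var eq = parts[i].indexOf('=');
    var key = eq === -1 ? parts[i] : parts[i].substring(0, eq);
    var value = eq === -1 ? '' : parts[i].substring(eq + 1);
    if (decodeURIComponent(key) === name) {
      // "+" is treated as a space, as in form-encoded query strings.
      return decodeURIComponent(value.replace(/\+/g, ' '));
    }
  }
  return null; // parameter not present
}
```

In bot.html this would be called as getQueryParam(window.location.search, 'url'), and the result checked for null before assigning it to window.location.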
All of this works fine for a local page, e.g. var url='index.php' (without the redirect). However, none of it works for external links such as google.com: I can't seem to set var url='google.com' directly, and if I try to proxy through bot.html (as alluded to for jQuery, without an example, in the Stack Overflow questions mentioned above), the request returns the source of bot.html itself and never runs the alert or the redirect. That makes sense, I think, because the request only loads the page's text; it doesn't execute it. I had figured I could use the same proxy trick for plain Ajax.
Redirecting or proxying via .htaccess won't fit this application.