
I encountered the strangest thing today while trying to extract image data from a string of HTML that I download through an AJAX request (I use https://github.com/padolsey/jQuery-Plugins/blob/master/cross-domain-ajax/jquery.xdomainajax.js to do so).

I noticed a 404 for an image that the page was trying to download. Judging by the initiator stack, the image comes from the HTML that my AJAX request returns. Here is the relevant stack:

b.extend.buildFragment   @  jquery-1.9.1.min.js:4
b.extend.parseHTML   @  jquery-1.9.1.min.js:3
b.fn.b.init  @  jquery-1.9.1.min.js:3
b    @  jquery-1.9.1.min.js:3
$.ajax.success   @  main.js:86

My code in main.js looks like this:

function generateAlbumHTML(album)
{
    $.ajax({ 
        url: album.data.url,
        type: 'GET',
        success: function(data) 
        { 
            var albumHtmlStr = "";
            var images = $(data.responseText).find('#image-container .zoom');
            $.each(images, function(i, item)
            {
                album.data.url = $(item).attr('href');
                albumHtmlStr += generateHTML(album);
            });
            return albumHtmlStr;
        }
    });
}

It appears that the culprit is line 86 where I do:

var images = $(data.responseText).find('#image-container .zoom');

This causes jQuery to parse the HTML and start loading unwanted images and other resources referenced in it.

Here is a link to the HTML that the AJAX request returns as data.responseText: http://pastebin.com/hn4jEgAA

Anyway, am I doing something wrong here? How can I filter and find the data I want from this string without loading things such as unwanted images and other data?

synergies
  • Yes, parsing the HTML string turns it into actual elements, which causes any images in it to be loaded. If you want to prevent that, use a regexp that removes or changes those elements before you parse the string into a DOM fragment. – Kevin B Jun 06 '13 at 14:56
  • Look at this question: http://stackoverflow.com/questions/6484997/prevent-images-from-loading-on-non-dom-jquery-ajax-parse But I need to note that using regex to modify HTML code is not very safe, because you assume that every occurrence of `` is a tag. – t.niese Jun 06 '13 at 14:57
  • $(data.responseText.replace(/src\=/g,'src="dummy.gif" data-src=')).find... – mplungjan Jun 06 '13 at 15:11

1 Answer


What causes the "parsing" is this:

$(data.responseText)

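This tells jQuery to build an HTML (DOM) structure from the string you provided in data.responseText. Browsers start fetching an image as soon as an img element with a src is created, even if the element is never attached to the document, which is why the unwanted requests appear.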

If you want to find things in this string, which is the HTML returned for your GET request, without building any elements, then you have to use one of the String instance methods.
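As a rough illustration of the string-only approach, the sketch below collects href values from anchor tags whose class list contains "zoom", using nothing but String and RegExp methods. The helper name extractZoomHrefs is hypothetical, and the patterns assume quoted attribute values and no ">" inside attributes. It also cannot restrict the search to #image-container the way a selector would, and as noted in the comments above, regular expressions are a fragile way to read HTML.

```javascript
// Hypothetical helper: returns the href of every <a> tag whose class list
// contains "zoom". No DOM is built, so no images are requested.
// Assumes attribute values are quoted and contain no ">" character.
function extractZoomHrefs(html) {
    var hrefs = [];
    var tagPattern = /<a\b[^>]*>/gi;                      // each opening <a ...> tag
    var classPattern = /\sclass\s*=\s*(["'])(.*?)\1/i;    // class="..." or class='...'
    var hrefPattern = /\shref\s*=\s*(["'])(.*?)\1/i;      // href="..." or href='...'
    var match;

    while ((match = tagPattern.exec(html)) !== null) {
        var tag = match[0];
        var cls = classPattern.exec(tag);
        var href = hrefPattern.exec(tag);

        // Keep the link only if "zoom" is one of the space-separated classes.
        if (cls && href && cls[2].split(/\s+/).indexOf('zoom') !== -1) {
            hrefs.push(href[2]);
        }
    }
    return hrefs;
}
```

In the question's success callback this would be called as extractZoomHrefs(data.responseText), in place of the $(data.responseText).find(...) call.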

It should be noted, however, that what you are trying to do is quite unorthodox, since parsing HTML on the client to retrieve information is not the best of approaches.

The better way would be either to use the received HTML as is (provided it is from a trusted source or you sanitize it properly), or to receive raw data in JSON form and process that data in your code, creating the corresponding HTML yourself.
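To make the JSON route concrete, here is a minimal sketch that builds the markup from plain data. The response shape ({ images: [ { url, title } ] }) and the names escapeHtml and buildAlbumHtml are assumptions made for illustration; a real API would define its own fields.

```javascript
// Escapes text so it can be placed safely inside HTML text or a quoted attribute.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Hypothetical JSON shape: { images: [ { url: "...", title: "..." } ] }
// Builds one link per image; nothing is fetched until the string is
// inserted into the page.
function buildAlbumHtml(album) {
    var parts = [];
    for (var i = 0; i < album.images.length; i++) {
        var image = album.images[i];
        parts.push(
            '<a class="zoom" href="' + escapeHtml(image.url) + '">' +
            escapeHtml(image.title) +
            '</a>'
        );
    }
    return parts.join('');
}
```

Because the code only ever sees data, it decides which URLs end up in the page, and no request is made for anything it does not emit.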

UPDATE

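Additional options are described in the documentation for the jQuery ajax method.

For instance, you can use the dataFilter setting, which receives the raw response before it is passed to your success callback, to sanitize your response.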
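A sketch of how dataFilter might be used here, under a few assumptions: the request is a plain one whose response arrives as an HTML string (the cross-domain plugin from the question wraps the response in an object, so this may not apply to it unchanged), and only the src attribute of img tags needs to be neutralized. The names stripImageSources and loadAlbumLinks are hypothetical. The filter renames src to data-src so that later parsing creates img elements that have nothing to load.

```javascript
// Hypothetical dataFilter: renames the src attribute of each <img> tag to
// data-src in the raw response text. Only the first src per tag is handled,
// and other resource attributes (for example srcset) are left untouched.
function stripImageSources(rawHtml, dataType) {
    if (typeof rawHtml !== 'string') {
        return rawHtml; // leave non-string responses alone
    }
    return rawHtml.replace(/(<img\b[^>]*?\s)src=/gi, '$1data-src=');
}

// Usage sketch (requires jQuery 1.8+ for jQuery.parseHTML; not called here).
function loadAlbumLinks(url, onLinks) {
    jQuery.ajax({
        url: url,
        type: 'GET',
        dataType: 'html',
        dataFilter: stripImageSources,
        success: function (html) {
            // Wrap the parsed nodes so that find() also sees top-level elements.
            var hrefs = jQuery('<div>')
                .append(jQuery.parseHTML(html))
                .find('#image-container .zoom')
                .map(function () { return jQuery(this).attr('href'); })
                .get();
            onLinks(hrefs);
        }
    });
}
```

The selector from the question keeps working, because the markup is still parsed into elements; it is only the image URLs that are hidden from the browser.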

ZenMaster
  • This is what I feared. I guess I will just have to use RegEx to pull out the parts I need. I do normally parse JSON data; I was only experimenting with this method as a last-ditch attempt to fetch pages by URL and pull the image tags from them. I currently use it for imgur albums as I haven't got an API key yet. – synergies Jun 06 '13 at 15:43