
I have some HTML like this:

<span id="cod">Code:</span> <span>12345</span>
<span>Category:</span> <span>faucets</span>

I want to fetch the category name ("faucets"). This is my attempt:

var $ = cheerio.load(html.contents);
var category = $('span[innerHTML="Category:"]').next().text();

But this doesn't work (the innerHTML modifier does not select anything).

Any clue?

Josh Crozier
MarcoS

2 Answers


Your code isn't working because [innerHTML] is an attribute selector, and innerHTML is a DOM property rather than an attribute on the element, so nothing is selected.

You could filter the span elements based on their text. In the example below, .trim() strips leading and trailing whitespace before the comparison. If the trimmed text equals 'Category:', the element is kept in the filtered set.

var category = $('span').filter(function() {
  return $(this).text().trim() === 'Category:';
}).next().text();
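
To see why .trim() matters, the comparison inside the callback can be tried on plain strings, without cheerio. This is only a minimal sketch: isExactLabel and the sample strings are hypothetical, introduced here for illustration.

```javascript
// Hypothetical helper mirroring the comparison inside the filter() callback.
function isExactLabel(text) {
  return text.trim() === 'Category:';
}

// Made-up strings standing in for what $(this).text() might return.
const samples = ['Category:', '  Category:\n', 'Category: faucets', 'Code:'];

// Keep only the strings that equal 'Category:' once whitespace is trimmed.
const matches = samples.filter(isExactLabel);
console.log(matches.length); // 2
```

The padded string still matches because of .trim(), while 'Category: faucets' does not, since the comparison is exact.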

The above snippet keeps only elements whose text is exactly 'Category:'. If you want to select elements whose text merely contains that string, you could use the :contains selector (as pointed out in the comments):

var category = $('span:contains("Category:")').next().text();

Alternatively, using the .indexOf() method would work as well:

var category = $('span').filter(function() {
  return $(this).text().indexOf('Category:') > -1;
}).next().text();
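
The difference between the exact match and the substring match shows up when another label contains the same text. Here is a small sketch on plain strings, with no cheerio involved; the label 'Main Category:' is an invented example, not something from the question's HTML.

```javascript
// Sample label texts; 'Main Category:' is a made-up label for illustration.
const labels = ['Category:', 'Main Category:', 'Code:'];

// Exact comparison, as in the first filter() callback.
const exact = labels.filter((text) => text.trim() === 'Category:');

// Substring test, as in the indexOf() callback.
const partial = labels.filter((text) => text.indexOf('Category:') > -1);

console.log(exact.length, partial.length); // 1 2
```

If the page might contain such overlapping labels, the exact comparison is the safer choice.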
Josh Crozier
  • 1
    Works like a charm, thanks. Didn't know about `filter()`, I'm feeling quite dumb... :-( – MarcoS Jan 10 '16 at 19:29
  • 8
    If he wants to check if it contains the string he can also just use `$('span:contains("Category:")')` – Paul Jan 10 '16 at 19:29
  • 1
    @Paulpro Does Cheerio have a `:contains` selector? I checked [the documentation](https://github.com/cheeriojs/cheerio), and I didn't see it in there, so I didn't use it. – Josh Crozier Jan 10 '16 at 19:30
  • 1
    @JoshCrozier I don't know Cheerio, but the tag wiki says it's an implementation of jQuery core, so I would assume so. – Paul Jan 10 '16 at 19:33
  • 4
    @JoshCrozier It appears that it uses this library for selectors, which supports contains: https://www.npmjs.com/package/CSSselect – Paul Jan 10 '16 at 19:34

A simpler solution is:

var category = $('span:contains("Category:") + span').text();

This combines the CSS adjacent sibling combinator (+) with the :contains pseudo-class, which is a jQuery extension that cheerio also supports.
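
As a rough plain-JavaScript model of what that selector matches (not how cheerio implements it), the sketch below represents the question's four spans as an array of siblings, assuming they all share one parent. The siblings array and the adjacentSpanText function are hypothetical names used only for illustration.

```javascript
// The question's four spans, modeled as element siblings in document order.
const siblings = [
  { tag: 'span', text: 'Code:' },
  { tag: 'span', text: '12345' },
  { tag: 'span', text: 'Category:' },
  { tag: 'span', text: 'faucets' },
];

// Model of 'span:contains(label) + span': collect the text of every span whose
// immediately preceding element sibling is a span containing the label.
function adjacentSpanText(nodes, label) {
  return nodes
    .filter((node, i) => {
      const prev = nodes[i - 1];
      return node.tag === 'span' && prev !== undefined &&
        prev.tag === 'span' && prev.text.includes(label);
    })
    .map((node) => node.text)
    .join('');
}

console.log(adjacentSpanText(siblings, 'Category:')); // faucets
```

Unlike .next(), the + span form only matches when the following sibling is itself a span.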

pguardiario