How to extract rating information using CSS selector or any other methods

Question

I am learning web scraping on my own, and as practice I am trying to scrape reviewers' ratings on Yelp. Typically, I can use a CSS selector or XPath to select the content I am interested in. However, those methods do not work for selecting reviewers' ratings. For instance, on the following page: https://www.yelp.com/user_details_reviews_self?userid=0S6EI51ej5J7dgYz3-O0lA, the CSS selector for the first rating is '.stars_2'. However, if I use this selector in my RSelenium code as follows:

     ratings=remDr$findElements('css selector','.stars_2')

     ratings=unlist(lapply(ratings, function(x){x$getElementText()}))

I get NULL. I think the reason is that the rating is actually an image. Here is a small part of the page source:

            <div class="review-content">
            <div class="review-content">
            <div class="biz-rating biz-rating-very-large clearfix">
            <div>        
            <div class="rating-very-large">
            <i class="star-img stars_2" title="2.0 star rating">
          <img alt="2.0 star rating" class="offscreen" height="303" src="//s3-media4.fl.yelpcdn.com/assets/srv0/yelp_styleguide/c2252a4cd43e/assets/img/stars/stars_map.png" width="84">
    </i>
</div>


    </div>

Basically, if I can extract the text from class="star-img stars_2" or title="2.0 star rating", then I am good. Can anyone help me with this?

This is prohibited by [Yelp's TOS](https://www.yelp.com/static?p=tos) (term 6 B iii). — Gregor Thomas, Mar 15 '16 at 23:01
I am just using it for practice, actually. I won't use it for any other purpose. I am just curious how to deal with scenarios like this. — Allen, Mar 15 '16 at 23:38

score 0 · Answer 1 · edited May 23 '17 at 11:59


You might want to try something like this approach:

Using the Yelp API with R, attempting to search business types using geo-coordinates

Although it seems some folks found this outdated, I found some useful code on the Yelp GitHub page:

https://github.com/Yelp/yelp-api/pull/88

https://github.com/Yelp/yelp-api/pull/88/commits/95009afde2b47e8244fda3d435f0476205cc0039

Good luck! :)


answered Mar 15 '16 at 23:37 · Joe R

Thanks! But the API only works at the business level. If I want to extract individual ratings, I do not think the API can help. Correct me if I am wrong. – Allen Mar 16 '16 at 00:57
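
For the per-review ratings described in the question, one possible approach is to read the `title` attribute of the `<i class="star-img ...">` element instead of its text, since the element has no visible text of its own. The sketch below assumes the markup matches the snippet in the question and that the RSelenium elements expose `getElementAttribute()`; the helper names `extract_titles` and `parse_rating` are hypothetical and not part of RSelenium.

```r
# Sketch only: helper names are made up for illustration.

# Takes a list of RSelenium webElement objects, e.g. the result of
#   remDr$findElements("css selector", "i.star-img")
# and returns their title attributes as a character vector.
extract_titles <- function(elements) {
  # getElementAttribute() returns a list, so unlist() flattens the results
  unlist(lapply(elements, function(x) x$getElementAttribute("title")))
}

# Pulls the leading number out of strings such as "2.0 star rating".
# Assumes the title always has the form "<number> star rating".
parse_rating <- function(titles) {
  as.numeric(sub("^([0-9.]+) star rating$", "\\1", titles))
}

# Runs without a browser session:
parse_rating(c("2.0 star rating", "4.5 star rating"))
```

With a live session, the two helpers could be chained as `parse_rating(extract_titles(remDr$findElements("css selector", "i.star-img")))`. Selecting on `star-img` rather than `stars_2` is a design choice here: `stars_2` would presumably match only two-star reviews, while `star-img` appears on the rating element in the snippet regardless of its value.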
