I am working on a school project and want to collect all user reviews of superhero movies on IMDb.
As a first step, I am trying to get all user reviews for a single movie.
The user reviews page shows 25 reviews and a 'load more' button. I already managed to write code that keeps clicking the 'load more' button until it is gone, but I am stuck on the second part: getting all user reviews into a list.
I already tried to use BeautifulSoup to find all 'content' parts on the page. However, my list remains empty.
import time
from bs4 import BeautifulSoup
from selenium import webdriver
testurl = "https://www.imdb.com/title/tt0357277/reviews?ref_=tt_urv"
patience_time1 = 60
XPATH_loadmore = "//*[@id='load-more-trigger']"
XPATH_grade = "//*[@class='review-container']/div[1]"
list_grades = []
driver = webdriver.Firefox()
driver.get(testurl)
# This is the part in which I open all 'load more' buttons.
while True:
    try:
        loadmore = driver.find_element_by_id("load-more-trigger")
        time.sleep(2)
        loadmore.click()
        time.sleep(5)
    except Exception as e:
        print(e)
        break
print("Complete")
time.sleep(10)
# When the whole page is loaded, I want to get all 'content' parts.
soup = BeautifulSoup(driver.page_source)
content = soup.findAll("content")
list_content = [c.text_content() for c in content]
driver.quit()
I expect to get a list with the content of every review-container on the page. However, my list remains empty.
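To narrow the problem down without a browser, here is a minimal sketch that runs the same kind of lookup on a small hand-written HTML string instead of the live page. The markup in `SAMPLE_HTML` (a `div` with class `review-container` wrapping a `div` with class `content`) is only my assumption about how the reviews are structured, not something verified against IMDb. It contrasts a lookup by tag name, which is what a bare string passed to `find_all` does, with a lookup by CSS class, and it uses `get_text()` for the text extraction.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for the fully loaded page.
# The class names and nesting are assumptions, not verified IMDb markup.
SAMPLE_HTML = """
<div class="review-container">
  <div class="content"><div class="text">Great movie.</div></div>
</div>
<div class="review-container">
  <div class="content"><div class="text">Too long.</div></div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# A bare string is matched against tag names, so this searches for <content> tags.
by_tag_name = soup.find_all("content")

# class_ matches against the class attribute instead.
by_class = soup.find_all("div", class_="content")

# get_text() is BeautifulSoup's method for pulling the text out of an element.
list_content = [c.get_text(strip=True) for c in by_class]

print(len(by_tag_name))
print(list_content)
```

If the real page is laid out like the sample, the same two lookups on `driver.page_source` should show whether the empty list comes from the selector rather than from the page not being fully loaded.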