I've been using BeautifulSoup for a while and haven't had many problems. But now I'm trying to scrape a site that is giving me trouble. My code is this:

    import requests
    from bs4 import BeautifulSoup

    currUrl = 'https://www.betbrain.com/football/world/'
    preSoup = requests.get(currUrl)
    print(currUrl)
    soup = BeautifulSoup(preSoup.content, "lxml")
    print(soup)

The content I get seems to be some sort of script and/or API the page is connected to, not the real content of the webpage I see in the browser. I can't reach the games, for example. Does anyone know a way around this? Thank you.

  • Possible duplicate of [scrape html generated by javascript with python](https://stackoverflow.com/questions/2148493/scrape-html-generated-by-javascript-with-python) – bobrobbob Jun 27 '18 at 11:06

1 Answer

requests only fetches the HTML the server returns; it doesn't run the JavaScript that builds the page. For that you have to use a browser driven through Selenium WebDriver; you can use Chrome, Firefox, etc. I use PhantomJS because it is a "headless" browser, meaning it runs in the background without opening a window. Below is some example code that should help you understand how to use it:

    from bs4 import BeautifulSoup
    import time
    from selenium import webdriver

    driver = webdriver.PhantomJS()
    driver.get("https://www.betbrain.com/football/world/")
    time.sleep(5)  # give the page some time to run its JavaScript
    html = driver.page_source  # the DOM after the scripts have run
    driver.quit()  # shut the browser down once the HTML is captured
    soup = BeautifulSoup(html, 'lxml')
    for i in soup.find_all("span", {"class": "Participant1"}):
        print(i.text)
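
PhantomJS is no longer maintained and recent Selenium releases have dropped `webdriver.PhantomJS`, so the same idea may need a different browser today. The following is only a sketch of one possible variant, assuming headless Chrome is installed and that the page still marks team names with the `Participant1` class. The helper names `participant_names` and `fetch_rendered_html` are made up for illustration, and `html.parser` is used here only to avoid the extra lxml dependency. It also swaps the fixed `time.sleep(5)` for an explicit wait that returns as soon as a matching element shows up.

```python
from bs4 import BeautifulSoup


def participant_names(html, css_class="Participant1"):
    """Return the stripped text of every <span> that carries css_class."""
    soup = BeautifulSoup(html, "html.parser")
    return [span.get_text(strip=True)
            for span in soup.find_all("span", class_=css_class)]


def fetch_rendered_html(url, css_class="Participant1", timeout=15):
    """Load url in headless Chrome and return the HTML once css_class appears."""
    # Imported lazily so the parsing helper above works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")  # assumption: Chrome is installed locally
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Block until at least one matching element exists, or time out.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CLASS_NAME, css_class))
        )
        return driver.page_source
    finally:
        driver.quit()  # always release the browser, even on a timeout


# Example usage (needs Chrome, Selenium and network access):
#     html = fetch_rendered_html("https://www.betbrain.com/football/world/")
#     print(participant_names(html))
```

Keeping the parsing in its own function means it can be checked against a saved HTML snippet without starting a browser. If the wait times out, the class name has probably changed or the content is loaded some other way, and the selector would need adjusting.
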
ThunderHorn