Interesting problem. I'm scraping a betting site with Selenium, then processing the result with bs4. The problem is that the site loads its odds in a different order from the team names. For example, the page shows:
London v Tokyo 2/1 4/1
Amsterdam v Helsinki 5/1 3/1
New York v California 7/1 10/1
When I pull this and iterate over it, it comes out like so:
Names = [London, Tokyo, Amsterdam, Helsinki, New York, California]
Odds = [2/1, 5/1, 4/1, 3/1, 7/1, 10/1]
The odds load top to bottom, then left to right, in chunks of varying length, so when I try to splice the names and odds together they don't match up.
My question is, how can I get around this? I want to eventually have the information come out so the team name is followed by its odds:
Games = [London, 2/1, Tokyo, 4/1, Amsterdam, 5/1, Helsinki, 3/1, New York, 7/1, California, 10/1]
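To make the target concrete, here is a minimal sketch of that final step, assuming the odds were already in row order (which is exactly what I don't have yet). The `odds_in_row_order` list is hypothetical and typed in by hand, not scraped:

```python
from itertools import chain

names = ["London", "Tokyo", "Amsterdam", "Helsinki", "New York", "California"]

# Hypothetical: the odds as they would look if they arrived row by row.
odds_in_row_order = ["2/1", "4/1", "5/1", "3/1", "7/1", "10/1"]

# Pair each name with its odds, then flatten the pairs into one list.
games = list(chain.from_iterable(zip(names, odds_in_row_order)))
print(games)
```

So the interleaving itself is easy; the hard part is getting the odds into that order in the first place.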
** UPDATE ** The site is: https://www.bet365.com/#/AC/B151/C1/D50/E2/F163/ If you get a landing page then just click through. Then "Esports" on the left panel, then "All Matches" from the midpage.
Code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup
url = "https://www.bet365.com/#/AC/B151/C1/D50/E2/F163/"
driver = webdriver.Chrome()
driver.get(url)
# Then I'm navigating to the "All Matches" page
soup = BeautifulSoup(driver.page_source, 'html.parser')
# find_elements(By.CLASS_NAME, ...) replaces find_elements_by_class_name, which was removed in Selenium 4
teams = driver.find_elements(By.CLASS_NAME, "sl-CouponParticipantWithBookCloses_Name")
odds_raw = driver.find_elements(By.CLASS_NAME, "gl-ParticipantOddsOnly_Odds")
odds = []
teams_text = []
new_teams = []
new_odds = []
for name in teams:
    teams_text.append(name.text)
Teams come in as blocks, for example "London v Tokyo", so to separate the team names I iterate and split them:
for name in teams_text:
    first, second = name.split(" v ")
    new_teams.append(first)
    new_teams.append(second)
Then I convert the fractional odds to decimal:
for odd in odds_raw:
    odds.append(odd.text)
for odd in odds:
    first, second = odd.split("/")
    new_odd = (int(first) / int(second)) + 1
    new_odds.append(round(new_odd, 2))
So now I have a list of all team names and a list of decimal odds values. This is where my problem is: bet365 lays out the odds for the matches in vertical blocks of varying length, one block per game division.
So if the odds look like this:
Division 1
London v Tokyo 1 2
Amsterdam v Helsinki 3 4
Division 2
New York v California 5 6
Division 3
Sydney v Brisbane 7 8
Bali v Singapore 9 10
Berlin v Paris 11 12
Then when I pull them, the odds will come out like:
[1, 3, 2, 4, 5, 6, 7, 9, 11, 8, 10, 12]
Because the divisions vary in length, I'm having a hard time figuring out how to approach it.
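To pin down the pattern: within each division, the first half of the odds is the left column and the second half is the right column. Below is a minimal sketch of undoing that, assuming the number of matches per division were known. The `division_sizes` list and the `reorder_odds` helper are hypothetical; my code above does not collect per-division counts.

```python
def reorder_odds(flat_odds, division_sizes):
    """Turn per-division, column-by-column odds into row-by-row order."""
    ordered = []
    start = 0
    for n in division_sizes:
        # Each division with n matches contributes 2 * n odds.
        chunk = flat_odds[start:start + 2 * n]
        left, right = chunk[:n], chunk[n:]  # left column, then right column
        for first_team, second_team in zip(left, right):
            ordered.extend([first_team, second_team])
        start += 2 * n
    return ordered


# Odds as scraped in the example above; sizes are assumed, not scraped.
scraped = [1, 3, 2, 4, 5, 6, 7, 9, 11, 8, 10, 12]
row_order = reorder_odds(scraped, [2, 1, 3])
print(row_order)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```

What I'm still missing is a reliable way to get those per-division match counts from the page.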