I am trying to scrape content from some pages, but BeautifulSoup gets stuck on URLs that do not return any HTML source, for example this one:

import requests
from bs4 import BeautifulSoup


def make_soup(url):
    # Download the whole response body, then parse it as HTML.
    try:
        html = requests.get(url).content
    except requests.RequestException:
        # Network-level failure (DNS, connection, invalid URL, ...).
        return None
    return BeautifulSoup(html, "lxml")


url = "https://cdn.podigee.com/uploads/u735/1d4d4b22-528e-4447-823e-b3ca5e25bccb.mp3?v=1578558565&source=webplayer"
soup = make_soup(url)
print(soup.select_one("a.next").get('href'))

This works pretty well for ordinary HTML pages. The problem is that if a media file such as an .mp3, .mp4 or .m4a ends up in the crawler instead of an HTML page, the script hangs :(
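
One possible way to guard against this is to look at the response headers before downloading the body. This is only a minimal sketch, and it assumes the server reports an accurate `Content-Type` header. The helper `is_html_content_type` and the `timeout` parameter are names introduced here for illustration and are not part of the original script. With `stream=True`, requests defers downloading the body until `.content` is accessed, so a media file can be skipped after reading only its headers.

```python
def is_html_content_type(content_type):
    """Return True if a Content-Type header value describes an HTML document."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in ("text/html", "application/xhtml+xml")


def make_soup(url, timeout=10):
    """Fetch url and parse it; return None on errors or non-HTML responses."""
    # Third-party imports live here so the header check above stays
    # usable without requests or bs4 installed.
    import requests
    from bs4 import BeautifulSoup

    try:
        # stream=True: only the status line and headers are read up front.
        # timeout bounds the connect and per-read waits, not the total time.
        with requests.get(url, stream=True, timeout=timeout) as response:
            response.raise_for_status()
            if not is_html_content_type(response.headers.get("Content-Type")):
                # For example audio/mpeg for an .mp3: skip it without
                # downloading the body.
                return None
            html = response.content  # the body is downloaded here
    except requests.RequestException:
        return None
    return BeautifulSoup(html, "lxml")
```

Because this version returns `None` for skipped URLs, the caller would need to check the result before calling `soup.select_one("a.next")`, and also check that `select_one` actually found a link before reading its `href`.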