I am trying to fetch a table from a website using BeautifulSoup in Python. However, when I print the table, it shows None.

import requests
from bs4 import BeautifulSoup

# Enter the URL
url = "https://www.covid19india.org/"

# Get the HTML
r = requests.get(url)
htmlContent = r.content

# Parse the HTML and look for the table
soup = BeautifulSoup(htmlContent, 'html.parser')
table = soup.find('table', {'class': 'table fadeInUp'})

print(table)  # prints None

Please help.

Humayun Ahmad Rajib

2 Answers

You cannot find the table because it's not there.

Try it yourself from the command line:

curl https://www.covid19india.org/

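You will see that the response is a basic HTML wrapper around a large amount of JavaScript, and that this JavaScript fetches the data and renders the actual table in the browser. BeautifulSoup only parses the HTML it is given; it does not run JavaScript, so the table never appears in what it sees.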

If you open the URL in a browser and look at the network traffic, you will see that the real data comes from https://api.covid19india.org/state_district_wise.json and a few other endpoints. It is served as JSON, so you can read it directly without scraping any HTML.
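
As a rough sketch of reading that JSON directly (not tested against the live endpoint, which may have changed or gone away): the code below uses only the standard library. The helper names `fetch_json` and `flatten_districts` are mine, and the nesting of districts under a `districtData` key with a `confirmed` field is an assumption about the layout, so inspect the real response before relying on it.

```python
import json
from urllib.request import urlopen

# Endpoint taken from the browser's network traffic, as described above.
API_URL = "https://api.covid19india.org/state_district_wise.json"


def fetch_json(url=API_URL, timeout=10):
    """Download the URL and parse the response body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def flatten_districts(data):
    """Turn the nested state -> district mapping into a flat list of rows.

    ASSUMPTION: each state maps to a dict with a "districtData" key, whose
    values are per-district dicts containing a "confirmed" count.
    """
    rows = []
    for state, state_info in data.items():
        for district, stats in state_info.get("districtData", {}).items():
            rows.append({
                "state": state,
                "district": district,
                "confirmed": stats.get("confirmed"),
            })
    return rows
```

Calling `flatten_districts(fetch_json())` should then give a list of dicts that you can pass straight to `pd.DataFrame(...)` if you want a table.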

9000
  • Thank you for sharing your knowledge on the limitations of BeautifulSoup with JavaScript content. And yes, I got the answer to my question from https://stackoverflow.com/questions/42856915/python-selenium-get-content-of-table – monsoon dibragede May 08 '20 at 01:20

Try printing the entire htmlContent and check whether the table exists in this raw HTML. On dynamically generated pages, some HTML elements are not present in the server's response at all, because they are only created later by JavaScript running in the browser.
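
If the page is too long to read by eye, a small check like the following can help. It is only a sketch: it uses the standard library's `html.parser` instead of BeautifulSoup so it has no dependencies, and the names `TagCounter` and `count_tags` are made up for this example. It counts how many `<table>` and `<script>` tags the raw HTML contains.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts <table> and <script> start tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.tables = 0
        self.scripts = 0

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased from HTMLParser.
        if tag == "table":
            self.tables += 1
        elif tag == "script":
            self.scripts += 1


def count_tags(html):
    """Return (number of tables, number of scripts) found in the HTML text."""
    counter = TagCounter()
    counter.feed(html)
    counter.close()
    return counter.tables, counter.scripts
```

With the code from the question, `count_tags(r.text)` returning zero tables alongside several scripts would suggest the table is built by JavaScript and is not in the HTML that `requests` downloaded.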