I am trying to scrape https://www.etf.com/KJUL to get a table on the page.

I wrote the code in Python:

from selenium import webdriver

# Raw string so the backslashes in the Windows path are not read as escapes
path = r"C:\Program Files (x86)\chromedriver.exe"
driver = webdriver.Chrome(path)
url = "https://www.etf.com/KJUL"
driver.get(url)
print(driver.page_source)
search = driver.find_elements_by_tag_name("rowText")

I can't get anything useful out of the page source, because I can't find any tags that correspond to the table. How can I get the table?

  • The target page leans heavily on JavaScript to populate its content, including the content you’re trying to target. BeautifulSoup is an HTML parser and does not evaluate JavaScript. Inspect the content of `r`, which will confirm this. Use a browser control utility like Selenium or Puppeteer, which will evaluate the JavaScript on the page as your own browser would. – esqew May 13 '21 at 11:44
  • Which table do you want exactly? – baduker May 13 '21 at 11:44
  • Duplicate of [Web-scraping JavaScript page with Python](https://stackoverflow.com/questions/8049520/web-scraping-javascript-page-with-python) – esqew May 13 '21 at 11:44
  • @baduker KJUL summary table as shown in the image – yashaswi kashyap May 13 '21 at 11:59

1 Answer

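rowText is not an HTML tag name, so find_elements_by_tag_name("rowText") matches nothing. To scrape those rows you need a valid locator, for example a CSS selector that targets the elements by class.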

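Code:

from selenium.webdriver.common.by import By

driver.get("https://www.etf.com/KJUL")
# Scroll down so the data section of the page is brought into view
driver.execute_script("window.scrollTo(0, 1000)")
# Select every row inside the general data boxes by CSS class
KJUL_list = driver.find_elements(By.CSS_SELECTOR, "section.generalDataBox div.rowtext")
for kjul in KJUL_list:
    print(kjul.text)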

Output:

Issuer
Innovator
Inception Date
07/01/20
Legal Structure
Open-Ended Fund
Expense Ratio
0.79%
Assets Under Management
$17.25M
Average Daily $ Volume
$182.82K
Average Spread (%)
0.23%
Competing ETFs
KJAN, KAPR, KOCT
Fund Home Page
Weighted Average Market Cap
--
Price / Earnings Ratio
--
Price / Book Ratio
--
Distribution Yield
--
Next Ex-Dividend Date
N/A
Number of Holdings
--
Index Tracked
No Underlying Index
Index Weighting Methodology
Fixed
Index Selection Methodology
Fixed
Segment Benchmark
MSCI USA Small Cap Index
Expense Ratio
0.79%
Median Tracking Difference (12 Mo)
--
Max. Upside Deviation (12 Mo)
--
Max. Downside Deviation (12 Mo)
--
Max LT/ST Capital Gains Rate
20.00% / 39.60%
Capital Gains Distributions (3 Year)
--
Tax on Distributions
Qualified dividends
Distributes K1
No
Legal Structure
Open-Ended Fund
OTC Derivative Use
Yes
Securities Lending Active
No
Securities Lending Split (Fund/Issuer)
No Policy
ETN Counterparty
N/A
ETN Counterparty Risk
N/A
Fund Closure Risk
High
Portfolio Disclosure
Daily
Avg. Daily Share Volume
6,924
Average Daily $ Volume
$182.82K
Median Daily Share Volume
1,125
Median Daily Volume ($)
$29.66K
Average Spread (%)
0.23%
Average Spread ($)
$0.06
Median Premium / Discount (12 Mo)
-0.05%
Max. Premium / Discount (12 Mo)
0.93% / -0.85%
Impediment to Creations
None
Market Hours Overlap
100.00%
Creation Unit Size (Shares)
25,000
Creation Unit/Day (45 Day Average)
0.04
Creation Unit Cost (%)
0.04%
Underlying Volume / Unit
--
Open Interest on ETF Options
0
Net Asset Value (Yesterday)
$26.47
ETF.com Implied Liquidity
N/A
Goodness of Fit (R2)
0.45
Beta
0.29
Up Beta
0.27
Down Beta
0.29
Downside Standard Deviation
1.05%
Segment Benchmark
MSCI USA Small Cap Index
KJUL Number of Holdings
--
Benchmark Constituents
--
Shared Holdings
--
Shared Holdings Weight
--
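The printed output alternates labels and values, which is awkward to use directly. The sketch below turns the element texts into (label, value) pairs. It assumes each matched element's `.text` holds the label on its first line and the value on the following lines. That is a guess from the output above, not something confirmed against the page. The function name `split_rows` and the sample data are hypothetical.

```python
def split_rows(row_texts):
    """Turn texts such as 'Issuer\\nInnovator' into (label, value) pairs.

    Assumes the first non-empty line is the label and the rest is the value.
    A list of pairs is used instead of a dict because labels repeat
    (for example 'Expense Ratio' appears twice in the output).
    """
    pairs = []
    for text in row_texts:
        lines = [line.strip() for line in text.splitlines() if line.strip()]
        if not lines:
            continue  # skip elements with no visible text
        label = lines[0]
        value = " ".join(lines[1:])  # empty string when the row shows no value
        pairs.append((label, value))
    return pairs


# Sample texts shaped like the scraped rows (illustrative only)
sample = ["Issuer\nInnovator", "Fund Home Page", "Expense Ratio\n0.79%"]
for label, value in split_rows(sample):
    print(f"{label}: {value}")
```

With the answer's code, the input would be `[kjul.text for kjul in KJUL_list]`.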
cruisepandey