
I'm trying to scrape the top 20 holders of an ERC-20 token, using Selenium. It seems the elements behind my XPath either don't load or don't get enough time to load.

I'm trying to load this page: https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7#balances

I tried both an implicit wait and an explicit wait. When I run the webdriver I can even see that the page has loaded, but the driver never finds the path.

Code with explicit wait:

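from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.add_argument("--disable-dev-shm-using")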
options.add_argument("--no-sandbox")
driver = webdriver.Chrome(chrome_options=options)
driver.get("https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7#balances")
wait = WebDriverWait(driver, 10, poll_frequency=1)
wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="maintable"]/div[3]/table/tbody/')))

Error:

selenium.common.exceptions.TimeoutException: Message:

Yep, not even a message...

Code with implicit wait:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--disable-dev-shm-using")
options.add_argument("--no-sandbox")
driver = webdriver.Chrome(chrome_options=options)
driver.implicitly_wait(10)
driver.get("https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7#balances")
for i in range(1,20):
    req = driver.find_element_by_xpath('//*[@id="maintable"]/div[3]/table/tbody/tr['+str(i)+']/td[2]/span/a')

Error:

selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="maintable"]/div[3]/table/tbody/tr[1]/td[2]/span/a"}

So, as I said, it looks like the driver doesn't have enough time to load the page, but even with 20, 30, ... seconds it doesn't find the path.

Also, when I copy the XPath from the browser opened by the script, I can find the element with it.

DebanjanB
M Token

2 Answers

0

The table is inside an iframe, so you need to switch to the iframe first to access the table.

Induce WebDriverWait() with frame_to_be_available_and_switch_to_it() to wait for the iframe and switch to it.

Induce WebDriverWait() with visibility_of_all_elements_located() to wait for the holder links inside the table.

Code:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

driver=webdriver.Chrome()
driver.get("https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7#balances")
WebDriverWait(driver,10).until(EC.frame_to_be_available_and_switch_to_it((By.ID,"tokeholdersiframe")))
elements=WebDriverWait(driver,20).until(EC.visibility_of_all_elements_located((By.XPATH,'//*[@id="maintable"]/div[3]/table/tbody//tr/td[2]//a')))
for ele in elements:
    print(ele.get_attribute('href'))

If you want only the first 20 holders, use this:

for ele in elements[:20]:
    print(ele.get_attribute('href'))
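
The loop above prints full link URLs rather than bare holder addresses. As a minimal sketch, assuming each href carries the holder address in an `a` query parameter (an assumption about the link format, not something confirmed in this thread), the address could be pulled out with the standard library. The helper name `holder_address_from_href` and the example URL are hypothetical:

```python
from urllib.parse import urlparse, parse_qs


def holder_address_from_href(href):
    """Return the value of the `a` query parameter, or None if it is absent.

    Assumption: holder links look like
    https://etherscan.io/token/<token>?a=<holder address>
    """
    query = parse_qs(urlparse(href).query)
    values = query.get("a")
    return values[0] if values else None


# Hypothetical example URL built from addresses that appear in this thread.
example = (
    "https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7"
    "?a=0x5754284f345afc66a98fbb0a0afe71e0f007b949"
)
print(holder_address_from_href(example))
```

If the links turn out to use a different format, the function returns None instead of raising, so a wrong assumption is easy to spot.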
KunduK
  • Thank you, but another very strange thing happened... if I don't have this line twice: driver.get("https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7#balances") driver.get("https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7#balances") I get the same error as with the other answer: selenium.common.exceptions.NoSuchWindowException: Message: no such window – M Token Jul 14 '20 at 16:38
0

To scrape the top 20 holders of an ERC-20 token, note that the Holders information is within an <iframe>, so you have to:

  • Scroll the Token Holders Chart into view with scrollIntoView().

  • Induce WebDriverWait for the desired frame to be available and switch to it.

  • Induce WebDriverWait for visibility_of_all_elements_located() on the desired elements.

  • You can use the following XPath-based Locator Strategies:

    driver.get("https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7#balances")
    driver.execute_script("arguments[0].scrollIntoView(true);", WebDriverWait(driver, 60).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='card']"))))
    WebDriverWait(driver,10).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH,"//iframe[@id='tokeholdersiframe']")))
    print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@class='table table-md-text-normal table-hover']//tbody//tr//td[./span]/span/a")))[:20]])
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
  • Console Output:

    ['0x7b8c69a0f660cd43ef67948976daae77bc6a019b', 'Binance 7', '0x5754284f345afc66a98fbb0a0afe71e0f007b949', 'Binance', 'Huobi 9', 'Bittrex 3', '0xd545f6eaf71b8e54af1f02dafba6c0d46c491cc1', '0x778476d4c51f93078d61e51c978f90b4a6e500af', 'Bitfinex 2', '0x5041ed759dd4afc3a72b8192c143f72f4724081a', '0xd30b438df65f4f788563b2b3611bd6059bff4ad9', '0x570aeda18a21d8fff6d28a5ef34164553cf9cb77', '0x2b9dc5aaf7b1c15f1fd8aba255919c2a7a184453', '0x6a5b1111a0b5ea8c7ec5665ba09cbacd7fde2b96', 'Gate.io 1', '0x9ec7d40d627ec59981446a6e5acb33d51afcaf8a', '0x231568baa78111377f097bb087241f8379fa18f4', '0xd33547964bae70e1ddd2863a4770dc5cffd86269', 'Huobi 3', 'Compound Tether']
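
The console output above mixes raw addresses with name tags such as 'Binance 7'. A minimal sketch of separating the two, assuming a raw address is always `0x` followed by 40 hexadecimal characters and that everything else is a name tag. The names `ADDRESS_RE` and `split_holders` are hypothetical, and the sample is the first five entries of the output above:

```python
import re

# Assumption: a raw address is "0x" followed by exactly 40 hex characters.
ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]{40}")


def split_holders(entries):
    """Split scraped link texts into (raw addresses, name tags), keeping order."""
    addresses = [e for e in entries if ADDRESS_RE.fullmatch(e)]
    labels = [e for e in entries if not ADDRESS_RE.fullmatch(e)]
    return addresses, labels


# First five entries of the console output shown above.
sample = [
    "0x7b8c69a0f660cd43ef67948976daae77bc6a019b",
    "Binance 7",
    "0x5754284f345afc66a98fbb0a0afe71e0f007b949",
    "Binance",
    "Huobi 9",
]
addresses, labels = split_holders(sample)
print(addresses)
print(labels)
```

Tagged holders only show their label in the link text, so their addresses would have to come from another attribute of the link.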
    
DebanjanB
  • Hi, thank you, but I get this exception with the code: selenium.common.exceptions.NoSuchWindowException: Message: no such window. But the other answer works. – M Token Jul 14 '20 at 16:28
  • @MToken There was no window switching in my code, so you shouldn't see `NoSuchWindowException` anyway. Keep your libraries updated. – DebanjanB Jul 14 '20 at 16:36
  • What can I say... when I have driver.get(url) twice in my code, it also works with your solution... But if not: selenium.common.exceptions.NoSuchWindowException: Message: no such window – M Token Jul 14 '20 at 16:47
  • @MToken You haven't told us about your use case in detail; however, I suggest you refrain from using `--no-sandbox` and `--disable-dev-shm-using` unless necessary. Those are used in different circumstances. – DebanjanB Jul 14 '20 at 16:54