0

This is the HTML snippet I am trying to scrape using Selenium in Python.

I am trying to get the text Add to bag, which is inside a span with a data-bind attribute.

<div class="is-add-item-saving" data-bind="visible: isBusy" style="display: none;"></div>
<span class="aria-live" aria-role="status" aria-live="polite" data-bind="{ text: ariaLiveText }"></span>
<button data-bind="click: addToBag, css : buttonCss, attr: { 'aria-label': resources.pdp_cta_add_to_bag, disabled: isBusy }, markAndMeasure: 'pdp:add_to_bag_interactive'" data-test-id="add-button" aria-label="Add to bag">
    <span class="product-tick" data-bind="visible: showProductTick" style="display: none;"></span>
    <span data-bind="text: buttonText">Add to bag</span>

</button>

This is what I have tried so far.

instock_element = driver.find_elements_by_xpath("//span[contains(@data-bind,'text: buttonText')]")
instock_element = driver.find_elements_by_xpath("//*[contains(text(), 'Add to bag')]")

When I iterate over these instock_elements,

for value in instock_element:
    print("text : ", value.text)
    print(" id : ", value.id)
    if len(value.text) == 0:
        text = value.id
    else:
        print(value.text)
        text = value.text
        ins_list.append(text)

These are giving me random values like 6489355d-9dd3-4d77-a0d7-b134ce48fae7 but not the text Add to bag.

DebanjanB
  • What is your actual problem? This should succeed as you're using the right xpath to get that span (and its text). – DMart Jan 07 '21 at 15:18

5 Answers

0

Try this (in particular the xpath):

from lxml import html

sample = """<div class="is-add-item-saving" data-bind="visible: isBusy" style="display: none;"></div>
<span class="aria-live" aria-role="status" aria-live="polite" data-bind="{ text: ariaLiveText }"></span>
<button data-bind="click: addToBag, css : buttonCss, attr: { 'aria-label': resources.pdp_cta_add_to_bag, disabled: isBusy }, markAndMeasure: 'pdp:add_to_bag_interactive'" data-test-id="add-button" aria-label="Add to bag">
    <span class="product-tick" data-bind="visible: showProductTick" style="display: none;"></span>
    <span data-bind="text: buttonText">Add to bag</span>

</button>"""

print(html.fromstring(sample).xpath("//*[@data-bind='text: buttonText']/text()"))

Output:

['Add to bag']
baduker
0

https://www.selenium.dev/selenium/docs/api/py/webdriver_remote/selenium.webdriver.remote.webelement.html#module-selenium.webdriver.remote.webelement

id

Internal ID used by selenium.

This is mainly for internal use. Simple use cases such as checking if 2 webelements refer to the same element, can be done using ==:

if element1 == element2: print("These 2 are equal")

Use value.get_attribute("id") instead to get the element's HTML id attribute.

To get the text, use:

value.text

If that returns an empty string, use:

value.get_attribute("textContent")

because value.text retrieves only the text that is displayed in the UI.
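
The fallback described here could be wrapped in a small helper. This is only a sketch: element_text is a hypothetical name, not part of Selenium, and it assumes nothing beyond an object that exposes a .text property and a get_attribute() method, as Selenium's WebElement does.

```python
def element_text(element):
    """Return an element's text, preferring what is visible in the UI.

    `element` is assumed to behave like a Selenium WebElement: it has a
    `.text` property and a `get_attribute(name)` method.
    """
    # .text only contains text that is currently rendered on the page.
    text = element.text
    if text:
        return text.strip()

    # Hidden elements give an empty .text, so fall back to the DOM's
    # textContent. get_attribute may return None if nothing is found.
    content = element.get_attribute("textContent")
    return (content or "").strip()
```

Inside the loop from the question, text = element_text(value) would then replace the branch that falls back to value.id.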

PDHide
0

To print the text Add to bag you can use either of the following Locator Strategies:

  • Using css_selector and get_attribute("innerHTML"):

    print(driver.find_element_by_css_selector("button[data-test-id='add-button'][aria-label='Add to bag'] span").get_attribute("innerHTML"))
    
  • Using xpath and text attribute:

    print(driver.find_element_by_xpath("//button[@data-test-id='add-button' and @aria-label='Add to bag']//span").text)
    

Ideally, you should use WebDriverWait with visibility_of_element_located() so that the element is visible before you read its text. You can use either of the following Locator Strategies:

  • Using CSS_SELECTOR and text attribute:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "button[data-test-id='add-button'][aria-label='Add to bag'] span"))).text)
    
  • Using XPATH and get_attribute():

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//button[@data-test-id='add-button' and @aria-label='Add to bag']//span"))).get_attribute("innerHTML"))
    
  • Note : You have to add the following imports :

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python


DebanjanB
0

If you are having a hard time finding the right element, an easy way is to stop searching for all the elements that match an xpath. Instead, use the full xpath of the individual tag and then get its text using .text

Example: text = driver.find_element_by_xpath("full xpath of the element").text

-1

You may also use BeautifulSoup for this:

from bs4 import BeautifulSoup

html = """<div class="is-add-item-saving" data-bind="visible: isBusy" style="display: none;"></div>
<span class="aria-live" aria-role="status" aria-live="polite" data-bind="{ text: ariaLiveText }"></span>
<button data-bind="click: addToBag, css : buttonCss, attr: { 'aria-label': resources.pdp_cta_add_to_bag, disabled: isBusy }, markAndMeasure: 'pdp:add_to_bag_interactive'" data-test-id="add-button" aria-label="Add to bag">
    <span class="product-tick" data-bind="visible: showProductTick" style="display: none;"></span>
    <span data-bind="text: buttonText">Add to bag</span>
</button>"""

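# Name the parser explicitly; omitting it triggers a warning and can give different results across machines.
soup = BeautifulSoup(html, "html.parser")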

tag = soup.find('span',{'data-bind':'text: buttonText'})
print(tag.text)

Output

Add to bag
Sebastien D
  • No need to mix beautifulSoup and built in selenium functions imo. – DMart Jan 07 '21 at 15:15
  • Those are two different tools imo – Sebastien D Jan 08 '21 at 08:20
  • They are two different tools. But they share a general purpose: parsing a DOM. Selenium already does that with findElements & BS with find. I see BS more for when you're using requests module and need a tool to parse from text. – DMart Jan 08 '21 at 15:47
  • Selenium is not specifically a DOM parser, which BS is. The combined use of those two tools is absolutely relevant to me. – Sebastien D Jan 08 '21 at 16:28
  • I disagree. I would love to see an example where the use of the two tools makes sense. – DMart Jan 09 '21 at 20:34