I would like to extract a manager's bio ("John Reinsberg is Deputy Chairman of Lazard Asset Management responsible for oversight...") from this web page: https://www.morningstar.com/funds/xnas/lziex/people

See the picture for example

My code doesn't work because the content is in a pop-up window. From some existing questions, it seems that I need to use click() and then find the element in the pop-up. However, I do not know how to locate the element to click. Thanks.

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument('headless')
driver = webdriver.Chrome(options=options)
driver.get('https://www.morningstar.com/funds/xnas/lziex/people')
element=driver.find_elements_by_xpath('//*[@class="sal-modal-biography ng-binding ng-scope"]')
print(element.text) 

I also tried the following, but it didn't work either:

element =  driver.find_element_by_xpath("//button[@class='sal-icons sal-icons--close mds-button mds-button--icon-only']")
driver.execute_script('arguments[0].click();',element)

driver.switch_to_alert()
print(driver.find_elements_by_xpath('//*[@class="sal-modal-biography ng-binding ng-scope"]'))

Here is part of the HTML.

<div class="sal-component-ctn sal-modal-scrollable" style="display: block;" aria-hidden="true"><div class="sal-component-mip-manager-pop-out reveal-modal mds-modal ng-isolate-scope open" data-reveal="" manager-data="vm.managerData" style="display: block; opacity: 1; visibility: visible; top: 335.333px;" tabindex="0" aria-hidden="false">
    <div class="sal-row">
        <div class="sal-manager-modal">
            <div class="sal-manager-modal__modalHeader" ng-class="{'sal-fixed':vm.fixedHeader}" ng-style="vm.headerStyle" style="height: auto; width: auto;">
                <span class="sal-modal-header__menu">
                    <button class="sal-icons sal-icons--close mds-button mds-button--icon-only" type="button">
                        <svg class="mds-icon mds-button__icon mds-button__icon--left">
                            <use xlink:href="#remove">
                                <title class="ng-binding">Close</title>
                            </use>
                        </svg>
                    </button>
                </span>
                <div class="sal-modal-header__title ng-binding">
                    John R. Reinsberg
                </div>
            </div>
            <div class="sal-manager-modal__body" ng-style="{'margin-top': vm.headerStyle.height}" style="margin-top: auto;">
                <div class="sal-modal-dps">
                    <ul class="sal-xsmall-block-grid-2 small-block-grid-3 medium-block-grid-5 large-block-grid-5">
                                      </ul>
                </div>
                <!-- ngIf: vm.managerModalData.fundManager.biography.managerProvidedBiography || (vm.managerModalData.fundManager.CollegeEducationDetailList && vm.managerModalData.fundManager.CollegeEducationDetailList.length > 0) --><div class="sal-columns sal-small-12 sal-medium-6 sal-large-6 ng-scope" ng-if="vm.managerModalData.fundManager.biography.managerProvidedBiography || (vm.managerModalData.fundManager.CollegeEducationDetailList &amp;&amp; vm.managerModalData.fundManager.CollegeEducationDetailList.length > 0)" ng-class="{'sal-medium-12 sal-large-12': !vm.managerModalData.currentManagedFundList || vm.managerModalData.currentManagedFundList.length === 0}">
                    <!-- ngIf: vm.managerModalData.fundManager.biography.managerProvidedBiography --><div class="sal-modal-biography ng-binding ng-scope" ng-if="vm.managerModalData.fundManager.biography.managerProvidedBiography">
                        <!-- ngIf: !vm.managerModalData.fundManager.biography.isLocalized -->
                        John Reinsberg is Deputy Chairman of Lazard Asset Management responsible for oversight of the firm's international and global strategies. He is also a Portfolio Manager/Analyst on the Global Equity and International Equity portfolio teams. He began working in the investment field in 1981. Prior to joining Lazard in 1992, John was Executive Vice President with General Electric Investment Corporation and Trustee of the General Electric Pension Trust.
                    </div><!-- end ngIf: vm.managerModalData.fundManager.biography.managerProvidedBiography -->

                    </div>
                </div>
            </div>
        </div>
    </div>
</div></div>
DebanjanB
AlanZ

1 Answer

To extract the bio ("John Reinsberg is Deputy Chairman of Lazard Asset Management responsible for oversight...") from the web page https://www.morningstar.com/funds/xnas/lziex/people, first click the manager's name to open the pop-up, then read the biography element. Induce WebDriverWait for element_to_be_clickable() before the click and for visibility_of_element_located() before reading the text. You can use the following Locator Strategies:

  • Code Block:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    options = webdriver.ChromeOptions()
    options.add_argument("start-maximized")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
    driver.get("https://www.morningstar.com/funds/xnas/lziex/people")
    # Click the manager's name to open the biography pop-up
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='sal-management-team__memberName']/a//span[text()='Reinsberg']/.."))).click()
    # Wait until the biography inside the pop-up is visible, then print its text
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.sal-modal-biography.ng-binding.ng-scope"))).text.strip())
    
  • Console Output:

    John Reinsberg is Deputy Chairman of Lazard Asset Management responsible for oversight of the firm's international and global strategies. He is also a Portfolio Manager/Analyst on the Global Equity and International Equity portfolio teams. He began working in the investment field in 1981. Prior to joining Lazard in 1992, John was Executive Vice President with General Electric Investment Corporation and Trustee of the General Electric Pension Trust.
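
If you need the bio of more than one manager, the same two waits can be wrapped in a small helper. This is only a sketch: the function names `manager_link_xpath` and `fetch_manager_bio` are hypothetical, and it assumes that every manager link on the page follows the same markup as the one for Reinsberg and that the pop-up always contains a `div.sal-modal-biography` element, neither of which has been verified for other managers.

```python
def manager_link_xpath(surname):
    """Build the XPath used above, with the surname as a parameter.

    Assumes the surname contains no single quote, which would break the
    quoted XPath string literal.
    """
    return (
        "//div[@class='sal-management-team__memberName']"
        f"/a//span[text()='{surname}']/.."
    )


def fetch_manager_bio(driver, surname, timeout=20):
    """Click the manager's name, wait for the pop-up, and return the bio text."""
    # Imported here so the XPath helper above can be used without Selenium.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    # Open the pop-up by clicking the link that wraps the surname span.
    wait.until(
        EC.element_to_be_clickable((By.XPATH, manager_link_xpath(surname)))
    ).click()
    # Read the biography once it is visible inside the pop-up.
    bio = wait.until(
        EC.visibility_of_element_located(
            (By.CSS_SELECTOR, "div.sal-modal-biography")
        )
    )
    return bio.text.strip()
```

With the `driver` from the code block above, `print(fetch_manager_bio(driver, "Reinsberg"))` should be equivalent to the last two lines of that block. The CSS selector is shortened to the one class that identifies the biography, since `ng-binding` and `ng-scope` are added by AngularJS and are not specific to this element.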
    
DebanjanB
  • Thank you so much! I tried the code but the window went to an advertisement page with "continue to website" on top right. Python also gave the following error: ElementClickInterceptedException: element click intercepted: Element ... is not clickable at point (62, 78). Other element would receive the click:
    ...
    (Session info: chrome=79.0.3945.79)
    – AlanZ Dec 19 '19 at 11:43
  • @AlanZ Unfortunately, I don't find any element identified as `...` within the HTML of https://www.morningstar.com/funds/xnas/lziex/people, not even any element with the class `mdc-intro-ad`
    – DebanjanB Dec 19 '19 at 11:48
  • I think I probably need to add an extension like AdBlock. Do you know how to add the argument to use an extension? – AlanZ Dec 19 '19 at 11:56
  • @AlanZ That should be a separate question with an altogether different answer :) – DebanjanB Dec 19 '19 at 11:57
  • It always goes to the ad (the URL doesn't change). How can I avoid this ad? – AlanZ Dec 19 '19 at 12:45
  • The ad is not visible/reproducible at my end. However your question wasn't related to **ad** exactly but related to _...I need to use click() and then find element from the window..._ – DebanjanB Dec 19 '19 at 12:47
  • I changed VPN and the code finally worked out. Thanks so much! May I have a follow-up question? Can you please help me find the bio of another guy "Herbert W. Gullquist" from "Manager Timeline"? I tried to revise your code but failed. – AlanZ Dec 19 '19 at 15:27
  • @AlanZ Feel free to raise a new question as per your new requirement. StackOverflow volunteers will be happy to help you out. – DebanjanB Dec 19 '19 at 15:30
  • Thank you so much for your time and being extremely helpful!!! I just raised a new question as you suggested. – AlanZ Dec 20 '19 at 03:32
  • @AlanZ [Upvote](https://stackoverflow.com/help/why-vote) the answer if this/any answer is/was helpful to you for the benefit of the future readers. – DebanjanB Dec 20 '19 at 07:17