I am learning web scraping using a test website created for that purpose, and everything works fine so far. For example, given this HTML:

<small class="author" itemprop="author">Albert Einstein</small>

I issue the following command and get True:

PS C:\Users\user> $HTML -match '<span class="text" itemprop="text">.*</span>'
True

And I also get True when trying to match text within tags from this HTML:

<a href="/" style="text-decoration: none">Quotes to Scrape</a>

Here's the command I used:

PS C:\Users\user> $HTML -match '<a href=".*">.*</a>'
True

However, when I try to match the inner text between tags from another website using seemingly similar logic, I get False. The HTML looks like this:

<span class="LdapSubMenu">
    <a href="certainHyperLink">
            CertainClientName
         </a>
</span>

And the command I issued:

PS C:\Users\user> $HTML -match '<span class="LdapSubMenu"><a href=".*">.*</a></span>'
False

Given that the tests above returned True, why does my attempt to extract text from the "LdapSubMenu" class fail? Does it have something to do with how the elements are nested?
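
A minimal sketch of one likely cause, assuming `$HTML` holds the markup as a single multi-line string (the here-string and the `$pattern` variable below are illustrative, not taken from the real page). The failing pattern allows no whitespace between the tags, whereas the markup has line breaks and indentation there, and `.` does not match a newline by default in .NET regular expressions:

```powershell
# Assumption: $HTML is one multi-line string containing the snippet from the question.
$HTML = @'
<span class="LdapSubMenu">
    <a href="certainHyperLink">
            CertainClientName
         </a>
</span>
'@

# Original pattern: requires '<a' to follow '>' immediately, so the newline breaks the match.
$original = $HTML -match '<span class="LdapSubMenu"><a href=".*">.*</a></span>'
$original   # False

# \s* tolerates whitespace and newlines between tags,
# (?s) lets '.' match across lines, and .*? keeps the capture non-greedy.
$pattern = '(?s)<span class="LdapSubMenu">\s*<a href="[^"]*">\s*(.*?)\s*</a>\s*</span>'
$clientName = $null
if ($HTML -match $pattern) {
    $clientName = $Matches[1]
}
$clientName   # CertainClientName
```

The two earlier tests succeed because their tags and text sit on a single line, so nesting itself is probably not the issue. If `$HTML` is instead an array of lines (for example from `Get-Content` without `-Raw`), `-match` is applied to each line separately and a pattern that spans several lines cannot match at all.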

Thank you.
