I have the following code that grabs a div element:

    For Each ele As HtmlElement In WebBrowser1.Document.GetElementsByTagName("div")
        If ele.GetAttribute("className").Contains("description") Then
            Dim content As String = ele.InnerHtml
            If content.Contains("http://myserver.com/image/check.png") Then
                'Do stuff if image exists
            Else
                'Do stuff if image doesn't exist
            End If
        End If
    Next

The div element looks like this:

<DIV class=headline><SPAN class=blue-title-lg>TITLE_HERE</SPAN>&nbsp;&nbsp;&nbsp;&nbsp;LOCATION1_HERE,&nbsp;LOCATION2_HERE</DIV>DESCRIPTION_HERE<BR>
<DIV class=about><A class=link href="viewprofile.aspx?profile_id=00000000">USERNAME</A>&nbsp;20&nbsp;&nbsp;&nbsp;&nbsp;FSM - Friends&nbsp;&nbsp;&nbsp;<FONT color=green>Online Today</FONT></DIV>

When the tick image doesn't exist, I want to grab the URL from the href of this link:

<a class=link href="viewprofile.aspx?profile_id=00000000"></a>

and put it into a string. This is where I've hit a brick wall and I need some help. I'd imagine a regex solution would resolve my issue, but regex is one of my weak spots. Can someone put me out of my misery?

user3122456
  • You should be using HTML Agility Pack. – Daniel A. White Feb 23 '14 at 01:25
  • possible duplicate of [RegEx match open tags except XHTML self-contained tags](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – John Saunders Feb 23 '14 at 01:28
  • Hi John, had a look at the link you provided, can you please kindly explain why you think it's a duplicate? – user3122456 Feb 23 '14 at 01:36
  • People see the words regex and HTML in one place and instantly begin pontificating. I will take a look at this in about an hour and try to answer it. – Vasili Syrakis Feb 23 '14 at 03:41
  • You've adequately described what you want to accomplish, but you have not told us what part you are having trouble implementing. Do you need help getting the URL from the html element? Do you need help downloading the content? – Sam Axe Feb 23 '14 at 07:22
  • Hi Dan-o, the bit I'm having trouble with is grabbing the viewprofile.aspx?profile_id=00000000 link from the div element. Thanks in advance! – user3122456 Feb 23 '14 at 10:42
  • [This is tricky](http://stackoverflow.com/a/4234491/471272). – tchrist Jun 06 '14 at 22:47

1 Answer

Solved it!

I slept on it and came up with a really simple way of solving it. The UI of my app now looks like a mess, but I'll sort that later. I have the information I need.

Here's how I did it:

    ' Collect every distinct profile link on the page.
    For Each CurElement As HtmlElement In WebBrowser1.Document.GetElementsByTagName("a")
        Dim linkunverified As String = CurElement.GetAttribute("href")
        If linkunverified.Contains("viewprofile.aspx") AndAlso
           Not ListBox1.Items.Contains(linkunverified) Then
            ListBox1.Items.Add(linkunverified)
        End If
    Next

    ' For each description div without the check image, keep the links it contains.
    For Each ele As HtmlElement In WebBrowser1.Document.GetElementsByTagName("div")
        If ele.GetAttribute("className").Contains("description") Then
            Dim content As String = ele.InnerHtml
            If Not content.Contains("http://pics.myserver.com/image/check.png") Then
                For i As Integer = 0 To ListBox1.Items.Count - 1
                    ' Remove(0, 24) drops the first 24 characters, presumably the
                    ' "http://www.myserver.com/" prefix, leaving the relative path
                    ' as it appears in the div's HTML.
                    Dim relativeLink As String = ListBox1.Items(i).ToString().Remove(0, 24)
                    If content.Contains(relativeLink) Then
                        ListBox2.Items.Add("http://www.myserver.com/" & relativeLink)
                    End If
                Next
            End If
        End If
    Next
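
For comparison, here is a rough, untested sketch of a more direct route. `HtmlElement` has its own `GetElementsByTagName`, so the anchors can be looked up inside each matching div and the two list boxes are not needed. The module name, the function name `GetUnverifiedProfileLinks` and its parameters are made up for illustration; the class name, the image URL and the `viewprofile.aspx` check are taken from the code above.

```vbnet
Imports System.Collections.Generic
Imports System.Windows.Forms

Module ProfileLinkSketch
    ' Hypothetical helper: returns the distinct profile links found inside
    ' "description" divs that do NOT contain the check image.
    Function GetUnverifiedProfileLinks(doc As HtmlDocument,
                                       checkImageUrl As String) As List(Of String)
        Dim links As New List(Of String)

        For Each divElement As HtmlElement In doc.GetElementsByTagName("div")
            ' Only look at divs whose class contains "description".
            If Not divElement.GetAttribute("className").Contains("description") Then
                Continue For
            End If

            ' InnerHtml can be Nothing for an empty element, so fall back to "".
            Dim content As String = If(divElement.InnerHtml, "")
            If content.Contains(checkImageUrl) Then Continue For

            ' Search only the anchors nested inside this div.
            For Each anchor As HtmlElement In divElement.GetElementsByTagName("a")
                Dim href As String = anchor.GetAttribute("href")
                If href.Contains("viewprofile.aspx") AndAlso Not links.Contains(href) Then
                    links.Add(href)
                End If
            Next
        Next

        Return links
    End Function
End Module
```

It would be called as `GetUnverifiedProfileLinks(WebBrowser1.Document, "http://pics.myserver.com/image/check.png")` once the document has finished loading. I am assuming the `href` values come back in the same form as in the loop above, so no prefix stripping should be needed, but check that against your page.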
user3122456