2

First of all: Sorry for my bad English!

I know the title isn't the best English, but I don't really know how to phrase this question...
What I'm trying to do is read an HTML source line by line so that, when a line contains a given word (like http://), I can copy the entire line, strip the rest, and keep only the URL.

This is what I've tried:

using (var source = new StreamReader(TempFile))
{
    string line;
    while ((line = source.ReadLine()) != null)
    {
        if (line.Contains("http://"))
        {
            Console.WriteLine(line);
        }
    }
}

This works perfectly when I read from an external file, but it doesn't work when I want to read a string or a StringBuilder. How do you read those line by line?

puretppc
Yuki Kutsuya

5 Answers

6

You can use new StringReader(theString) to do that with a string, but I question your overall strategy. That would be better done with a tool like HTML Agility Pack.

For example, here is HTML Agility Pack extracting all hyperlinks:

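// Requires the HtmlAgilityPack package (using HtmlAgilityPack;)
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(theString);
// SelectNodes returns null when nothing matches, so guard against that
var links = doc.DocumentNode.SelectNodes("//a[@href]");
if (links != null)
{
    foreach (HtmlNode link in links)
    {
        HtmlAttribute att = link.Attributes["href"];
        Console.WriteLine(att.Value);
    }
}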
Marc Gravell
0

Well, a string is just a string; it doesn't have any lines of its own.

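You can use something like String.Split to separate it on the line-break characters. Splitting on \r alone leaves a stray \n at the start of each line when the text uses Windows (\r\n) line endings, so it is safer to split on all three variants.

MSDN: String.Split()

string words = "This is a list of words, with: a bit of punctuation" +
               "\rand a newline character.";

// Split on Windows (\r\n), classic Mac (\r), and Unix (\n) line endings
string[] split = words.Split(new[] { "\r\n", "\r", "\n" }, StringSplitOptions.None);

foreach (string s in split)
{
    // Skip lines that are empty or whitespace only
    if (s.Trim() != "")
        Console.WriteLine(s);
}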
Only Bolivian Here
0

Firstly, you can use a StringReader.
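As a minimal sketch of the StringReader approach, the loop from the question can be kept almost unchanged, because StringReader offers the same ReadLine() method as StreamReader. The class name LineScanner, the helper LinesContaining, and the sample HTML are made up for this illustration; for a StringBuilder, pass the result of ToString().

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineScanner
{
    // Hypothetical helper: returns every line of 'text' that contains 'needle'.
    public static List<string> LinesContaining(string text, string needle)
    {
        var matches = new List<string>();
        // StringReader reads from an in-memory string instead of a file,
        // and ReadLine() handles \r, \n and \r\n line endings.
        using (var reader = new StringReader(text))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Contains(needle))
                    matches.Add(line);
            }
        }
        return matches;
    }

    public static void Main()
    {
        // Sample input invented for this example.
        string html = "<p>intro</p>\r\n<a href=\"http://example.com\">link</a>\n<p>end</p>";
        foreach (string line in LinesContaining(html, "http://"))
            Console.WriteLine(line);
    }
}
```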

Another option is to create a MemoryStream from the string by first converting the string to a byte array, as described in https://stackoverflow.com/a/10380166/396583
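A rough sketch of the MemoryStream route, in case the existing StreamReader code has to stay as it is. It assumes UTF-8 is an acceptable encoding for the round trip; the class name MemoryStreamLines and the helper CountLinesContaining are hypothetical names used only here.

```csharp
using System;
using System.IO;
using System.Text;

public static class MemoryStreamLines
{
    // Hypothetical helper: counts the lines of 'text' that contain 'needle'.
    public static int CountLinesContaining(string text, string needle)
    {
        // Assumption: UTF-8 is used both to encode and to decode the text.
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        int count = 0;
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Contains(needle))
                    count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        // Sample input invented for this example.
        string html = "<a href=\"http://a.example\">a</a>\n<p>text</p>\n<a href=\"http://b.example\">b</a>";
        Console.WriteLine(CountLinesContaining(html, "http://"));
    }
}
```

Compared with StringReader, this copies the whole string into a byte array first, so it mainly makes sense when an API insists on a Stream.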

Community
vines
0

I think you can tokenize the input and check each entry for the required content.

// Split on spaces; "http://" must be a string literal (double quotes), not a char literal
string[] info = myStringBuilder.ToString().Split(' ');
foreach (var item in info)
{
    if (item.Contains("http://"))
    {
        // work with it
    }
}
vishakvkt
0

You can use a memory stream to read from.

SargeATM