27

How do I match and replace text using regular expressions in multiline mode?

I know the RegexOptions.Multiline option, but what is the best way in C# to match everything, including newline characters?

Input:

<tag name="abc">this
is
a
text</tag>

Output:

[tag name="abc"]this
is
a
text
[/tag]

Aahh, I found the actual problem. The '&' and ';' in my regex matched when the text was on a single line, but they need to be escaped in the regex for it to work when the text also contains newlines.

Peter Mortensen
Priyank Bolia

2 Answers

77

If you mean there has to be a newline character for the expression to match, then \n will do that for you.

Otherwise, I think you might have misunderstood the Multiline/Singleline flags. If you want your expression to match across several lines, you actually want to use RegexOptions.Singleline. It treats the entire input string as a single line: the . metacharacter matches every character, including \n. RegexOptions.Multiline, by contrast, only changes ^ and $ so that they match at the start and end of each line. Is this what you're after?

Example

Regex rx = new Regex("<tag name=\"(.*?)\">(.*?)</tag>", RegexOptions.Singleline);
String output = rx.Replace("Text <tag name=\"abc\">test\nwith\nnewline</tag> more text...", "[tag name=\"$1\"]$2[/tag]");
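To make the difference between the two flags concrete, here is a minimal sketch that counts matches of the same pattern under each option. The class name, the CountMatches helper, and the sample input are made up for illustration and are not from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineVsMultiline
{
    // Hypothetical helper: counts how often a pattern matches under the given options.
    public static int CountMatches(string input, string pattern, RegexOptions options)
    {
        return Regex.Matches(input, pattern, options).Count;
    }

    public static void Main()
    {
        string input = "first\nsecond\nthird";

        // Multiline: ^ and $ match at every line boundary, so each line matches on its own.
        Console.WriteLine(CountMatches(input, "^.+$", RegexOptions.Multiline));  // 3

        // Singleline: . also matches \n, so a single match spans the whole string.
        Console.WriteLine(CountMatches(input, "^.+$", RegexOptions.Singleline)); // 1

        // No options: . stops at \n and $ only matches at the very end, so nothing matches.
        Console.WriteLine(CountMatches(input, "^.+$", RegexOptions.None));       // 0
    }
}
```

The two options are independent and can be combined with | when both behaviours are needed.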
Ian Kemp
David Hedlund
  • I want to match some text that may contain new line characters also, but not necessarily. Now, if you say with RegexOptions.Singleline, then can you show me an example on how to do this. – Priyank Bolia Nov 22 '09 at 21:23
  • see my edit for an example. I seem to have been mistaken about it being default tho, when I tested it... manually specifying singleline as in the example, makes the example work, tho – David Hedlund Nov 22 '09 at 21:40
  • thanks, it looks like the problem was different, but thanks for clearing my doubts about multiline mode. – Priyank Bolia Nov 22 '09 at 21:48
  • Ah! I was banging my head after using the .MultiLine enum value! It wasn't matching over multiple lines that way :DOH: Thanks very much for the explanation! Guess I should have read the doc tooltip in VS first . . . – Kirtan Dec 24 '09 at 11:39
14

Here's a regex to match. It requires the RegexOptions.Singleline option, which makes the . match newlines.

<(\w+) name="([^"]*)">(.*?)</\1>

After a match, the first group contains the tag (the element name), the second the value of the name attribute, and the third the content between the tags. So the replacement string could look like this:

[$1 name="$2"]$3[/$1]

In C#, this looks like:

newString = Regex.Replace(oldString, 
    @"<(\w+) name=""([^""]*)"">(.*?)</\1>", 
    "[$1 name=\"$2\"]$3[/$1]", 
    RegexOptions.Singleline);
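As a quick check, the call above can be wrapped in a small program and run against the input from the question. This is only a sketch: the TagConverter class and ConvertTags method are hypothetical names, and it assumes the tags are not nested.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagConverter
{
    // Hypothetical wrapper; the pattern and replacement string are the ones shown above.
    public static string ConvertTags(string input)
    {
        return Regex.Replace(input,
            @"<(\w+) name=""([^""]*)"">(.*?)</\1>",
            "[$1 name=\"$2\"]$3[/$1]",
            RegexOptions.Singleline); // lets (.*?) run across line breaks
    }

    public static void Main()
    {
        // The multi-line input from the question, with \n as the line separator.
        string input = "<tag name=\"abc\">this\nis\na\ntext</tag>";

        // Prints the same four lines with the angle brackets replaced by square brackets.
        Console.WriteLine(ConvertTags(input));
    }
}
```

Because (.*?) is lazy, the match ends at the first closing tag with the same name, which is why nested tags of the same name are not handled.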
Andomar
  • +1: Very good code and explanation. @Priyank Bolia: Just remember that this works only if <tag>s can't be nested. If they *can* be, then regular expressions will fail you. – Tim Pietzcker Nov 22 '09 at 21:40
  • Thanks for the excellent example. I figured out it was some other issue, though. – Priyank Bolia Nov 22 '09 at 21:49