0

I wrote a small C# program to capture in-game text. My issue is that the text also contains color codes, which I want to strip out. I read about the function Regex.Replace, which I think will suit this.

I have the following string (line) that I want to clean up. I used the small tool Expresso to play around with regular expressions, but I never really figured it out.

This is the string I am working with:

|c001177ffSave Code =|r |cff00AA00A|cff00AA00G|cff00AA00Q|cffff69b4g|r |cff00AA00R|cff40e0d09|cffffff00$|cffffff00#|r |cff40e0d04|cffff69b4f|cff00AA00R

I tried to use ^|( [a-zA-Z0-9]{9})

which gave me these matches: c001177ff cff00AA00 cff00AA00 cff00AA00 cffff69b4 cff00AA00 cff40e0d0 cffffff00 cffffff00 cff40e0d0 cffff69b4 cff00AA00

I am not good at regex; in fact, I have only just started with it. I don't expect anybody to present me with a complete solution (though you are more than welcome to), just a little help on how I can solve this. I want to filter the text.

Input text:

 |c001177ffSave Code =|r |cff00AA00A|cff00AA00G|cff00AA00Q|cffff69b4g|r |cff00AA00R|cff40e0d09|cffffff00$|cffffff00#|r |cff40e0d04|cffff69b4f|cff00AA00R

should be filtered to this:

Save Code = AGQg R9$# 4fR

I think these are hexadecimal color codes: the |c marks the beginning and the |r the end of a colored string. I think the "|r |" just indicates that the first colored string ends, then we get a space, and the next | marks the next start.

Robert Harvey
Aviator
    does `string.Split('|')` + a few lines of code not work? Do you have to use Regex? – L.B Aug 07 '14 at 16:44

7 Answers

2

How about some simple LINQ?

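// Requires using System.Linq. A segment of length 10 is a 9-character color code plus one
// visible character, which is kept; every other segment (including "c001177ffSave Code =")
// becomes a space, so the result is "AGQg R9$# 4fR".
var output = String.Join("", input.Split('|')
                             .Select(s => s.Length != 10 ? ' ' : s.Last()))
             .Trim();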
EZI
1

So I think the problem you were having was not escaping your |. The following regex works for me:

var replaced = Regex.Replace(input, @"\|c[0-9a-zA-Z]{8}|\|r", "");
  • \|c[0-9a-zA-Z]{8} - match starting with "|c" and then any 8 letters or numbers
  • | - or
  • \|r - match "|r"
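
As a minimal, self-contained sketch of the pattern above applied to the string from the question (the class name `ColorCodeStripper` and the method `Strip` are made up for illustration and are not part of any library):

```csharp
using System;
using System.Text.RegularExpressions;

public static class ColorCodeStripper
{
    // "|c" followed by 8 letters or digits, or the "|r" reset marker.
    private static readonly Regex ColorCode = new Regex(@"\|c[0-9a-zA-Z]{8}|\|r");

    // Hypothetical helper: returns the input with every color marker removed.
    public static string Strip(string input)
    {
        return ColorCode.Replace(input, "");
    }

    public static void Main()
    {
        string input = "|c001177ffSave Code =|r |cff00AA00A|cff00AA00G|cff00AA00Q|cffff69b4g|r "
                     + "|cff00AA00R|cff40e0d09|cffffff00$|cffffff00#|r |cff40e0d04|cffff69b4f|cff00AA00R";

        Console.WriteLine(Strip(input)); // prints: Save Code = AGQg R9$# 4fR
    }
}
```

Keeping the `Regex` in a static field avoids re-parsing the pattern on every captured line, which may matter if the text is processed continuously.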
dav_i
  • This definitely was one of the problems. Thank you, I'll keep in mind that I have to escape | when I search for it – Aviator Aug 07 '14 at 16:55
1

You're on the right track. Your regex

^|( [a-zA-Z0-9]{9})

has two problems: the ^ start-of-line anchor ties the match to the start of your input string, and the | needs to be escaped, because unescaped it is the alternation ("or") operator, which completely changes the meaning of your regex.

In addition, the space after the | is undesired, and the capture group is unnecessary, since you only want to eliminate this portion.

If you replace all instances of this

\|[a-zA-Z0-9]{9}

with nothing (the empty string), you will achieve most of your goal. Try it here: http://regex101.com/r/rF6yB6/1

But it seems you really want to eliminate not just exactly nine characters after the pipe, but anywhere from one to nine characters. So use the {1,9} range quantifier instead:

\|[a-zA-Z0-9]{1,9}

Try it: http://regex101.com/r/rF6yB6/2

This seems to achieve your goal exactly.
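
To make the difference between the two quantifiers concrete, here is a small sketch that runs both patterns, using the character class [a-zA-Z0-9], over the string from the question. The class name `QuantifierDemo` and the helper `Strip` are hypothetical names chosen for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Hypothetical helper: removes every match of the given pattern.
    public static string Strip(string input, string pattern)
    {
        return Regex.Replace(input, pattern, "");
    }

    public static void Main()
    {
        string input = "|c001177ffSave Code =|r |cff00AA00A|cff00AA00G|cff00AA00Q|cffff69b4g|r "
                     + "|cff00AA00R|cff40e0d09|cffffff00$|cffffff00#|r |cff40e0d04|cffff69b4f|cff00AA00R";

        // Exactly nine characters: the color codes go, but "|r" has only one character and stays.
        Console.WriteLine(Strip(input, @"\|[a-zA-Z0-9]{9}"));
        // prints: Save Code =|r AGQg|r R9$#|r 4fR

        // One to nine characters: "|r" is removed as well.
        Console.WriteLine(Strip(input, @"\|[a-zA-Z0-9]{1,9}"));
        // prints: Save Code = AGQg R9$# 4fR
    }
}
```

One caveat with the {1,9} form: it relies on every |r being followed by a non-alphanumeric character, as it is in this input. If visible text followed |r directly (for example "|rDone"), that text would be consumed too.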


Please consider bookmarking the Stack Overflow Regular Expressions FAQ for future reference.

Community
aliteralmind
0
string input = "[The example input from your question]";
string output = input.Replace("|r", "");
while (output.Contains("|c"))
    output = output.Remove(output.IndexOf("|c"), 10);
// output = "Save Code = AGQg R9$# 4fR"

I like this much more than using regexes, simply because it is so much clearer to me.

Jashaszun
0
// Requires: using System.Text.RegularExpressions;
var str1 = "|c001177ffSave Code =|r |cff00AA00A|cff00AA00G|cff00AA00Q|cffff69b4g|r |cff00AA00R|cff40e0d09|cffffff00$|cffffff00#|r |cff40e0d04|cffff69b4f|cff00AA00R";
// Remove "|r" or a pipe followed by nine letters/digits.
var str2 = Regex.Replace(str1, @"\|(r|[a-zA-Z0-9]{9})", ""); // "Save Code = AGQg R9$# 4fR"
Mark Cidade
0

In addition to this answer re: escaping the "pipe" character, you're starting your regex with the caret (^) character. This matches the beginning of a line.

A correct regex for the color codes would be:

\|c[0-9a-zA-Z]{8}
Community
Harrison Paine
0

This regex should match all of the characters you want to remove:

([|]c([0-9]|[a-f]|[A-F]){8})|[|]r

Here's the breakdown...

The vertical pipe is an OR marker, so to search for it, place it in square brackets [ and ].

The parentheses make a group. So you're searching for ([|]c([0-9]|[a-f]|[A-F]){8}) OR [|]r, which is all of your color codes OR |r.

Breaking down the color codes: the group begins with |c and is followed by exactly 8 characters, each of which can be 0 through 9, a through f, or A through F.

I tested it at RegexPal.com.
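
Restricting the class to hexadecimal digits matters only when ordinary text happens to look like a color code. The sketch below contrasts the pattern above with a letters-and-digits variant on a made-up input; the input string, the class name `HexOnlyDemo`, and the helper `Strip` are assumptions for illustration and do not come from the game.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HexOnlyDemo
{
    // The pattern from this answer: "|c" plus exactly 8 hex digits, or "|r".
    public const string HexOnly = @"([|]c([0-9]|[a-f]|[A-F]){8})|[|]r";

    // A looser variant that accepts any 8 letters or digits after "|c".
    public const string AnyAlphanumeric = @"\|c[0-9a-zA-Z]{8}|\|r";

    // Hypothetical helper: removes every match of the given pattern.
    public static string Strip(string input, string pattern)
    {
        return Regex.Replace(input, pattern, "");
    }

    public static void Main()
    {
        // Made-up input: "|continue1" is not a color code, because 'o' is not a hex digit.
        string tricky = "Press |continue1 now|r";

        Console.WriteLine(Strip(tricky, HexOnly));         // prints: Press |continue1 now
        Console.WriteLine(Strip(tricky, AnyAlphanumeric)); // prints: Press  now
    }
}
```

On the string from the question both patterns give the same result, since every code there consists of hex digits.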

DeadZone