6

Using regular expressions, how can I extract all text inside double quotes, and all words outside the quotes, from a string like this:

01AB "SET 001" IN SET "BACK" 09SS 76 "01 IN" SET

The first regular expression should extract all text inside double quotes, like:

SET 001
BACK
01 IN

The second expression should extract all other words in the string:

01AB
IN
SET
09SS
76
SET

For the first case, "(.*?)" works fine. How can I extract all the words outside the quotes?

mbigun
  • Check this [link](http://stackoverflow.com/questions/9133220/regex-matches-c-sharp-double-quotes); it's almost the same as yours – andy Sep 22 '12 at 10:55

5 Answers

5

Try this expression:

(?:^|")([^"]*)(?:$|")

Group 1 of each match holds a run of text outside the quotes. The captured text excludes the quotation marks, because they are matched inside non-capturing groups, written (?: ... ). Of course, you need to escape the double quotes when using the pattern in C# code.

If the target string starts and/or ends in a quoted value, this expression will match empty groups as well (for the initial and for the trailing quote).
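
As a minimal sketch of how this expression could be used to get individual words (the class and method names `UnquotedWords.Extract` are hypothetical, and splitting on single spaces is an assumption about the input), group 1 of each match can be split further:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UnquotedWords
{
    // Hypothetical helper, not part of the answer above.
    public static List<string> Extract(string input)
    {
        var words = new List<string>();
        // Group 1 holds each run of text found between quoted values.
        foreach (Match m in Regex.Matches(input, "(?:^|\")([^\"]*)(?:$|\")"))
        {
            // Splitting on spaces and dropping empty entries also discards
            // the empty groups produced by a leading or trailing quote.
            words.AddRange(m.Groups[1].Value.Split(
                new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
        }
        return words;
    }
}
```

Calling `UnquotedWords.Extract` on the string from the question should yield the six unquoted words in their original order.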

Sergey Kalinichenko
4

Try this regex:

\"[^\"]*\"

Use Regex.Matches to get the texts in double quotes, and Regex.Split with the same pattern to get everything else:

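var strInput = "01AB \"SET 001\" IN SET \"BACK\" 09SS 76 \"01 IN\" SET";
// Split returns the unquoted segments, which still need splitting on whitespace to get single words.
var otherWords = Regex.Split(strInput, "\"[^\"]*\"");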
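
A possible way to put both halves together is sketched below. The class and method names are hypothetical, the capturing group is an addition so the quotes can be dropped from the results, and splitting segments on single spaces is an assumption:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuoteSplitter
{
    // Texts inside double quotes; the added group 1 excludes the quotes.
    public static string[] QuotedTexts(string input) =>
        Regex.Matches(input, "\"([^\"]*)\"")
             .Cast<Match>()
             .Select(m => m.Groups[1].Value)
             .ToArray();

    // Words outside the quotes: split on quoted texts, then on spaces.
    public static string[] OtherWords(string input) =>
        Regex.Split(input, "\"[^\"]*\"")
             .SelectMany(s => s.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
             .ToArray();
}
```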
Ria
2

You could try replacing the quoted text with an empty string, like this:

Regex r = new Regex("\".*?\"", RegexOptions.CultureInvariant | RegexOptions.Compiled | RegexOptions.Singleline);
string p = "01AB \"SET 001\" IN SET \"BACK\" 09SS 76 \"01 IN\" SET";

// Remove the quoted parts, then collapse the double spaces left behind.
Console.Write(r.Replace(p, "").Replace("  ", " "));
Agent007
1

You need to negate the pattern from your first expression, for example with a negative lookahead:

(?!pattern)

Check out this link.
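
The linked page is not reproduced here, so the following is only one possible reading of this suggestion. The pattern is an assumption, not taken from the answer: it matches a word only when it is not followed by an odd number of double quotes, which means the word lies outside any quoted text as long as the quotes are balanced. The class name `LookaheadWords` is hypothetical.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class LookaheadWords
{
    // \w+ is rejected when an odd number of quotes follows it up to the
    // end of the string, i.e. when the word sits inside a quoted text.
    // Assumes the quotes in the input are balanced.
    const string Pattern = "\\w+(?![^\"]*\"(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    public static string[] Extract(string input) =>
        Regex.Matches(input, Pattern)
             .Cast<Match>()
             .Select(m => m.Value)
             .ToArray();
}
```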

Community
opaque
1

If you need all blocks of the sentence, both quoted and unquoted, there is a simpler way: split the source string with Regex.Split. Because the pattern is wrapped in a capturing group, the quoted texts are kept in the result:

static Regex QuotedTextRegex = new Regex(@"("".*?"")", RegexOptions.IgnoreCase | RegexOptions.Compiled);

var result = QuotedTextRegex
                .Split(sourceString)
                .Select(v => new
                    {
                        value = v,
                        isQuoted = v.Length > 0 && v[0] == '\"'
                    });
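
A self-contained variant of this idea might look like the sketch below. The class name `BlockSplitter`, the tuple element names, and the filtering of empty blocks are assumptions added for illustration:

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class BlockSplitter
{
    // The capturing group makes Regex.Split keep the quoted texts
    // (quotes included) between the unquoted pieces.
    static readonly Regex QuotedTextRegex =
        new Regex("(\".*?\")", RegexOptions.Compiled);

    public static (string Value, bool IsQuoted)[] Split(string source) =>
        QuotedTextRegex.Split(source)
            // Drop empty blocks, e.g. when the string starts with a quote.
            .Where(v => v.Length > 0)
            .Select(v => (Value: v, IsQuoted: v[0] == '"'))
            .ToArray();
}
```

For the string in the question this should produce seven blocks that alternate between unquoted and quoted text; the unquoted blocks keep their surrounding spaces, so they may still need trimming or splitting.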
vladimir