2

I would like to capture the values (ZABCD123, ZABCD1234, ZABCD1235) in the string below that starts with the word "need", while the values in the other strings, which start with "skip" and "ignore", must be ignored. I tried the pattern

need.+?(:"(?'index'\w+)"[,}])

but it found only the first (emphasized) value. How can I get the needed result using regex only?

"skip" : {"A":"ABCD123","B":"ABCD1234","C":"ABCD1235"}

"need" : {"A":"ZABCD123","B":"ZABCD1234","C":"ZABCD1235"}

"ignore" : {"A":"SABCD123","B":"SABCD1234","C":"SABCD1235"}

ΩmegaMan
managerger
  • Looks like JSON. You should parse it and work with it from there. – Matt Burland May 12 '17 at 16:51
  • I need to use only RegEx – managerger May 12 '17 at 16:52
  • Why? That's silly. Use the right tool for the job. – Matt Burland May 12 '17 at 16:53
  • I'd like to know whether such an issue can be resolved with regex or not. I'm not interested in other solutions, because I'm learning regex and would like to know its possibilities. JSON is used here just as an example to convey the main idea. – managerger May 12 '17 at 16:58
  • What's the difference between "skip" and "ignore"? – Rufus L May 12 '17 at 17:40
  • @managerger: You should rewrite the question then, since JSON should not be parsed with regex, in C#, there is JSON.net for it. What you ask is possible, but only if the text you are searching for is between clear non-ambiguous non-identical boundaries. – Wiktor Stribiżew May 12 '17 at 18:13
  • The simplest way if you don't need too much structure is this `"need"\s*:\s*\{(?:[^{}]*?"(ZABCD[^"]*)")+[^{}]*?\}` then get the group 1 CaptureCollections for the list. I could post the actual code in an answer if you need it. –  May 12 '17 at 18:56
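The last comment's suggestion can be tried with a short C# program. This is only a sketch, assuming the input is the three-line sample from the question; the class name `NeedCaptures` is made up for illustration. It relies on the fact that .NET keeps every repetition of a group in `Group.Captures`.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

class NeedCaptures // hypothetical name, for illustration only
{
    static void Main()
    {
        // Sample input from the question, one entry per line.
        string data = string.Join("\n",
            "\"skip\" : {\"A\":\"ABCD123\",\"B\":\"ABCD1234\",\"C\":\"ABCD1235\"}",
            "\"need\" : {\"A\":\"ZABCD123\",\"B\":\"ZABCD1234\",\"C\":\"ZABCD1235\"}",
            "\"ignore\" : {\"A\":\"SABCD123\",\"B\":\"SABCD1234\",\"C\":\"SABCD1235\"}");

        // Pattern from the comment; group 1 repeats once per value.
        string pattern = @"""need""\s*:\s*\{(?:[^{}]*?""(ZABCD[^""]*)"")+[^{}]*?\}";

        Match m = Regex.Match(data, pattern);

        // Every repetition of group 1 is kept in its Captures collection.
        var values = m.Groups[1].Captures
                      .Cast<Capture>()
                      .Select(c => c.Value)
                      .ToList();

        Console.WriteLine(string.Join(", ", values));
        // prints: ZABCD123, ZABCD1234, ZABCD1235
    }
}
```

Note that the pattern hard-codes the `ZABCD` prefix, so it only fits data shaped like the sample.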

3 Answers

1

If the number of fields is fixed, you can write it like this:

^"need"\s*:\s*{"A":"(\w+)","B":"(\w+)","C":"(\w+)"}

Demo

If the tags came after the values, like this:

{"A":"ABCD123","B":"ABCD1234","C":"ABCD1235"} : "skip"
{"A":"ZABCD123","B":"ZABCD1234","C":"ZABCD1235"} : "need"
{"A":"SABCD123","B":"SABCD1234","C":"SABCD1235"} : "ignore"

then you could use an unbounded positive lookahead:

"\w+?":"(\w+?)"(?=.*"need")

Demo

But unbounded positive lookbehinds are prohibited in PCRE (the * and + quantifiers are not allowed inside a lookbehind), so this is not very useful in your situation.
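The PCRE restriction does not apply to the .NET regex engine, which accepts quantifiers inside a lookbehind. Since the comments mention C#, here is a minimal sketch of that approach, assuming the original tag-first layout from the question; the class name `NeedLookbehind` and the exact pattern are illustrative, not taken from the answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

class NeedLookbehind // hypothetical name, for illustration only
{
    static void Main()
    {
        // Sample input from the question, one entry per line.
        string data = string.Join("\n",
            "\"skip\" : {\"A\":\"ABCD123\",\"B\":\"ABCD1234\",\"C\":\"ABCD1235\"}",
            "\"need\" : {\"A\":\"ZABCD123\",\"B\":\"ZABCD1234\",\"C\":\"ZABCD1235\"}",
            "\"ignore\" : {\"A\":\"SABCD123\",\"B\":\"SABCD1234\",\"C\":\"SABCD1235\"}");

        // The lookbehind uses * quantifiers, which .NET allows.
        // It demands that the value follows :" on a line starting with "need".
        string pattern = @"(?<=^""need""\s*:\s*\{[^{}\r\n]*:"")\w+";

        // Multiline lets ^ match at the start of each line.
        var values = Regex.Matches(data, pattern, RegexOptions.Multiline)
                          .Cast<Match>()
                          .Select(m => m.Value)
                          .ToList();

        Console.WriteLine(string.Join(", ", values));
        // prints: ZABCD123, ZABCD1234, ZABCD1235
    }
}
```

Each value is returned as its own match, so no capture groups are needed.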

Agnius Vasiliauskas
  • The number of fields isn't fixed, so the next answer is the most appropriate: it can find an arbitrary number of values. Thanks :) – managerger May 15 '17 at 08:27
1

We are going to find need and collect what we find into named match groups and their Captures. There will be two groups: one named Index, which holds A, B or C, and one named Data, which holds the value.

The match will hold our data in the Captures collections of the two groups: A, B and C under Index, and ZABCD123, ZABCD1234 and ZABCD1235 under Data.

From there we will join them into a dictionary keyed by Index.

Here is the code to do that magic:

using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

string data =
@"""skip"" : {""A"":""ABCD123"",""B"":""ABCD1234"",""C"":""ABCD1235""}
""need"" : {""A"":""ZABCD123"",""B"":""ZABCD1234"",""C"":""ZABCD1235""}
""ignore"" : {""A"":""SABCD123"",""B"":""SABCD1234"",""C"":""SABCD1235""}";

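string pattern = @"
\x22need\x22\s*:\s*\{    # Find the need entry and its opening brace
(                        # Start of the repeated (uncaptured) group
   \x22                     # Opening quote of the key (\x22 is a double quote)
   (?<Index>[^\x22]+)       # Key (A, B, C) into Index
   \x22\:\x22               # Quote, colon, quote between key and value
   (?<Data>[^\x22]+)        # Value (Z...) into Data
   \x22,?                   # Closing quote, then an optional comma
)+                       # One or more key/value pairs";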


var mt = Regex.Match(data, 
                     pattern, 
                     RegexOptions.IgnorePatternWhitespace | RegexOptions.ExplicitCapture);

// Get the data capture into a List<string>.
var captureData = mt.Groups["Data"].Captures.OfType<Capture>()
                                            .Select(c => c.Value).ToList();

// Join the index capture data and project it into a dictionary.
var asDictionary = mt.Groups["Index"]
                     .Captures.OfType<Capture>()
                     .Select((cp, iIndex) => new KeyValuePair<string,string>
                                                 (cp.Value, captureData[iIndex]) )
                     .ToDictionary(kvp => kvp.Key, kvp => kvp.Value );
ΩmegaMan
0

Most regex flavors can't capture a dynamic number of groups (.NET's Captures collection is an exception), so I'd run something like this regex

"need".*{.*,?".*?":(".+?").*}

[Demo]

with a 'match_all' function, or use Agnius's suggestion.
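In C#, the closest thing to a 'match_all' function is `Regex.Matches`. The sketch below is one possible way to combine it with the `\G` anchor so that each value comes back as a separate match. The pattern is an illustration written for the question's sample data, not the regex from this answer, and the class name `NeedMatchAll` is made up.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

class NeedMatchAll // hypothetical name, for illustration only
{
    static void Main()
    {
        // Sample input from the question, one entry per line.
        string data = string.Join("\n",
            "\"skip\" : {\"A\":\"ABCD123\",\"B\":\"ABCD1234\",\"C\":\"ABCD1235\"}",
            "\"need\" : {\"A\":\"ZABCD123\",\"B\":\"ZABCD1234\",\"C\":\"ZABCD1235\"}",
            "\"ignore\" : {\"A\":\"SABCD123\",\"B\":\"SABCD1234\",\"C\":\"SABCD1235\"}");

        // First alternative: enter the "need" object.
        // Second alternative: continue right where the previous match ended (\G),
        // but never at the very start of the input (?!\A), and expect a comma.
        string pattern =
            @"(?:""need""\s*:\s*\{|\G(?!\A),)""\w+"":""(?<index>\w+)""";

        var values = Regex.Matches(data, pattern)
                          .Cast<Match>()
                          .Select(m => m.Groups["index"].Value)
                          .ToList();

        Console.WriteLine(string.Join(", ", values));
        // prints: ZABCD123, ZABCD1234, ZABCD1235
    }
}
```

The chain of matches stops at the closing brace, because neither alternative can continue from there, so the "ignore" line is never reached.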

Dotan