Let's say I have some text in a file:

AAAA k1="123" k2="456"
several lines of other stuff
AAAA k1="789" k2="101"
AAAA k1="121" k2="141"

The goal is to capture the k1 and k2 values while keeping the groupings together. The first match would return groups with 123 and 456, and the second match would return groups with 789 and 101, and with 121 and 141.

I can write a regex that matches any single line, or even all the relevant lines in the file, but I can't figure out how to keep the matches in groups.

The hardest part is that the number of lines beginning with AAAA is not constant across groups. For example, there might be 1 AAAA line, then some other lines, then 4 AAAA lines, and so on.

EDIT -- Ok to clarify, the various values need to be kept separate by group.

So the first set of AAAA lines only has one line, so I expect the values 123 and 456.

The second set of AAAA lines has 2 lines, so I need the values 789, 101, 121, and 141. Moreover, I need to know that 789 and 101 are associated (came from the same line), and that 121 and 141 are associated (came from the same line), while all four still belong to the second group (not in any way associated with 123 and 456).

Eventually I want to get to objects (JavaScript) such as

{ '123': '456'}

and

 {
    '789': '101',
    '121': '141'
 }

If there were 15 AAAA lines in a row, that object would have 15 key value pairs.

Yevgen Gorbunkov
hvgotcodes
  • `/^AAAA k1="(\d+)" k2="(\d+)"$/gm` how does this serve you? I’m not entirely sure I understand your problem. – hackape May 04 '20 at 19:52
  • @anubhava that wouldn't keep the groupings together – hvgotcodes May 04 '20 at 19:53
  • @hackape that wouldn't keep the groupings across lines together. The key point is that the `AAAA` lines come in groups: there might be 1 such line, there might be 4 such lines, and I need to keep the values together. – hvgotcodes May 04 '20 at 19:54
  • @anubhava I need to keep the `123` and `456` values separate from the `789` and subsequent values – hvgotcodes May 04 '20 at 19:59
  • Still don’t follow. Can you write some input and expected output examples? Like “expect fn(2) == 4, fn(6) == 12” and ppl can give you a working `fn()` against these test cases. – hackape May 04 '20 at 20:03
  • @hackape Edited, hopefully it's clearer. – hvgotcodes May 04 '20 at 20:13
  • @anubhava, edited, hopefully it's clearer – hvgotcodes May 04 '20 at 20:14
  • This needs to be done in 2 steps. First you group all AAAA lines together, then match the `k1` and `k2` values. – anubhava May 04 '20 at 20:23
  • @YevgenGorbunkov how about now? – hvgotcodes May 04 '20 at 20:24
  • @YevgenGorbunkov matches coming from adjacent lines are grouped together in the objects -- no other criteria. The only restriction is that the groups of `AAAA` lines are kept distinct in their object representations from other such groups that are separated by non AAAA lines in the file. – hvgotcodes May 04 '20 at 20:32

3 Answers

0

You may use this two-phase approach. The first regex captures each run of consecutive lines starting with AAAA\s+ as a single match, and the second regex extracts the k1 and k2 values from each run:

const re1 = /(?:^AAAA\s+.*\n?)+/gm;
const re2 = /\s+k1="([^"]+)"\s+k2="([^"]+)"/g;

const str = `AAAA k1="123" k2="456"
several lines of other stuff
AAAA k1="789" k2="101"
AAAA k1="121" k2="141"`;
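let m1;
let m2;
const result = [];

// Outer loop: one iteration per run of adjacent AAAA lines.
while ((m1 = re1.exec(str)) !== null) {
  const grpMap = {};
  // Inner loop: collect every k1/k2 pair inside the current run.
  while ((m2 = re2.exec(m1[0])) !== null) {
    grpMap[m2[1]] = m2[2];
  }
  result.push(grpMap);
}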

console.log( result );
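
The same two-phase idea can also be sketched with `String.prototype.matchAll`, which avoids carrying `lastIndex` state between the two loops. The function name `groupPairs` is an assumption made for this illustration, not part of the answer above, and `matchAll` needs an ES2020-capable runtime.

```javascript
// Hypothetical helper: groupPairs is an illustrative name, not from the answer.
function groupPairs(text) {
  // Phase 1: each match is one run of consecutive lines starting with "AAAA ".
  const blockRe = /(?:^AAAA\s+.*\n?)+/gm;
  // Phase 2: every k1/k2 pair inside a single run.
  const pairRe = /\s+k1="([^"]+)"\s+k2="([^"]+)"/g;

  return Array.from(text.matchAll(blockRe), (block) => {
    const group = {};
    for (const [, k1, k2] of block[0].matchAll(pairRe)) {
      group[k1] = k2;
    }
    return group;
  });
}

const sample = `AAAA k1="123" k2="456"
several lines of other stuff
AAAA k1="789" k2="101"
AAAA k1="121" k2="141"`;

console.log(groupPairs(sample));
```

Each element of the returned array corresponds to one run of adjacent AAAA lines. Note that JavaScript enumerates integer-like keys in ascending numeric order, so the printed key order may differ from the order of the lines.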
anubhava
  • 664,788
  • 59
  • 469
  • 547
0

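You may split the text into lines, map each line to a {k1: k2} object (or null if the line does not match), and then merge the objects that come from adjacent lines.

A proof-of-concept live demo follows: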

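const src = `AAAA k1="123" k2="456"
            several lines of other stuff
            AAAA k1="789" k2="101"
            AAAA k1="121" k2="141"`,

      result = src
        .split("\n")
        // Map each line to a {k1: k2} object, or null for non-AAAA lines
        .map(line => {
          const matches = line.match(/AAAA k1="(\d+)" k2="(\d+)"/)
          return matches ? {[matches[1]]: matches[2]} : null
        })
        .reduce((r, o, i, s) => {
          // Skip non-AAAA lines (this also avoids Object.assign on an empty result)
          if (!o) return r
          // Start a new group at the first line or right after a non-AAAA line;
          // otherwise merge the pair into the current group
          if (!i || !s[i-1]) r.push(o)
          else Object.assign(r[r.length-1], o)
          return r
        }, [])

console.log(result)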
Yevgen Gorbunkov
  • 12,646
  • 3
  • 13
  • 31
-1

Typing on a mobile phone so pardon me being terse.

function magic(text) {
  const lines = text.split("\n")
  const re = /^AAAA k1="(\d+)" k2="(\d+)"$/
  const lastIndex = lines.length - 1
  return lines.reduce((acc, line, index) => {
    const matched = line.match(re)
    if (matched) {
      if (!acc.current) acc.current = {}
      acc.current[matched[1]] = matched[2]
    }

    // Flush the current group on a non-matching line or at the end of the input
    if (!matched || index === lastIndex) {
      if (acc.current) {
        acc.final.push(acc.current)
        acc.current = null
      }
    }
    return acc
  }, { current: null, final: [] }).final
}
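
A line-by-line pass of this kind can be checked against input that starts and ends with non-AAAA lines. The sketch below is a compact restatement of the same idea rather than the answer's exact code; the name `groupAdjacent`, the `\r\n` handling, and the extra header and trailing lines in the input are assumptions added for illustration.

```javascript
// Hypothetical helper: groupAdjacent is an illustrative name.
function groupAdjacent(text) {
  const re = /^AAAA k1="(\d+)" k2="(\d+)"$/;
  const groups = [];
  let current = null; // group being built, or null between groups

  // Assumption: the input may use "\n" or "\r\n" line endings.
  for (const line of text.split(/\r?\n/)) {
    const m = line.match(re);
    if (m) {
      if (!current) {
        current = {};
        groups.push(current); // pushed when opened, so no end-of-input flush is needed
      }
      current[m[1]] = m[2];
    } else {
      current = null; // any other line closes the current group
    }
  }
  return groups;
}

const input = [
  "header line",
  'AAAA k1="123" k2="456"',
  "several lines of other stuff",
  'AAAA k1="789" k2="101"',
  'AAAA k1="121" k2="141"',
  "",
].join("\r\n");

console.log(groupAdjacent(input).length); // 2
```

Pushing each group as soon as it is opened is a design choice that removes the need for a separate last-line check.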
hackape
  • 11,966
  • 1
  • 15
  • 40