1

I am doing some semantic analysis on C++ source code. I have a regular expression that transforms array declarations from int [123] [1234] to int [number] [number].

But I also want the expression to match dimensions such as int [i * x][ring_size][w + 6].

How do I tell it to match anything inside (symbols and spaces included) [ ]?

My regular expression so far is: regex arrayDims("\\[[0-9]+\\]");. I am using the C++11 regex header.

Thank you

Peter K
Andreas Geo
  • 1
    I think using a regular expression approach for this is probably impossible in full generality (although I don't have a proof for that). You'll always be up against edge cases like `int[ i[3]]` for example. Lambda capture lists will also be challenging. I think you need to build a tokeniser based on the C++ grammar. Set aside a few months for that! Nothing wrong with the question though; plus one. – Bathsheba Aug 18 '16 at 09:10

2 Answers

1

If you want to match anything inside [], try \\[.+?\\] or something similar. The ? makes the + non-greedy, so the match stops at the first closing bracket. Read more on this page.
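As a minimal sketch of how that pattern could be plugged into the C++11 regex header: the helper name normalizeDims is made up for illustration, and the raw string literal R"(\[.+?\])" is just another way of writing "\\[.+?\\]". It assumes there are no brackets nested inside the dimensions.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the question): replaces every
// non-nested [ ... ] group in a declaration with "[number]".
std::string normalizeDims(const std::string& decl) {
    // \[ and \] match literal brackets; .+? lazily matches one or more
    // characters, so each match ends at the first closing bracket.
    static const std::regex arrayDims(R"(\[.+?\])");
    return std::regex_replace(decl, arrayDims, "[number]");
}
```

Calling normalizeDims("int [i * x][ring_size][w + 6]") should give int [number][number][number], and the original numeric case int [123] [1234] still becomes int [number] [number].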

Edit: I have to note that, while this works for expressions slightly more complex than plain numbers, it will not work if there are more [] nested inside the expression.

E.g. applying my pattern to array[anotherarray[5]] matches [anotherarray[5] instead of [anotherarray[5]] (note that the final closing bracket is missing from the match).

See this answer for more information on bracket matching.

Graham
Peter K
0

If you want to match general arithmetic expressions like [w + 6], including ones with nested parentheses or brackets, then you cannot do it with regular expressions. Arbitrarily nested expressions are beyond the power of regular expressions. You have to use some kind of context-free parser.
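A full parser is a lot of work, but the part that regular expressions cannot do, keeping track of nesting depth, can be sketched with a simple counter. The function name topLevelBrackets is hypothetical, and this is only a rough illustration: it does not know about string literals, comments, attributes such as [[nodiscard]], or lambda capture lists, all of which a real C++ tokeniser would have to handle.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns every top-level [ ... ] group in text,
// brackets included, by counting nesting depth instead of using a regex.
std::vector<std::string> topLevelBrackets(const std::string& text) {
    std::vector<std::string> groups;
    int depth = 0;          // how many '[' are currently open
    std::size_t start = 0;  // index of the '[' that opened the current group
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '[') {
            if (depth == 0) start = i;
            ++depth;
        } else if (text[i] == ']' && depth > 0) {
            --depth;
            // Depth back at zero means the outermost bracket just closed.
            if (depth == 0) groups.push_back(text.substr(start, i - start + 1));
        }
    }
    return groups;
}
```

For array[anotherarray[5]] this should return the single group [anotherarray[5]], and for int [i * x][ring_size][w + 6] it should return three groups, one per dimension.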

AhmadWabbi