I am looking for a regular expression that matches text inside C-style block comments (/* ... */), but only those blocks that contain the character ";". For example, the following text

/*
some non code comment
*/

/*
some_code();
*/

/*
another non code comment
*/

important_code();

/*
yet another non code comment
*/

should match only the block around "some_code();", but not the other ones. The closest I got to a solution is

/\*(.|\r?\n)*?;(.|\r?\n)*?\*/

but unfortunately it selects the first block as well. I was thinking that some way to disallow an occurrence of "/*" inside the pattern might do the trick, but I have no idea how that can be done.
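To make the problem concrete, here is a minimal sketch that reproduces it. It assumes Python's `re` module purely for illustration (the target is Visual Studio 2013, whose regex engine may differ in details), and the names `TEXT`, `LAZY` and `first` are made up for this example.

```python
import re

# Sample text from the question.
TEXT = """/*
some non code comment
*/

/*
some_code();
*/

/*
another non code comment
*/

important_code();

/*
yet another non code comment
*/
"""

# The lazy pattern from the question, unchanged.
LAZY = re.compile(r"/\*(.|\r?\n)*?;(.|\r?\n)*?\*/")

first = LAZY.search(TEXT)

# The match is anchored at the first "/*" in the text, so it spans
# the first comment block as well as the one holding the semicolon.
print(first.start())
print(first.group(0))
```

The lazy quantifier only minimises how far the match extends to the right of where it started; it does not move the starting "/*" closer to the ";".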

Any help would be appreciated. The solutions I found here or on the web usually deal with one-line comments (//) containing a ";" character, or with any block comment (not necessarily with ";" in it), nothing like what I am describing here. Ideally it would be usable in Visual Studio 2013.

EDIT: Updated the example to cope with some corner cases.

  • If your language supports line comments `//`, then you can't find block-style comments without finding line-style comments as well. Additionally, you can't find any comments without parsing string literals as well. If you need a regex that does all of this and finds a semicolon in the block comment, let me know. You'd be guessing if it's done some other way. –  Jul 17 '15 at 01:12

1 Answer

Here is the regex that can get the closest match:

/\*(?:(?!/\*|\*/)[\S\s\r])*;(?:(?!/\*|\*/)[\S\s\r])*\*/

See demo

The idea is to use a tempered greedy token (also described on SO): each repeated character is first checked with a negative lookahead, so the match can never run across a /* or */ and stays within the closest single block.

  • /\* - matches /*
  • (?:(?!/\*|\*/)[\S\s\r])* - the tempered greedy token
  • ; - the semi-colon inside /*...*/ block
  • (?:(?!/\*|\*/)[\S\s\r])* - the tempered greedy token
  • \*/ - matches literal */.
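The breakdown above can be checked with a short sketch. It assumes Python's `re` module rather than the Visual Studio 2013 search box, and the names `TEXT`, `TEMPERED` and `matches` are hypothetical; the pattern itself is copied from the answer unchanged.

```python
import re

# Sample text from the question.
TEXT = """/*
some non code comment
*/

/*
some_code();
*/

/*
another non code comment
*/

important_code();

/*
yet another non code comment
*/
"""

# Pattern from the answer: every repeated character is guarded by the
# negative lookahead (?!/\*|\*/), so neither repetition can step over
# a comment opener or closer.
TEMPERED = re.compile(
    r"/\*(?:(?!/\*|\*/)[\S\s\r])*;(?:(?!/\*|\*/)[\S\s\r])*\*/"
)

# No capturing groups are used, so findall returns the full matches.
matches = TEMPERED.findall(TEXT)

for block in matches:
    print(block)
```

On this sample, only the block around `some_code();` is reported. The `;` after `important_code()` is skipped because it is not preceded by a `/*` that can reach it without crossing a `*/`.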
Wiktor Stribiżew
  • Wow, that was a quick answer, thanks! It works on the original example, but unfortunately it finds false positives in others. Please see the updated example in the question. In this example it also detects one block containing a ; that is not inside any of the /* */ blocks. –  Jul 16 '15 at 21:08
  • Now it should work OK, and the speed is not that bad: according to [regexhero.net](http://regexhero.net/tester), it is about 35K iterations per second. – Wiktor Stribiżew Jul 16 '15 at 22:05