My input is a string, and I want to validate that it contains only one first-level block of code.

Examples:

{ abc }              TRUE
{ a { bc } }         TRUE
{ a {{}} }           TRUE
{ abc {efg}{hij}}    TRUE
{ a b cde }{aa}      FALSE

/^\{.*\}$/ matches all 5 cases. Can you help me find a regex that rejects the last case?
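
A quick sketch of why that pattern is too permissive (the names `cases` and `results` are only illustrative): `.*` also matches `{` and `}`, so the pattern effectively checks just the first and last characters, and every entry in `results` comes out `true`.

```javascript
// The five example inputs from above.
const cases = [
  '{ abc }',
  '{ a { bc } }',
  '{ a {{}} }',
  '{ abc {efg}{hij}}',
  '{ a b cde }{aa}',
];

// .* also matches braces, so only the first and last characters are constrained.
const results = cases.map((s) => /^\{.*\}$/.test(s));

console.log(results);
```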

Language is JavaScript.

Michael M.

4 Answers

4

EDIT: I started writing the answer before JavaScript was specified. I'll leave it for the record, as it fully explains the regex.

In short: In JavaScript I cannot think of a reliable solution. In other engines there are several options:

  • Recursion (on which I will expand below)
  • Balancing group (.NET)

For the second option, balancing groups (which won't work in JS either), I'll refer you to the example in this question.

Recursive Regex

In Perl, PCRE (e.g. Notepad++, PHP, R) and Matthew Barnett's regex module for Python, you can use:

^({(?:[^{}]++|(?1))*})$

The idea is to match exactly one set of nested braces. Anything more makes the regex fail.

See what matches and fails in the Regex Demo.

Explanation

  • The ^ anchor asserts that we are at the beginning of the string
  • The outer parentheses define Group 1 (or Subroutine 1)
  • { match the opening brace
  • (?: ... )* zero or more times, we will...
  • [^{}]++ match any chars that are not { or }
  • OR |
  • (?1) repeat the expression of subroutine 1
  • } match closing brace
  • The $ anchor asserts that we are at the end of the string, so nothing may follow the closing brace of the single top-level block.
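
Since JavaScript regexes have no subroutine calls, the same idea can be sketched as a plain recursive function instead of a regex. This is only an illustration of the structure described above, not a regex solution; `matchBlock` and `isSingleBlock` are hypothetical names.

```javascript
// Mirrors Group 1: '{' followed by (non-brace chars | Group 1)* followed by '}'.
// Returns the index just past the matching '}', or -1 if no balanced block starts at i.
function matchBlock(s, i) {
  if (s[i] !== '{') return -1;      // { match the opening brace
  i++;
  while (i < s.length) {
    if (s[i] === '}') return i + 1; // } match the closing brace
    if (s[i] === '{') {
      i = matchBlock(s, i);         // (?1) recurse into a nested block
      if (i === -1) return -1;
    } else {
      i++;                          // [^{}] any other character
    }
  }
  return -1;                        // ran out of input before the block closed
}

// ^ and $ anchors: the one block must span the whole string.
function isSingleBlock(s) {
  return matchBlock(s, 0) === s.length;
}
```

With the examples from the question, `isSingleBlock('{ a {{}} }')` is true and `isSingleBlock('{ a b cde }{aa}')` is false. Very deep nesting could exhaust the call stack, so an iterative depth counter may be preferable for untrusted input.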
zx81
  • Unfortunately, the OP has now specified that they're looking for JavaScript, and this solution is invalid in JavaScript. +1 for the recursive regex demo, though. – RevanProdigalKnight Jul 25 '14 at 12:19
1

This is a terrible workaround.

Since this is in JavaScript, there's not really much you can do, but please see the following regex:

/^{([^{}]*|{})*}$/

To allow deeper nesting, copy ([^{}]*|{})* and insert it between the innermost pair of curly brackets (rinse and repeat). Every duplication of this pattern allows another level of nesting between your elements. (This is a workaround for the lack of recursion in JS regex, which is what nesting problems normally require.)
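
Rather than pasting by hand, the duplication can be generated. The following is a minimal sketch, assuming a fixed maximum nesting depth is acceptable; `buildNestedBraceRegex` is a hypothetical helper name. It is a slight variation on the pattern above: it uses `[^{}]` instead of `[^{}]*` inside the repeated group, to avoid a star nested inside a star.

```javascript
// Builds a regex accepting exactly one top-level block whose contents
// nest at most `depth` levels deep (depth 0 = no nested braces at all).
function buildNestedBraceRegex(depth) {
  // Innermost level: only non-brace characters.
  let inner = '[^{}]*';
  for (let i = 0; i < depth; i++) {
    // Each pass wraps the previous level in one more pair of braces.
    inner = '(?:[^{}]|\\{' + inner + '\\})*';
  }
  return new RegExp('^\\{' + inner + '\\}$');
}

// Depth 2 is enough for all the examples in the question.
const singleBlock = buildNestedBraceRegex(2);
console.log(singleBlock.test('{ a {{}} }'));
console.log(singleBlock.test('{ a b cde }{aa}'));
```

Input nested deeper than the chosen depth is rejected, so the depth has to be picked with realistic inputs in mind.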

Online Regex Demo

Unihedron
  • Very practical, +1 for copy-paste regex, that would probably work in realistic cases where nesting is limited. :) – zx81 Jul 26 '14 at 00:45
0

In JavaScript what you need to do is strip out all the nested blocks until no nested blocks are left and then check whether there are still multiple blocks:

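function hasSingleBlock(input) {
    // Strip string literals, regex literals and comments so braces inside them are ignored.
    var r = input.replace(/(['"])(?:(?!\1|\\).|\\.)*\1|\/(?![*/])(?:[^\\/]|\\.)+\/[igm]*|\/\/[^\n]*(?:\n|$)|\/\*(?:[^*]|\*(?!\/))*\*\//gi, '');

    // ERROR: unbalanced braces, or a closing brace before the first opening one.
    // Treated as invalid here; handle it separately if you need to tell the cases apart.
    if (r.split('{').length != r.split('}').length || r.indexOf('}') < r.indexOf('{')) {
        return false;
    }

    // Keep removing the innermost nested blocks until none are left.
    while (r.match(/\{[^}]*\{[^{}]*\}/))
        r = r.replace(/(\{[^}]*)\{[^{}]*\}/g, '$1');

    // FALSE if a closing brace is still followed by an opening one, TRUE otherwise.
    return !r.match(/\}.*\{/);
}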

Working JSFiddle

Be sure that the regex in the while condition and the regex in the replace match the same text; otherwise this might result in an infinite loop.

Updated to address ERROR cases and remove anything in comments, strings and regex-literals first after Unihedron asked.

asontu
  • Would return TRUE, but it's an "improper" code block. If you wanna check for unmatched open curly braces that's a different problem, though related indeed. But then you'd want to consider stuff like `{ print "Went there {" } /* } { Dang edge cases! } */`. I wrote about scenarios like that here: http://stackoverflow.com/questions/25402109/regex-for-comments-in-strings-strings-in-comments-etc – asontu Sep 02 '14 at 08:06
-1
(\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*|\(([^()]*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*\))*

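Code for round brackets (parentheses) rather than curly braces; it repeats the same group to allow a fixed depth of nesting.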

Jon Jin