9

I am trying to figure out if a string has properly closed brackets.

To do this, I use the following three bracket pairs.

[]
()
{}

The brackets may also be nested as long as they're formatted properly.

)([]{} - Does not have properly closed brackets because )( is in reverse order.

[()] - Does contain properly closed brackets.

I've tried using regex and after a bit of fumbling around, I got this.

[^\(\[]*(\(.*\))[^\)\]]*

However, there are a few problems with this.

It only matches parentheses but doesn't match brackets

I don't understand why it didn't match the brackets.

In my examples I clearly used a backslash before the brackets.

Input

[] - true
[()] - true (nested brackets but they match properly)
{} - true
}{ - false (brackets are wrong direction)
}[]} - false (brackets are wrong direction)
[[]] - true (nested brackets but they match properly)
Johnson
  • 1,411
  • 1
  • 12
  • 17
  • 1
    How about you show us some inputs and expected outputs. – Thank you Sep 22 '14 at 17:03
  • @Joh It's an impossible task for regex. You need to update your question with this input `([][()]{}{{}[]})` – Avinash Raj Sep 22 '14 at 17:35
  • @AvinashRaj You are absolutely wrong. – sawa Sep 22 '14 at 17:45
  • @AvinashRaj It is possible. – sawa Sep 22 '14 at 17:55
  • @sawa through regex only? – Avinash Raj Sep 22 '14 at 17:57
  • Where is the regex? Did you mean your regex? – Avinash Raj Sep 22 '14 at 17:59
  • @sawa Even if it's possible with freaky recursive patterns, you can't disagree that it certainly is not a good task for regex. It can be done in O(n) with a simple stack parser, as suggested by Jeff Price – dfherr Sep 22 '14 at 18:00
  • @ascar That is called ignoratio elenchi. But at least you seem to know better than Avinash Raj. – sawa Sep 22 '14 at 18:03
  • @sawa i won't argue it is certainly impossible and never did. Just wanted to give this conversation a hint in a more productive direction. But proof of a pure regex solution (working with rubys regex engine) isn't anywhere in the answers yet either. – dfherr Sep 22 '14 at 18:06
  • @ascar All I did was point out to people that AvinashRaj's claim is absolutely wrong so that they won't learn something wrong. – sawa Sep 22 '14 at 18:08
  • The language of strings with matching brackets is not regular. Strictly speaking, a language extension that can match it is not a regular expression. – 1010 Sep 22 '14 at 18:29
  • What's your plan for balanced but improperly-nested characters? For example, `balanced? '[{(])}'` would generally return true unless you're parsing grammar. – Todd A. Jacobs Sep 22 '14 at 18:49
  • 2
    Please read the answers. I posted working solution based on recursive regexp. By the way, regex solution will work in O(n) time too, though maybe with bigger constant. – Aivean Sep 22 '14 at 18:54

8 Answers

9
non_delimiters = /[^(){}\[\]]*/
Paired = /\(#{non_delimiters}\)|\{#{non_delimiters}\}|\[#{non_delimiters}\]/
Delimiter = /[(){}\[\]]/

def balanced? s
  s = s.dup
  s.gsub!(Paired, "".freeze) while s =~ Paired
  s !~ Delimiter
end

balanced?(")([]{}")
# => false
balanced?("[]")
# => true
balanced?("[()]")
# => true
balanced?("{}")
# => true
balanced?("}{")
# => false
balanced?("}[]}")
# => false
balanced?("[[]]")
# => true
sawa
  • 156,411
  • 36
  • 254
  • 350
  • Can you explain the reason for [Object#freeze](http://ruby-doc.org/core-2.1.3/Object.html#method-i-freeze)? It doesn't seem to matter for your examples. – Cary Swoveland Sep 22 '14 at 20:37
  • 1
    @CarySwoveland A feature in recent Ruby optimizes string generation. Whenever there is a string literal on which `freeze` is applied, that string is only generated once whatever times it may be read. – sawa Sep 23 '14 at 01:10
  • 1
    you can add the magic comment `# frozen_string_literal: true` on the top line of a file in ruby 2.x to do this automatically for all string literals in the file – aaaarrgh Sep 12 '18 at 03:10
7

This is likely a bad use case for a regex. I would use a simple stack parser.

def matching_brackets?(a_string)
  brackets =  {'[' => ']', '{' => '}', '(' => ')'}
  lefts = brackets.keys
  rights = brackets.values
  stack = []
  a_string.each_char do |c|
    if lefts.include? c
      stack.push c
    elsif rights.include? c
      return false if stack.empty?
      return false unless brackets[stack.pop].eql? c
    end
  end
  stack.empty?
end

matching_brackets? "[]"      # => true
matching_brackets? "[()]"    # => true
matching_brackets? "{}"      # => true
matching_brackets? "}{"      # => false
matching_brackets? "}[]}"    # => false
matching_brackets? "[[]]"    # => true
matching_brackets? "[[{]}]"  # => false

edit: Cary Swoveland - write actual code and have folks criticize it :-?.

updated: Had a nasty little bug in my check for whether the closing character matched the opening one. Fixed it!

Jeff Price
  • 3,139
  • 20
  • 24
5

According to this article, Ruby supports recursive regexps from version 2.0 onward. This means that you can use Ruby's \g<0> token to recursively match the whole regexp at any point within it. This approach can effectively emulate a stack in order to solve your task.

Here is the resulting regexp: [^(){}\[\]]*(\((\g<0>)?\)|\{(\g<0>)?\}|\[(\g<0>)?\])?[^(){}\[\]]*

This updated version handles cases like [(){}], where multiple bracket groups are at the same level. Thanks to @Jonny 5 for pointing out this case:

 [^(){}\[\]]*((\((\g<0>)?\)|\{(\g<0>)?\}|\[(\g<0>)?\])?[^(){}\[\]]*)*

This expression requires checking that the whole input string is matched. A partial match means that there is an error in bracket ordering at some point in the string.

Here is another version that doesn't require checking whether the whole input string is matched:

 \A([^(){}\[\]]*((\((\g<1>)?\)|\{(\g<1>)?\}|\[(\g<1>)?\])?[^(){}\[\]]*)*)\Z

You may notice that it tries to match a corresponding pair of brackets and then recursively matches itself. I've tried it here and it seems to work. I'm not a Ruby engineer, so I can't run an actual Ruby test, but I hope that is not necessary.
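For readers who do want to try the anchored pattern in Ruby, here is a minimal sketch. The pattern is copied unchanged from above; the constant `BALANCED` and the method `balanced_regex?` are hypothetical names introduced only for this illustration, and `Regexp#match?` assumes Ruby 2.4 or later.

```ruby
# Pattern copied verbatim from the answer above.
# BALANCED and balanced_regex? are hypothetical names for this sketch.
BALANCED = /\A([^(){}\[\]]*((\((\g<1>)?\)|\{(\g<1>)?\}|\[(\g<1>)?\])?[^(){}\[\]]*)*)\Z/

def balanced_regex?(str)
  # match? returns true or false and does not set $~ (Ruby 2.4+).
  BALANCED.match?(str)
end

samples = ["[]", "[()]", "{}", "}{", "}[]}", "[[]]", "[(){}]", "[[{]}]"]
samples.each do |sample|
  puts "#{sample.inspect} -> #{balanced_regex?(sample)}"
end
```

Because the pattern nests one repetition inside another, long inputs that fail to match may backtrack heavily, so this is probably best kept to short strings.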

Aivean
  • 8,465
  • 20
  • 26
  • It is also possible with Ruby 1.9, though as not as elegant as with 2.0. – sawa Sep 22 '14 at 18:23
  • 1
    @sawa, formally speaking, Ruby supports recursive calls to the whole regexp from version 2.0. Since 1.9, Ruby [can recurse only into specific groups by number or name](http://www.regular-expressions.info/subroutine.html). So in my example you have to wrap the whole regexp in a group `()` and use `\g<1>` instead of `\g<0>`. – Aivean Sep 22 '14 at 18:29
  • 1
    Very cool, learn something new every day. However, it's worth noting that finding that regex in code I were to have to maintain would likely end badly for myself, or the person in git blame. – Jeff Price Sep 22 '14 at 18:42
  • Can you check this: `"][" =~ r #=> 0`? Does `[^(){}\[\]]*` do anything? – Cary Swoveland Sep 22 '14 at 19:59
  • 1
    `((.)(.))` does not match or let's say it "matches" by meaning `true`. `true` also for `([`. It only deals with something like `(a[b])` but not `(a[b][b])`. However gave plus for mentioning the recursion. – Jonny 5 Sep 22 '14 at 20:05
  • @CarySwoveland sorry, I can't understand the syntax of your statement. If my assumption is correct and you are asking if the string "][" will match against my regexp, then answer is no. – Aivean Sep 22 '14 at 20:06
  • 1
    @Jonny5 thank you, I completely missed the case when several bracket groups go one after another. I'll correct the regexp. – Aivean Sep 22 '14 at 20:08
  • 1
    What I meant was that when I set `r` equal to your regex, `"][" =~ r #=> 0`, meaning that `"]["` matches, beginning at offset `0`, which it should not. It's matching an empty substring: `"]["[r] #=> ""`. – Cary Swoveland Sep 22 '14 at 20:15
  • @CarySwoveland As I said, I'm not a Ruby programmer. Do you have something like matchAll in Ruby? All you need to do is to check if the **whole** input string matches against the regexp. If it matches partially (like in your example) this clearly means that there is error in bracket order at that point of the string. – Aivean Sep 22 '14 at 20:20
  • The problem is, that it also matches `()` in `())` and almost (?) always returns `true`. What do I have to enter, that it does not return `true` ? I thought it should verify the whole string according the desired rules of OP and if string is not valid not match or let's say: return `false`. – Jonny 5 Sep 22 '14 at 20:20
  • @Jonny5 Ok, this variant should work out of box: `\A([^(){}\[\]]*((\((\g<1>)?\)|\{(\g<1>)?\}|\[(\g<1>)?\])?[^(){}\[\]]*)*)\Z` – Aivean Sep 22 '14 at 20:33
  • 1
    @Aivean Thanks for spending so much time, this looks awesome now! From my tired eyes at least :) – Jonny 5 Sep 22 '14 at 20:38
2

Regex is not meant to validate grammar in strings and is therefore very badly suited to it. Regex is a tool to find patterns in text.

You should use a parser.

Here is Ruby code for a stack parser that does that:

def validBrackets?(str)
  stack = []
  str.each_char do |char|
    case char
    when '{', '[', '('
      stack.push(char)
    when '}'
      x = stack.pop
      return false if x != '{'
    when ']'
      x = stack.pop
      return false if x != '['
    when ')'
      x = stack.pop
      return false if x != '('
    end
  end
  stack.empty?
end
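One of the comments on this answer suggests `StringScanner` as an alternative to walking every character. The following is only a sketch of what that might look like, not a measured improvement; `CLOSERS` and `valid_brackets_scan?` are hypothetical names, and the only library calls used are `StringScanner#scan_until` and `StringScanner#matched` from Ruby's standard `strscan` library.

```ruby
require 'strscan'

# Maps each closing bracket to the opener it must pair with.
# CLOSERS and valid_brackets_scan? are hypothetical names for this sketch.
CLOSERS = { ')' => '(', ']' => '[', '}' => '{' }.freeze

def valid_brackets_scan?(str)
  scanner = StringScanner.new(str)
  stack = []
  # scan_until jumps straight to the next bracket and returns nil
  # once no bracket remains, which ends the loop.
  while scanner.scan_until(/[(){}\[\]]/)
    char = scanner.matched
    if CLOSERS.key?(char)
      # A closer must pair with the most recent unclosed opener.
      return false unless stack.pop == CLOSERS[char]
    else
      stack.push(char)
    end
  end
  stack.empty?
end

puts valid_brackets_scan?("a[b]{c(d)}")
puts valid_brackets_scan?("[[{]}]")
```

The stack logic is the same as in the `case` version; only the way the input is traversed differs.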
dfherr
  • 1,503
  • 10
  • 21
  • My apologies. I will edit this question to come up with another solution. I thought regex was the right way to solve this – Johnson Sep 22 '14 at 17:10
  • 1
    You can use `case` statements to make your code look better. But `StringScanner` would be more efficient. – sawa Sep 22 '14 at 18:21
  • @sawa `case` does not make it much better in this case as i have to split the first if in 3 when clauses or use the case `char == ` syntax, which basically looks the same as if else – dfherr Sep 22 '14 at 18:35
  • 1
    No. You can put them together with commas. (Or, you can have a single regex there covering the three characters.) – sawa Sep 22 '14 at 18:35
  • @sawa did not know that. edited for a case statement and used each_char as it should be a bit faster than creating a char array first (though it could be the same with the underlying C implementation where a string is a char array anyway) – dfherr Sep 22 '14 at 18:42
  • is the "?" in validBrackets?(str) meant to return a boolean? it seems to be part of the method name. – kamal May 19 '17 at 05:41
  • @kamal it is part of the method name. But it is customary in ruby to add a question mark behind methods that return a boolean – dfherr May 19 '17 at 14:34
2

I assume your string consists only of the characters in the string "()[]{}". Notice that for a string str to satisfy the matching requirement:

  • str must be empty or contain a substring "()", "[]" or "{}"; and
  • if str is non-empty, str with "()", "[]" and "{}" removed satisfies the matching requirement.

We therefore can sequentially remove substring pairs until we can no longer do so. If what is left is empty, the original string satisfies the matching requirement; else it does not:

def matching?(str)
  return true if str.empty?
  s = str.gsub(/\(\)|\[\]|\{\}/,"")
  return false if s == str
  matching?(s)
end

matching?(")([]{}")         #=> false
matching?("[()]")           #=> true
matching?("[()[{()}]{()}]") #=> true 
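
As a comment on this answer notes, the approach can be adapted to strings that contain other characters. A minimal sketch of one way to do that, assuming every non-bracket character is simply irrelevant, is to strip those characters first and then remove pairs in a loop instead of recursing. `matching_ignoring_others?` is a hypothetical name, not part of the original answer.

```ruby
# Hypothetical variant of matching? that tolerates non-bracket characters.
def matching_ignoring_others?(str)
  # Assumption: anything that is not one of ()[]{} can be discarded.
  s = str.gsub(/[^(){}\[\]]/, "")
  # Remove innermost pairs repeatedly. gsub! returns nil when it made
  # no substitution, which is the signal to stop.
  loop do
    break unless s.gsub!(/\(\)|\[\]|\{\}/, "")
  end
  s.empty?
end

puts matching_ignoring_others?("a[b(c)]d")
puts matching_ignoring_others?("[{(])}")
```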
Cary Swoveland
  • 94,081
  • 5
  • 54
  • 87
  • 1
    It is quite simple to modify your approach to allow other characters within the string. – sawa Sep 22 '14 at 17:47
  • This certainly works, but it should be much slower than the algorithm Jeff Price describes, which only has to look at each character of the string once. – dfherr Sep 22 '14 at 17:50
  • 1
    That's true, @ascar. Using a stack is the natural way to address this problem, but it had already been suggested when I got to this question, so I thought I'd demonstrate a different approach. – Cary Swoveland Sep 22 '14 at 18:04
2

Edit: moved to top per Jonny 5's suggestion
After reading the comments below and inspired by Aivean's solution, here is a modified pattern
(\[([^][)({}]|\g<0>)*\])|\(\g<2>*\)|\{\g<2>*\}


If your regex engine supports recursion, I suggest using three different patterns as filters. If your input passes all three, it is a good match.

(\[(?:[^][]|(?R))*\]) # match nested []
(\((?:[^)(]|(?R))*\)) # match nested ()
({(?:[^{}]|(?R))*}) # match nested {}

alpha bravo
  • 7,292
  • 1
  • 14
  • 22
  • you won't find mismatches like `({)}` with this i guess. – dfherr Sep 22 '14 at 17:57
  • @ascar i like your commenting. Could you post the above comment on all the answers which uses regex? – Avinash Raj Sep 22 '14 at 17:58
  • 1
    I think you should put the final pattern on top of your answer, it's really neat one and easily overlooked. Would add [start and end](http://regex101.com/r/kP7pK8/2) if that's right :) – Jonny 5 Sep 23 '14 at 11:37
2

Alright, I figured this out.

def valid_string?(brace)
  stack = []
  brackets = { '{' => '}', '[' => ']', '(' => ')' }
  brace.each_char do |char|
    stack << char if brackets.key?(char)
    return false if brackets.key(char) && brackets.key(char) != stack.pop
  end
  stack.empty?
end
Johnson
  • 1,411
  • 1
  • 12
  • 17
1

I don't see how you could expect your regex to match brackets. Here's what your regex does:

[^\(\[]*  # Match any number of characters except ( or [
(         # Start capturing group:
 \(       # Match ( 
 .*       # Match any number of characters (except linebreaks)
 \)       # Match )
)         # End of capturing group
[^\)\]]*  # Match any number of characters except ) or ]
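
The breakdown above can be checked directly in Ruby. This is a small sketch using the question's pattern unchanged; `ATTEMPT` is a hypothetical constant name chosen for the illustration.

```ruby
# The pattern from the question, unchanged. ATTEMPT is a hypothetical name.
ATTEMPT = /[^\(\[]*(\(.*\))[^\)\]]*/

# No "(" appears anywhere, so the mandatory group cannot match.
ATTEMPT.match("[]")        # => nil

# Only the parentheses are captured; the square brackets are never checked.
ATTEMPT.match("[()]")[1]   # => "()"

# ".*" accepts anything between "(" and ")", including a stray "]".
ATTEMPT.match("(])")[0]    # => "(])"
```

Escaping the brackets inside `[^...]` only makes them literal members of the negated character class, that is, characters to skip over. Nothing in the pattern ever requires a `[` or `]` to be present or paired.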
Tim Pietzcker
  • 297,146
  • 54
  • 452
  • 522