I have my regular expression `/'(.*)(?:(?:'\s*,\s*)|(?:'\)))/`
and my test code `('He said, "You're cool."' , 'Rawr')`
(My test code simulates parameters being passed into a function.)

I will explain my Regular Expression as I understand it and hopefully a few of you can shed some light on my problem.

1) `/'` means at the beginning of the matched string, there needs to be a `'`

2) `(.*)` means capture any character except `\n`, 0 or more times

3) `(?:(?:4)|(?:5))` means don't capture, but try step 4 and, if it doesn't work, try step 5

4) `(?:'\s*,\s*)` means don't capture, but there needs to be a `'` followed by 0 or more whitespace characters, then a `,` followed by 0 or more whitespace characters

5) `(?:'\))` means don't capture, but there needs to be `')`

So it seems that it should return this (and this is what I want):
'+He said, "You're cool."+' ,
But it returns:
'+He said, "You're cool."' , 'Rawr+')

If I change my test code to `('He said, "You're cool."' , 'Rawr'` (no closing parenthesis), it returns what I want, but as soon as I add that last parenthesis, my OR operator seems to do whatever it wants. I want it to test first whether there is a comma and break there if there is one, and only if there is not one check for a parenthesis.

I've tried swapping steps 4 and 5, but the OR operator still seems to always default to the `(?:'\))` side. How can I match the shortest amount possible?
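For reference, the behaviour described above can be reproduced with a short script. This is only a sketch: it assumes a JavaScript engine (the post does not name the language), and the variable names are made up for illustration.

```javascript
// Pattern from the question, plus the variant with the two alternatives swapped.
const original = /'(.*)(?:(?:'\s*,\s*)|(?:'\)))/;
const swapped = /'(.*)(?:(?:'\))|(?:'\s*,\s*))/;

// Template literals let the test strings contain both ' and " unescaped.
const withParen = `('He said, "You're cool."' , 'Rawr')`;
const withoutParen = `('He said, "You're cool."' , 'Rawr'`;

// Index 1 of the match array is the contents of the capturing group.
const capWithParen = withParen.match(original)[1];
const capWithoutParen = withoutParen.match(original)[1];
const capSwapped = withParen.match(swapped)[1];

console.log(capWithParen);    // He said, "You're cool."' , 'Rawr
console.log(capWithoutParen); // He said, "You're cool."
console.log(capSwapped);      // He said, "You're cool."' , 'Rawr
```

Swapping the alternatives makes no difference here, which suggests the order of the OR branches is not what decides where the match ends.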

Aust
  • Beginning of string would be `/^`, not `/`, fwiw. Personally I'm not convinced I'd use a regex for what it is you're trying to do, but rather a small parser. Confusing regex is confusing. – Dave Newton Aug 29 '12 at 16:10
  • @DaveNewton - Yes I know that. That's why I said at the beginning of the matched string. Maybe I should've said at the beginning of the matched portion of the string. Or when it starts the match it needs to begin with a `'`. – Aust Aug 29 '12 at 16:12

1 Answer

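I don't think your problem is the OR operator, but the greediness of the `.*`. It first matches the rest of your string and then backtracks one character at a time until the following expressions match. The first match in this backtracking process is found at the last `'`, which is followed by `)`, so the result is `'+He said, "You're cool."' , 'Rawr+')`. Try the lazy `.*?` instead! It consumes as few characters as possible, so it stops at the first `'` that is followed by a comma or a closing parenthesis.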
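A minimal sketch of the difference, assuming JavaScript (the question does not say which engine is used) and with illustrative variable names:

```javascript
// Greedy: the pattern exactly as posted in the question.
const greedy = /'(.*)(?:(?:'\s*,\s*)|(?:'\)))/;
// Lazy: identical except for the ? after .* in the capturing group.
const lazy = /'(.*?)(?:(?:'\s*,\s*)|(?:'\)))/;

// Template literal so the input can contain both ' and " unescaped.
const input = `('He said, "You're cool."' , 'Rawr')`;

const greedyCapture = input.match(greedy)[1];
const lazyCapture = input.match(lazy)[1];

console.log(greedyCapture); // He said, "You're cool."' , 'Rawr
console.log(lazyCapture);   // He said, "You're cool."
```

Note that the apostrophe inside `You're` does not end the lazy match, because it is followed by neither a comma nor a closing parenthesis.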

Bergi