
I'm using the gskinner regex helper site to capture a date from a string, and it works fine there, but the same pattern throws an error in my PHP script.

What I'm looking for is the date within the string.

Nov 26 2013 10:17PM

$string = "The following web lead was received at Nov 26 2013 10:17PM Source: 420 Source: Internet - Organic (Free) Leads Referral Fee:  none";  

$datePattern = '/(?<=received at )(?:[^])*?(?=Source)/';
preg_match($datePattern,$string,$matches);
print_r($matches);

The error I'm getting is

Warning: preg_match(): Compilation failed: missing terminating ] for character class at offset 36 in C:\wamp\www\test\index.php on line 114

I don't understand why it works fine in the gskinner tool but fails in my script. The regex is one of the community-submitted expressions, as I am completely incompetent when it comes to regex.

Thanks for any help.

falsetru
Steve Peterson
  • 1
    Do you see anything wrong in the syntax highlighting of your post? – jordanm Dec 01 '13 at 06:03
  • 2
    What do you want to negate here: `[^]` ? I believe the regex engine in PHP thinks that you are negating the ] itself, thus it cannot find any closing bracket. – bagonyi Dec 01 '13 at 06:03
  • rubular.com is also a good site (maybe easier than the one you are using). – リカルド Dec 01 '13 at 06:08
  • jordanm... fixed the syntax error. but that is only here on the post, not in my script. – Steve Peterson Dec 01 '13 at 06:08
  • bagonyi... not sure I know how to fix that. this is an expression from the list of community submitted expression that I modded a bit. If I take out the '^', it doesn't fix it. I just need to capture everything between "received at" and "Source" as those terms are consistent and unique. – Steve Peterson Dec 01 '13 at 06:11

2 Answers


It is because of the [^].

In some JavaScript implementations, [^] literally means "all possible characters" (the negation of nothing). In PHP, however, a closing square bracket is treated as a literal character if it stands immediately after the opening bracket or after the negation symbol ^. Thus [^])*?(?=Source) is seen as an unclosed character class.
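The bracket rule described above can be checked directly. This is a minimal sketch, assuming a stock PHP build with PCRE; the sample string and variable names are made up for illustration.

```php
<?php
// A ']' placed right after '[^' is a literal member of the class,
// so '[^]]' reads as "any character except ']'" and compiles fine.
$ok = preg_match('/[^]]+/', 'abc]def', $m);
// $ok is 1 and $m[0] holds the run of characters before the first ']'.

// '[^]' alone is never closed, so the pattern fails to compile.
// preg_match() raises a warning (suppressed here with @) and returns false.
$bad = @preg_match('/[^]/', 'abc');
var_dump($bad);
```

Note that preg_match() returns false on a compilation failure, which is distinct from 0 (no match), so a strict comparison is needed to tell the two apart.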

The goal of this notation was to match any character at all (a kind of shortcut for [\s\S]). You can replace it with .*? combined with the s modifier, which lets the dot match newlines as well:

$datePattern = '/(?<=received at ).*?(?=Source)/s';
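As a rough sketch of how that pattern behaves on the string from the question: the helper name extractReceivedDate is hypothetical, and the type hints assume PHP 7.1 or later. The lazy .*? stops just before the first "Source", so the match carries a trailing space that trim() removes.

```php
<?php
// Hypothetical helper (not from the original post): returns the text
// between "received at " and the first "Source", or null if absent.
function extractReceivedDate(string $text): ?string
{
    // The s modifier lets '.' match newlines, standing in for [\s\S].
    $pattern = '/(?<=received at ).*?(?=Source)/s';
    if (preg_match($pattern, $text, $matches) === 1) {
        // The match ends right before "Source", so strip the trailing space.
        return trim($matches[0]);
    }
    return null;
}

$string = "The following web lead was received at Nov 26 2013 10:17PM Source: 420 Source: Internet - Organic (Free) Leads Referral Fee:  none";
$date = extractReceivedDate($string);
echo $date, PHP_EOL; // prints: Nov 26 2013 10:17PM
```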

You can read more about this kind of notation in this incredible post.

Community
Casimir et Hippolyte

^ has a special meaning inside a character group, so it cannot stand alone as [^]. In this context it means "not these characters": [^abc] matches anything but a, b and c. Here, though, you aren't specifying any characters to exclude.

And since it's only one symbol, you don't even need to put it in a character group.

Havenard