I have strings like {$foo.bar} and {$foo.bar.anything}, where foo, bar, and anything are alphanumeric.

I want to match both of the strings above in PHP with preg_match (a regular expression), but not strings without any dot, for example {$foo}.

Your help will be much appreciated, thanks.

Tim Cooper
Ranbir Kapoor

7 Answers

/{\$[\da-z]+(?:\.[\da-z]+)+}/i

matches

{$foo.bar}
{$foo.Bar.anything}
{$foo.bar.anything1.anything2.anything3}
{$foo.bar.anything.a.b.c}

does not match

{$foo}
{$foo.}
{$foo bar}
{$foo.bar anything}
{$foo.bar......anything..}
{$foo.bar.anything.}
{$foo.bar.anything.a.b.c..}
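
One way to check the two lists above is to run the pattern over a few of the strings with preg_match. This is only a sketch: the $pattern, $shouldMatch, and $shouldNotMatch names are made up for illustration.

```php
<?php
// Pattern from this answer: "{$", an alphanumeric run, one or more ".run" groups, then "}".
$pattern = '/{\$[\da-z]+(?:\.[\da-z]+)+}/i';

// A subset of the strings listed above (hypothetical variable names).
$shouldMatch    = ['{$foo.bar}', '{$foo.Bar.anything}', '{$foo.bar.anything.a.b.c}'];
$shouldNotMatch = ['{$foo}', '{$foo.}', '{$foo bar}', '{$foo.bar......anything..}', '{$foo.bar.anything.}'];

foreach (array_merge($shouldMatch, $shouldNotMatch) as $s) {
    // preg_match returns 1 when the pattern matches and 0 when it does not.
    echo $s, ' => ', preg_match($pattern, $s) === 1 ? 'match' : 'no match', "\n";
}
```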

Adopted Joe’s PCRE case-insensitive modifier to shorten it a bit.

Special thanks to sln for keeping me on my toes until it’s perfect. :)

Herbert

Assuming php regex is the same as perl

^\w+\.[\.\w]+$

That means: start with one or more alphanumerics, followed by a dot, followed by one or more alphanumerics or dots. The $ means all the way to the end of the string.

If it cannot end with a . then

^\w+\.[\.\w]+\w$

If two consecutive dots are not allowed, it gets trickier, as not all regex engines can repeat a multi-character subexpression. But if yours does, I think it's something like

^\w+(\.\w+)+$

That means: start with one or more alphanumerics, followed by one or more repetitions of a dot followed by one or more alphanumerics. The $ means all the way to the end of the string.
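
To see how the three variants differ, here is a small sketch in PHP (preg_match uses PCRE, so the patterns should carry over unchanged). The labels, the $patterns and $samples arrays, and the sample strings are illustrative only. The samples are bare foo.bar forms without the {$ and } wrapper, because these patterns are anchored with ^ and $.

```php
<?php
// The three patterns from this answer, wrapped in / delimiters for preg_match.
$patterns = [
    'dots anywhere'       => '/^\w+\.[\.\w]+$/',
    'no trailing dot'     => '/^\w+\.[\.\w]+\w$/',
    'no consecutive dots' => '/^\w+(\.\w+)+$/',
];

// Hypothetical sample inputs.
$samples = ['foo', 'foo.bar', 'foo.bar.', 'foo..bar', 'foo.bar.baz'];

foreach ($patterns as $label => $pattern) {
    foreach ($samples as $s) {
        $result = preg_match($pattern, $s) === 1 ? 'match' : 'no match';
        printf("%-20s %-12s %s\n", $label, $s, $result);
    }
}
```

Only the last pattern rejects both foo.bar. and foo..bar; the second one still accepts foo..bar.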

Sodved
  • `\w` matches letters, digits, and underscores. – Shef Sep 11 '11 at 15:55
  • PHP regex is the same as Perl (when you use preg_match). The "p" is for PCRE (Perl Compatible Regular Expression). – Herbert Sep 11 '11 at 15:57
  • @Sodved: I think you want `[^\W_]+(?:\.[^\W_]+)+` to mitigate underscores. Also, it's unclear if `^$` anchors are necessary. –  Sep 11 '11 at 18:12

You probably want preg_match_all rather than preg_match - it gets all matches, as the name suggests, rather than just the first one.

As for the regex you want, something like this should work

/\{\$[a-z0-9]+\.([a-z0-9\.]+)+\}/i
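
A rough sketch of the difference between the two functions, using the pattern above. The $subject text and the variable names are invented for the example.

```php
<?php
// Pattern from this answer.
$pattern = '/\{\$[a-z0-9]+\.([a-z0-9\.]+)+\}/i';

// Hypothetical template text with two dotted tags and one tag without a dot.
$subject = 'Hello {$user.name}, your order {$order.items.count} is ready, {$plain}.';

// preg_match stops after the first match; $one[0] is that match.
$found = preg_match($pattern, $subject, $one);
echo $found, ': ', $one[0], "\n";

// preg_match_all keeps scanning; $all[0] holds every full match.
$count = preg_match_all($pattern, $subject, $all);
echo $count, ': ', implode(', ', $all[0]), "\n";
```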
Joe
/(\{\$[a-z]+\.([a-z][a-z.])*[a-z]+\})/

So you first match foo and a dot ({$foo.), then optionally any characters and dots ({$foo.bar.), and finally another string of characters ({$foo.bar.anything}).

Wulf
  • The unescaped period in the middle parentheses will match any character – Joe Sep 11 '11 at 15:57
  • @Joe: Nope, it's OK, inside a character class it will match dot. – Shef Sep 11 '11 at 16:01
  • Mm, didn't know that. I do know for sure that the unescaped $ at the start is trying to match end-of-string though :P – Joe Sep 11 '11 at 16:03
  • @Joe: Definitely that dollar sign must be escaped! :) – Shef Sep 11 '11 at 16:05
  • @Wulf: Why go through all the duplication (and backtracking)? Can't you just reduce it to `/(\{\$[a-z]+(?:\.[a-z]+)+\})/` ? –  Sep 11 '11 at 18:25
\{\$[A-Za-z0-9]+\.[A-Za-z0-9]+\.?[A-Za-z0-9]*\}
Shef
  • Worth noting this will only match up to string.substring.subsubstring; if a string has more than 2 periods (e.g. 1.2.3.4), it won't match – Joe Sep 11 '11 at 15:58
  • @Joe: Yes, that's right. It seems like that's what the OP is after, but maybe I am wrong. – Shef Sep 11 '11 at 15:59
\{\$[a-zA-Z0-9]+(\.[a-zA-Z0-9]+)+\}

First match {$. Then match an alphanumeric string. Then match one or more alphanumeric strings, each beginning with a dot. Then match }.
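
The same steps can be spelled out with PCRE's extended (x) modifier, which ignores whitespace inside the pattern and allows # comments. This is just a sketch of the pattern above; the $pattern and $subject names and the sample text are made up.

```php
<?php
// Same pattern as above, split into commented parts with the x modifier.
// The ~ delimiter is an arbitrary choice for this example.
$pattern = '~
    \{\$                  # literal "{$"
    [a-zA-Z0-9]+          # first alphanumeric string
    (\.[a-zA-Z0-9]+)+     # one or more alphanumeric strings, each preceded by a dot
    \}                    # literal "}"
~x';

// Hypothetical input: one tag without a dot, two with dots.
$subject = '{$foo} {$foo.bar} {$foo.bar.anything}';

preg_match_all($pattern, $subject, $m);
print_r($m[0]); // full matches only
```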

makes

This is my solution to the problem, with some alternatives depending on exactly what you want to extract.

  1. Extracts just the whole {$aaa.bbb[.ccc[.ddd ...]]} thing, provided that it contains at least one dot
  2. Extracts the content from the {$aaa.bbb} thing (eg. aaa.bbb)
  3. Considers only tags composed of two or three components (ignores {$aaa} and {$aaa.bbb.ccc.ddd}).

Code:

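<?php

// Sample input: only some of these tags are dot-separated alphanumeric parts.
$subject = '{$foo.bar} {$foo.bar.baz} {$foo} {$another-foo.bar} {$foo.bar.baz.boh}';

// 1. Whole tag, at least one dot required.
print "Matching the whole string\n";
preg_match_all(
   '/{\$[a-zA-Z0-9]+(?:\.[a-zA-Z0-9]+)+}/',
   $subject, $m);
// Pass true so var_export() returns its output instead of printing it directly.
print var_export($m, true) . "\n\n";

// 2. Capture group 1 holds the content between {$ and }.
print "Matching only the content\n";
preg_match_all(
   '/{\$([a-zA-Z0-9]+(?:\.[a-zA-Z0-9]+)+)}/',
   $subject, $m);
print var_export($m, true) . "\n\n";

// 3. {1,2} limits the dotted parts, so only two or three components match.
print "Matching strings containing only one or two dots\n";
preg_match_all(
   '/{\$([a-zA-Z0-9]+(?:\.[a-zA-Z0-9]+){1,2})}/',
   $subject, $m);
print var_export($m, true) . "\n\n";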
redShadow