The preg_replace() function accepts so many possible pattern and replacement values, like:

    <?php
    $patterns = array('/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/', '/^\s*{(\w+)}\s*=/');
    $replace = array('\3/\4/\1\2', '$\1 =');
    echo preg_replace($patterns, $replace, '{startDate} = 1999-5-27');

What does:

 \3/\4/\1\2

And:

/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/','/^\s*{(\w+)}\s*=/ 

mean?

Is there any information available that explains these meanings in one place? Any help or documentation would be appreciated! Thanks in advance.

Kesh
  • The \w metacharacter matches a word character, and + means one or more, if I'm not wrong – agpt Jul 12 '15 at 04:23
  • Check out [SO regex faq](http://stackoverflow.com/a/22944075/3110638), [Rexegg Cheat Sheet](http://www.rexegg.com/regex-quickstart.html), [explain regex](http://rick.measham.id.au/paste/explain.pl), [regex101](https://regex101.com/) (paste regex > explanation top right) – Jonny 5 Jul 12 '15 at 05:39

2 Answers

Take a look at http://www.tutorialspoint.com/php/php_regular_expression.htm

\3 is the text captured by group 3
\4 is the text captured by group 4
...and so on...

\w means any word character.
\d means any digit.
\s means any white space.
+ means match the preceding pattern one or more times.
* means match the preceding pattern zero or more times.
{n,m} means match the preceding pattern at least n and at most m times.
{n} means match the preceding pattern exactly n times.
{n,} means match the preceding pattern at least n times.
(...) is a capturing group.
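
To see these tokens in action, here is a minimal sketch using preg_match(), which returns 1 on a match and 0 otherwise. The sample subjects and the $checks array are made up for illustration; the ^ and $ anchors are added so that each pattern has to cover the whole subject.

```php
<?php
// Each entry is 1 when the pattern matches the subject and 0 when it does not.
$checks = [
    'word chars'  => preg_match('/^\w+$/', 'start_Date1'), // \w+     : one or more word characters
    'two digits'  => preg_match('/^\d{2}$/', '99'),        // \d{2}   : exactly two digits
    'one or two'  => preg_match('/^\d{1,2}$/', '5'),       // \d{1,2} : one or two digits
    'two or more' => preg_match('/^\d{2,}$/', '5'),        // \d{2,}  : at least two digits, so '5' fails
    'optional ws' => preg_match('/^\s*=$/', '='),          // \s*     : zero or more whitespace characters
];

foreach ($checks as $label => $matched) {
    echo $label, ': ', $matched ? 'match' : 'no match', PHP_EOL;
}
```

Only the 'two or more' check should report "no match", because a single digit does not satisfy {2,}.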

Jahid

The first thing to point out is that we have an array of patterns ($patterns) and an array of replacements ($replace). Each pattern is paired with the replacement at the same index. Let's take each pattern and replacement and break it down:

Pattern:

/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/

Replacement:

\3/\4/\1\2

This takes a date and converts it from YYYY-M-D format to M/D/YYYY format. Let's break down its components:

/ ... /     # The starting and trailing slashes mark the beginning and end of the expression.
(19|20)     # Matches either 19 or 20, capturing the result as \1.
            # \1 will be 19 or 20.
(\d{2})     # Matches exactly two digits, capturing the result as \2.
            # \2 will be the two digits captured here.
-           # Literal "-" character, not captured.
(\d{1,2})   # Either 1 or 2 digits, capturing the result as \3.
            # \3 will be the one or two digits captured here.
-           # Literal "-" character, not captured.
(\d{1,2})   # Either 1 or 2 digits, capturing the result as \4.
            # \4 will be the one or two digits captured here.
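
One way to check this breakdown is to run the pattern through preg_match() and print what each group captured. This is only a sketch; the $pattern and $matches variable names are my own.

```php
<?php
$pattern = '/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/';

// preg_match() fills $matches: index 0 is the whole match, 1 to 4 are the capture groups.
if (preg_match($pattern, '{startDate} = 1999-5-27', $matches)) {
    print_r($matches);
}
```

For this subject, $matches should hold '1999-5-27', '19', '99', '5' and '27', in that order.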

This match is replaced by \3/\4/\1\2, which means:

\3 # The one or two digits captured in the 3rd set of `()`s, representing the month.
/  # A literal '/'.
\4 # The one or two digits captured in the 4th set of `()`s, representing the day.
/  # A literal '/'.
\1 # Either '19' or '20'; the first two digits of the year (first `()`s).
\2 # The two digits captured in the 2nd set of `()`s, representing the last two digits of the year.
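
As a quick illustration, the first pattern and replacement can be run on their own. The sample sentence and the $converted variable are invented for this sketch; note that preg_replace() rewrites every match in the subject, not just the first.

```php
<?php
$pattern = '/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/';

// \3 = month, \4 = day, \1\2 = the four-digit year put back together.
$converted = preg_replace($pattern, '\3/\4/\1\2', 'Due on 1999-5-27 and 2015-07-12');

echo $converted, PHP_EOL; // Due on 5/27/1999 and 07/12/2015
```

The replacement string is single-quoted so that PHP passes the backslashes through to preg_replace() unchanged.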

Pattern:

/^\s*{(\w+)}\s*=/

Replacement:

$\1 =

This takes a variable name written as {variable} = at the start of the string and converts it to $variable =. The rest of the string (here, the date) is not touched by this pattern. Let's break it down:

/ ... / # The starting and trailing slashes mark the beginning and end of the expression.
^       # Matches the beginning of the string, anchoring the match.
        # If the following character isn't matched exactly at the beginning of the string, the expression won't match.
\s*     # Any whitespace character. This can include spaces, tabs, etc.
        # The '*' means "zero or more occurrences".
        # So, the whitespace is optional, but there can be any amount of it at the beginning of the line.
{       # A literal '{' character.
(\w+)   # One or more 'word' characters (a-z, A-Z, 0-9, _). This is captured in \1.
        # \1 will be the text contained between the { and }, and is the only thing "captured" in this expression.
}       # A literal '}' character.
\s*     # Any whitespace character. This can include spaces, tabs, etc.
=       # A literal '=' character.

This match is replaced by $\1 =, which means:

$  # A literal '$' character.
\1 # The text captured in the 1st and only set of `()`s, representing the variable name.
   # A literal space.
=  # A literal '=' character.
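
To tie it together, here is a small sketch that runs the second pattern alone and then both pairs as arrays, as in the question. The $renamed and $both variable names are my own.

```php
<?php
$subject = '{startDate} = 1999-5-27';

// Second pattern on its own: only the "{name} =" prefix changes.
$renamed = preg_replace('/^\s*{(\w+)}\s*=/', '$\1 =', $subject);
echo $renamed, PHP_EOL; // $startDate = 1999-5-27

// Both pairs together: pattern N is paired with replacement N, and each pair
// is applied in order to the result of the previous one.
$patterns = ['/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/', '/^\s*{(\w+)}\s*=/'];
$replace  = ['\3/\4/\1\2', '$\1 ='];
$both = preg_replace($patterns, $replace, $subject);
echo $both, PHP_EOL; // $startDate = 5/27/1999
```

The '$' in '$\1 =' stays a literal dollar sign because it is followed by a backslash rather than a digit, and the single quotes stop PHP from treating it as a variable.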

Lastly, I wanted to show you a couple of resources. The regex format you're using is called "PCRE", or Perl-Compatible Regular Expressions. Here is a quick cheat-sheet on PCRE for PHP. Over the last few years, several tools have popped up to help you visualize, explain, and test regular expressions. One is Regex 101 (or just Google "regex tester" or "regex visualizer"). If you look here, this is an explanation of the first regex, and here is an explanation of the second. There are others as well, like Debuggex, Regex Tester, etc., but I find the detailed match breakdown on Regex 101 to be pretty useful.

Will