I'm trying to work out a regex for a string like this one:

*/30 * * * * http://www.domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1

I need to capture the beginning, */30 * * * *, and the ending, /wp-cron.php?doing_wp_cron >/dev/null 2>&1. It doesn't matter what is in between those values.

Also, the beginning part of: */30 * * * *, can also sometimes be these values:

0 * * * * or 0 0 * * *

In that case I'll need a regex for those as well, or, if one regex can handle them all, that would be great! The beginning part is actually held in a string variable, if that matters, and it can be any of the values above.

How can I do this?

So far, I have something like this: ^\*\/30 [\*].+\s[*4]

But my regex skills are slim to none and this only matches the beginning. I still need to match the end, or, if there is a way to put site_url() into the regex as well, match the entire line, which would be preferred.

EDIT

OK, so there is a function, site_url(), which returns the WordPress blog's site URL as a string, like this: http://www.domain.com.

There is also a variable called $updateinterval, which contains one of these string values: */30 * * * *, or 0 * * * *, or 0 0 * * *

Now, I build a string from these variables like so:

$cron_job = $updateinterval . ' ' . site_url() . '/wp-cron.php?doing_wp_cron >/dev/null 2>&1';

It then uses this to insert a cron job from PHP (supposedly). However, in order to remove the cron job (cleaning up when it is no longer needed), I need to use a regex to find the cron job's line and edit the file to remove it.

Now, I can get the FULL string of what is set within $cron_job at any time, even what was set previously. So, if it is possible to match the entire $cron_job string with a regex, that would be best. But how do I use $updateinterval and site_url() within a regex?

So, I have a class that uses preg_grep to remove the line where the cron job is being set. But how to use it to match all of the possible values from the variables? Or just grab the start and end constants? Whichever way is fine with me.

RE-EDIT: So, for example, it needs to search within a file, and if any of these are found by the regex, then it is a match and that line will be removed:

// Let's just say that site_url() resolves to http://www.domain.com for the sake of this. It will be different on a per-site basis, of course.

*/30 * * * * http://www.domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1
0 * * * * http://www.domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1
0 0 * * * http://www.domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1

Anything else should FAIL, anything at all besides the 3 strings above should fail! So the regex should only match the 3 given strings above and that's it. That is to say, whatever site_url() is, it should add it to the regex, if possible.

Solomon Closson
  • How about splitting by spaces? – PoByBolek May 12 '14 at 08:44
  • I don't know, doesn't seem right... don't want to interfere with other strings – Solomon Closson May 12 '14 at 08:49
  • Why is this question voted to be closed? It is a valid question is it not? – Solomon Closson May 12 '14 at 08:58
  • Is `http://www.testing.com` a constant string? – Robin May 12 '14 at 09:09
  • That is also a variable, and yes, it is constant. – Solomon Closson May 12 '14 at 09:12
  • Would something [like this](http://regex101.com/r/yT2aI3) fit your needs? To only match lines containing e.g. `foo` replace the 2nd capturing group `(\S+)` with [(\S+foo\S*)](http://regex101.com/r/iW0lG6). Use with `m` multiline [modifier](http://php.net/manual/en/reference.pcre.pattern.modifiers.php). [Another example](http://regex101.com/r/tE1bQ8) for capturing path. – Jonny 5 May 12 '14 at 10:30
  • Please add samples for input and what exactly you want to match/how output should look like. – Jonny 5 May 12 '14 at 10:42
  • @Jonny5 you should make that an answer, the only improvement you could make is to change the middle group `\s(\S+)\s` to `\s\S+(/\S+)\s` as OP said he only wanted the end of the url. – Mike H-R May 12 '14 at 11:05
  • Updated question, hopefully this explains it better now... – Solomon Closson May 12 '14 at 11:06
  • @Jonny5 - I'm not sure how a bunch of `S` and `s` characters are supposed to match that? Can you explain a bit? Cause it just seems like it will cause problems with other strings somehow... or maybe I misunderstand it...? – Solomon Closson May 12 '14 at 11:15
  • Yeah, this gives me a match, but it shouldn't match it `0 * * * * http://www.a.test/blah.php?doing_wp_cron >/testing/null 2>&1` The beginning and end are constants that need to be matched against. That is the question. – Solomon Closson May 12 '14 at 11:20
  • Thanks anyways, I am trying a bunch of methods, but nothing I do seems to work due to my lack of regex knowledge. :( – Solomon Closson May 12 '14 at 11:27
  • Did I get it right, that you search for lines, that contain a certain word "http://...`needle`.com/..." to remove them? Or that exactly contains, whatever `site_url()` may produce? Why not split then in lines and simply use [strpos](http://www.php.net/manual/de/function.strpos.php)? Sorry, for me is still unclear, what you need :) – Jonny 5 May 12 '14 at 11:37
  • whatever `site_url()` may produce is what I'd like to search for within the file, which is what the regex should match. `site_url()` will always be the same. but it is different on a per site basis, and would like to use the function instead of a constant string for this. Not sure what you mean by split them in lines with `strpos`... – Solomon Closson May 12 '14 at 11:51
  • Ok, I have re-edited my question again, please see Re-Edit, and hopefully that explains it in a nutshell. It should only match the 3 strings, everything else it should not match. Hope that clears it up. Thanks :) – Solomon Closson May 12 '14 at 11:59
  • @SolomonClosson Also see, if something similar [this example](https://eval.in/150054) would fit your needs then. – Jonny 5 May 12 '14 at 12:19
  • @Jonny5 - That looks to be EXACTLY what I need... OMG, you are my regex HERO! – Solomon Closson May 12 '14 at 12:21
  • haha :) great a solution found finally, happy – Jonny 5 May 12 '14 at 12:22

2 Answers

Going on your third re-edit, if you want to be that specific it's really quite easy:

^((\d|\*(/\d{1,2})?)\s){5}http://www\.\S+/.+doing_wp_cron\s>/dev/null\s2>&1$

It really depends on how specific you want to get. I'd replace most of that with \S+ groups (for example the /dev/null and the 2>&1 parts), but that's up to you; I made it as restrictive as possible given your requirements. The following lines are matched:

*/30 * * * * http://www.domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1
0 * * * * http://www.domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1
0 0 * * * http://www.domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1

see here
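
To try the pattern from PHP, a minimal sketch could look like this. The `~` delimiter is my own choice (the pattern contains slashes), the dot after `www` is escaped so it only matches a literal dot, and the fourth sample line is the non-matching example from the comments, adapted to the same domain.

```php
<?php
// Pattern from this answer, wrapped in ~ delimiters because it contains slashes.
$pattern = '~^((\d|\*(/\d{1,2})?)\s){5}http://www\.\S+/.+doing_wp_cron\s>/dev/null\s2>&1$~';

$lines = array(
    '*/30 * * * * http://www.domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1',
    '0 * * * * http://www.domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1',
    '0 0 * * * http://www.domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1',
    // Redirect target changed, so this line should not match.
    '0 * * * * http://www.domain.com/wp-cron.php?doing_wp_cron >/testing/null 2>&1',
);

foreach ($lines as $line) {
    // preg_match returns 1 on a match and 0 on no match.
    echo (preg_match($pattern, $line) === 1 ? 'MATCH: ' : 'FAIL: ') . $line . "\n";
}
```

Note that the schedule part here is generic: each of the five fields may be a single digit, `*`, or `*/NN`, so it accepts more schedules than the three listed in the question.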

Mike H-R

Well, here is my approach then:

$pattern = '~^(?:\*/30\s\*|0\s[*0])(?:\s\*){3}\s'.preg_quote(site_url(),"~").'~';

The first part should match only the cases: */30 * * * *, 0 * * * *, 0 0 * * *

It uses a non-capturing group, (?:...), for the alternation, and (?:\s\*){3} to match the trailing * * *.
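
To check the schedule part on its own, here is a small sketch. The trailing `$` anchor and the last two candidate strings are my additions for the test, not part of the pattern above.

```php
<?php
// Schedule sub-pattern only, anchored at both ends for this check.
$schedule = '~^(?:\*/30\s\*|0\s[*0])(?:\s\*){3}$~';

$candidates = array(
    '*/30 * * * *',  // allowed value of $updateinterval
    '0 * * * *',     // allowed value of $updateinterval
    '0 0 * * *',     // allowed value of $updateinterval
    '*/15 * * * *',  // hypothetical schedule that should be rejected
    '0 0 0 * *',     // hypothetical schedule that should be rejected
);

foreach ($candidates as $candidate) {
    $ok = preg_match($schedule, $candidate) === 1;
    echo ($ok ? 'MATCH: ' : 'FAIL: ') . $candidate . "\n";
}
```

The first alternative covers `*/30 *`, the second covers `0 *` and `0 0`, and the repeated group supplies the remaining three `*` fields.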

Followed by whatever your site_url() outputs. So, just match it:

$arr_input = array(
"*/30 * * * * http://www.testing.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1",
"0 * * * * http://www.domain.com/bar.php?doing_wp_cron >/dev/null 2>&1",
"0 * * * * http://www.a.test/b.php?doing_wp_cron >/dev/null 2>&1",
"*/30 * * * * http://www.domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1",
"0 * * * * http://www.domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1",
"0 0 * * * http://www.domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1");

foreach($arr_input AS $v)
{
  if(preg_match($pattern, $v)) {
    echo "MATCH: ".$v."\n";
  } else {
    echo "FAIL: ".$v."\n";
  }
}

Test. Also see the SO Regex FAQ.
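
Since the question mentions removing the line with preg_grep, here is a rough sketch of that step. It assumes the crontab has already been read into an array of lines. The helper name `remove_wp_cron_lines`, the sample lines and the `$siteUrl` parameter are hypothetical; in WordPress you would pass `site_url()` for the latter. Unlike the pattern above, this variant also quotes the fixed tail of the command and anchors it with `$`, so only the full line matches.

```php
<?php
// Hypothetical helper: returns the lines with the matching WP cron entries removed.
function remove_wp_cron_lines(array $lines, string $siteUrl): array
{
    // Fixed tail of the cron command, as built in the question.
    $tail = '/wp-cron.php?doing_wp_cron >/dev/null 2>&1';

    // Schedule part as above, then the quoted URL and tail, anchored at the end.
    $pattern = '~^(?:\*/30\s\*|0\s[*0])(?:\s\*){3}\s'
        . preg_quote($siteUrl . $tail, '~') . '$~';

    // PREG_GREP_INVERT keeps the lines that do NOT match; array_values reindexes.
    return array_values(preg_grep($pattern, $lines, PREG_GREP_INVERT));
}

// Sample crontab content (made up for illustration).
$lines = array(
    '15 3 * * * /usr/bin/backup.sh',
    '*/30 * * * * http://www.domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1',
    '0 * * * * http://www.a.test/b.php?doing_wp_cron >/dev/null 2>&1',
    '0 0 * * * http://www.domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1',
);

$kept = remove_wp_cron_lines($lines, 'http://www.domain.com');
print_r($kept);
```

Passing the URL in as a parameter keeps the sketch runnable outside WordPress; preg_quote takes care of the dots, the `?` and the other special characters in the URL and tail.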

Jonny 5
  • OMG, now it needs to be like this: `0,30 * * * * wget -O /dev/null http://www.testing.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1`, and `0 * * * * wget -O /dev/null http://www.testing.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1` and `0 0 * * * wget -O /dev/null http://www.testing.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1` instead. Can you help with this please? – Solomon Closson May 14 '14 at 01:30
  • So, just need to add in `wget -O /dev/null ` before the URL. – Solomon Closson May 14 '14 at 01:39
  • Nevermind, I figured it out: `$pattern = '~^(?:0,30\s\*|0\s[*0])(?:\s\*){3}\s(?:wget\s\-O\s\/dev\/null)\s'.preg_quote(site_url(),"~").'~';` Cheers :) – Solomon Closson May 14 '14 at 01:42
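
The wget pattern from the last comment can be checked with a short script like the one below. This is only a sketch: `$siteUrl` stands in for `site_url()` and is set to the URL used in the comments, and the fourth sample line (the old format without wget) is added to show a non-match.

```php
<?php
// Stand-in for site_url(); an assumption for running outside WordPress.
$siteUrl = 'http://www.testing.com';

// Pattern from the comment, with $siteUrl in place of site_url().
$pattern = '~^(?:0,30\s\*|0\s[*0])(?:\s\*){3}\s(?:wget\s\-O\s\/dev\/null)\s'
    . preg_quote($siteUrl, '~') . '~';

$lines = array(
    '0,30 * * * * wget -O /dev/null http://www.testing.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1',
    '0 * * * * wget -O /dev/null http://www.testing.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1',
    '0 0 * * * wget -O /dev/null http://www.testing.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1',
    // Old format without wget, which should no longer match.
    '*/30 * * * * http://www.testing.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1',
);

foreach ($lines as $line) {
    echo (preg_match($pattern, $line) === 1 ? 'MATCH: ' : 'FAIL: ') . $line . "\n";
}
```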