matching both possible rel values in a

Question

I'm currently parsing for webmention endpoints with the following code. It works for either <link rel="webmention" href=" "> or <link rel="http://webmention.org/" href=" ">, but not when both values are included, i.e. <link rel="webmention http://webmention.org/" href=" ">, and I'm struggling to adapt it to handle that case. The current code:

if(preg_match('/<(?:link|a)[ ]+href="([^"]+)"[ ]+rel="webmention"[ ]*\/?>/i', $body, $match)
  || preg_match('/<(?:link|a)[ ]+rel="webmention"[ ]+href="([^"]+)"[ ]*\/?>/i', $body, $match)) {
    $endpoint = $match[1];
} elseif(preg_match('/<(?:link|a)[ ]+href="([^"]+)"[ ]+rel="http:\/\/webmention\.org\/?"[ ]*\/?>/i', $body, $match)
  || preg_match('/<(?:link|a)[ ]+rel="http:\/\/webmention\.org\/?"[ ]+href="([^"]+)"[ ]*\/?>/i', $body, $match)) {
    $endpoint = $match[1];
}

Does anyone have any ideas?

Use http://php.net/manual/en/class.domdocument.php for HTML manipulation. – Awlad Liton, Sep 01 '14 at 12:55
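As a rough illustration of the DOMDocument suggestion, here is a minimal sketch, not a definitive implementation. The function name findWebmentionEndpoint and the example URLs are hypothetical, and the sketch assumes $body holds the fetched HTML as in the question. It treats rel as a whitespace-separated list, which is what lets it handle rel="webmention http://webmention.org/".

```php
<?php
// Hypothetical helper (not from the original post): return the first
// webmention endpoint found in an HTML string, or null if there is none.
function findWebmentionEndpoint(string $body): ?string
{
    if ($body === '') {
        return null; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($body);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Both rel values from the question, with and without the trailing slash.
    $wanted = ['webmention', 'http://webmention.org', 'http://webmention.org/'];

    // Note: this checks every <link> before any <a>, not strict document order.
    foreach (['link', 'a'] as $tagName) {
        foreach ($doc->getElementsByTagName($tagName) as $element) {
            // Split rel into its individual values before comparing.
            $rels = preg_split(
                '/\s+/',
                strtolower(trim($element->getAttribute('rel'))),
                -1,
                PREG_SPLIT_NO_EMPTY
            );
            if (array_intersect($wanted, $rels) && $element->hasAttribute('href')) {
                return $element->getAttribute('href');
            }
        }
    }

    return null;
}
```

With this approach the attribute order, the quoting style, and any extra attributes on the tag no longer matter, since the parser has already normalized them.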

php_nub_qq · Answer 1 · 2014-09-01T13:06:42.283

1

I just wrote this pattern; see if you find it useful:

'/\<link.+?rel=["']?(?:webmention|http\:\/\/webmention\.org\/?)['"]?.*?\>/g'

DEMO

P.S. A word of advice: you use regular expressions as if they cost nothing. Regular expressions should be used only when there is no other way. Especially if $body is a large string, you really should not run so many preg_match calls on it. Cheers!
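For anyone who wants to stay with regular expressions in PHP, the following is a hedged sketch of one way to scan $body only once and still accept several rel values. It is not the pattern above: PHP's preg functions do not accept a g modifier (preg_match_all plays that role), so this version collects the tags first and then inspects each one. The function name findWebmentionEndpointRegex is hypothetical, and the sketch assumes quoted attribute values with no ">" inside them.

```php
<?php
// Hypothetical helper (not from the original post): regex-only lookup
// of the first webmention endpoint in an HTML string.
function findWebmentionEndpointRegex(string $body): ?string
{
    // Collect every <link ...> or <a ...> opening tag in a single pass over $body.
    if (!preg_match_all('/<(?:link|a)\s[^>]*>/i', $body, $tags)) {
        return null;
    }

    foreach ($tags[0] as $tag) {
        // Extract rel and href independently so attribute order does not matter.
        if (!preg_match('/\srel=(["\'])(.*?)\1/i', $tag, $rel)
            || !preg_match('/\shref=(["\'])(.*?)\1/i', $tag, $href)) {
            continue;
        }

        // rel may hold several whitespace-separated values.
        $rels = preg_split('/\s+/', strtolower($rel[2]), -1, PREG_SPLIT_NO_EMPTY);
        if (in_array('webmention', $rels, true)
            || in_array('http://webmention.org', $rels, true)
            || in_array('http://webmention.org/', $rels, true)) {
            return $href[2];
        }
    }

    return null;
}
```

The per-tag checks run on short strings, so the large $body is only traversed by the first preg_match_all call.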

edited Sep 01 '14 at 13:06

answered Sep 01 '14 at 13:00


It was code I'd inherited from another project. Based on the other comment, I'm trying DOMDocument, which seems to work: – Jonny Barnes Sep 01 '14 at 13:07

@JonnyBarnes I think it would be highly inefficient to parse a whole html document into PHP objects in order to pick up one `DOM` element, but that is just my opinion. – php_nub_qq Sep 01 '14 at 13:09
