PHP find all links in the text

Question

I want to find all links in a text like this:

Test text http://hello.world Test text 
http://google.com/file.jpg Test text https://hell.o.wor.ld/test?qwe=qwe Test text 
test text http://test.test/test

I know I need to use preg_match_all, but I only have a rough idea: start matching at http, https or ftp and stop at a space or at the end of the text. That is all I really need for every link to be found properly.

Can anyone help me with a PHP regexp pattern?

I think I need to use assertions at the end of the pattern, but I don't understand how to use them properly yet.

Any ideas? Thanks!

Does [This regex][1] that I provided before fit the bill for you? [1]: http://stackoverflow.com/questions/287144/need-a-good-regex-to-convert-urls-to-links-but-leave-existing-links-alone/10500178#10500178 — Matt, Apr 29 '14 at 14:04
If you say that it needs to start with either http, https or ftp and end with a space, you could simply use `(?:https?|ftp)://\S+`, note that `\S+` means match a non-whitespace character one or more times. — HamZa, Apr 29 '14 at 14:04
@swamprunner7 `(?:)` is a non-capturing group, [check this out](http://stackoverflow.com/questions/3512471/non-capturing-group) and bookmark/add to favorites [this reference](http://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean) — HamZa, Apr 29 '14 at 17:41

score 23 · Accepted Answer · edited Apr 29 '14 at 14:30

I'd go with something simple like ~[a-z]+://\S+~i

[a-z]+:// matches the protocol at the start
\S+ matches the one or more non-whitespace characters that follow, where \S is a shorthand for [^ \t\r\n\f]
the i modifier (PCRE_CASELESS) is used, though it is possibly not really necessary

So it could look like this:

$pattern = '~[a-z]+://\S+~i';

$str = 'Test text http://hello.world Test text 
http://google.com/file.jpg Test text https://hell.o.wor.ld/test?qwe=qwe Test text 
test text http://test.test/test';

if($num_found = preg_match_all($pattern, $str, $out))
{
  echo "FOUND ".$num_found." LINKS:\n";
  print_r($out[0]);
}

outputs:

FOUND 4 LINKS:
Array
(
    [0] => http://hello.world
    [1] => http://google.com/file.jpg
    [2] => https://hell.o.wor.ld/test?qwe=qwe
    [3] => http://test.test/test
)

Test on eval.in
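
One caveat with \S+ is that it also swallows punctuation that directly follows a link in running text, such as a comma or a closing parenthesis. The following is a minimal sketch of one way to deal with that, assuming the scheme is limited to http, https and ftp as in HamZa's comment on the question. The helper name findLinks and the list of trimmed characters are assumptions for illustration, not part of the answer.

```php
<?php
// Hypothetical helper: restricts the scheme to http, https or ftp and
// trims punctuation that commonly trails a URL in running text.
function findLinks(string $text): array
{
    preg_match_all('~(?:https?|ftp)://\S+~i', $text, $out);

    // rtrim() with a character list strips only these trailing characters.
    return array_map(
        function (string $url): string {
            return rtrim($url, '.,;:!?)');
        },
        $out[0]
    );
}

print_r(findLinks('See http://hello.world, then (https://hell.o.wor.ld/test?qwe=qwe).'));
```

Note that this would also trim a closing parenthesis or question mark that really belongs to the URL, so the character list may need adjusting for your data.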

edited Apr 29 '14 at 14:30 by HamZa

answered Apr 29 '14 at 14:18 by Jonny 5

Need to test it more, but it seems to work! Thank you so much! :) Now I will use your pattern to find all links and check them for files. The whole idea is to find all file links, but some sites now like to use pretty links like test.com/superfile without an extension, so this code can help me a lot :) – swamprunner7 Apr 29 '14 at 14:27
Welcome, glad I could be of help @swamprunner7 – Jonny 5 Apr 29 '14 at 14:30
Great, but how can I do it if I need to find links that are inside a tag? – Luis Alfredo Serrano Díaz May 24 '20 at 02:38
This worked better for me than https://stackoverflow.com/a/5690614/1436129 – aubreypwd Aug 04 '20 at 16:55
Thank you working perfectly as expected. – saravana Apr 13 '21 at 11:20

score 4 · Answer 2 · edited Jan 30 '19 at 12:01

function turnUrlIntoHyperlink($string){
    //The Regular Expression filter
    $reg_exUrl = "/(?i)\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))/";

    // Check if there is a url in the text
    if(preg_match_all($reg_exUrl, $string, $url)) {

        // Loop through all matches
        foreach($url[0] as $newLinks){
            if(strstr( $newLinks, ":" ) === false){
                $link = 'http://'.$newLinks;
            }else{
                $link = $newLinks;
            }

            // Create Search and Replace strings
            $search  = $newLinks;
            $replace = '<a href="'.$link.'" title="'.$newLinks.'" target="_blank">'.$link.'</a>';
            $string = str_replace($search, $replace, $string);
        }
    }

    //Return result
    return $string;
}

It's nice that this method captures URLs without http/https at the beginning, but I believe this method will not work properly if the same URL appears multiple times in the text. — smoyth, May 04 '21 at 18:36

score 2 · Answer 3 · answered Apr 29 '14 at 14:01

<?php

// The Regular Expression filter
$reg_exUrl = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";

// The Text you want to filter for urls
$text = "The text you want to filter goes here. http://google.com";

// Check if there is a url in the text
if(preg_match($reg_exUrl, $text, $url)) {

       // make the urls hyper links (note: every match is replaced with the first URL found)
       echo preg_replace($reg_exUrl, "<a href=\"{$url[0]}\">{$url[0]}</a> ", $text);

} else {

       // if no urls in the text just return the text
       echo $text;

}
?>

Reference: http://css-tricks.com/snippets/php/find-urls-in-text-make-links/

answered Apr 29 '14 at 14:01 by hellosheikh

There's a comment on this article showing some errors in the above code which are easily fixed. The code works fantastic as long as there is only one URL in the text. But if you add another, it simply keeps repeating the first URL over. – wordman Nov 03 '18 at 16:34
Works perfectly. thanks for the code – Eben Watts Mar 14 '21 at 10:28
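
The repeated-first-URL behaviour described in the comment above comes from putting {$url[0]} into the replacement string. A minimal sketch of one possible fix is to use the $0 backreference, which preg_replace substitutes with the text of each individual match. The function name linkifyAll is hypothetical, and the pattern is copied from the answer unchanged.

```php
<?php
// Hypothetical helper name; the pattern is the one from the answer above.
function linkifyAll(string $text): string
{
    $reg_exUrl = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";

    // $0 in the replacement stands for the whole text of the current match,
    // so each URL is wrapped in its own link instead of reusing the first one.
    return preg_replace($reg_exUrl, '<a href="$0">$0</a>', $text);
}

echo linkifyAll("See http://google.com and http://example.org/page for details.");
```

No separate preg_match check is needed here, because preg_replace returns the text unchanged when nothing matches.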

score 2 · Answer 4 · answered Apr 29 '14 at 14:12

Works like a charm. Use this:

$str= "Test text http://hello.world";
preg_match_all('/\b(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)[-A-Z0-9+&@#\/%=~_|$?!:,.]*[A-Z0-9+&@#\/%=~_|$]/i', $str, $result, PREG_PATTERN_ORDER);
print_r($result[0]);

answered Apr 29 '14 at 14:12 by Vinoth Sd

Why did you choose not to catch http? – Johnny Feb 12 '17 at 12:24
It is caught: in https? the s? is optional, so the s may or may not be present – Vladimir May 25 '20 at 19:02

score 1 · Answer 5 · answered Feb 12 '17 at 13:21

The suggested answers are great, but one of them misses the www. case and another misses http://

So, let's combine all of those:

$text = 'Test text http://hello.world Test text 
http://google.com/file.jpg Test text https://hell.o.wor.ld/test?qwe=qwe Test text 
test text http://test.test/test';

// {2,} rather than {2,3}, so that longer endings such as .world or .test are not cut short
preg_match_all('/(((http|https|ftp|ftps)\:\/\/)|(www\.))[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,}(\:[0-9]+)?(\/\S*)?/', $text, $results, PREG_PATTERN_ORDER);

print_r($results[0]);

With PREG_PATTERN_ORDER, the matches are stored as an array of arrays: $results[0] is an array of full pattern matches, $results[1] is an array of strings matched by the first parenthesized subpattern, and so on.
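
To make the shape of the matches array concrete, here is a small sketch that compares PREG_PATTERN_ORDER with PREG_SET_ORDER. It uses a simplified stand-in pattern and made-up sample text (both are assumptions for illustration, not the pattern from this answer).

```php
<?php
$text = 'See http://example.com/a and www.example.org';
// Simplified stand-in pattern: group 1 is the prefix, "http(s)://" or "www."
$pattern = '/((https?:\/\/)|(www\.))\S+/';

// PREG_PATTERN_ORDER (the default): one array per group, indexed by match.
preg_match_all($pattern, $text, $byPattern, PREG_PATTERN_ORDER);
print_r($byPattern[0]); // full matches
print_r($byPattern[1]); // first parenthesized subpattern of each match

// PREG_SET_ORDER: one array per match, indexed by group.
preg_match_all($pattern, $text, $bySet, PREG_SET_ORDER);
foreach ($bySet as $match) {
    echo $match[0], ' starts with ', $match[1], "\n";
}
```

PREG_SET_ORDER tends to be more convenient when each match is processed together with its own groups, while PREG_PATTERN_ORDER is handy when only the list of full matches is needed.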

score 1 · Answer 6 · answered Mar 14 '21 at 10:29

function turnUrlIntoHyperlink($string) {
    // The Regular Expression filter
    $reg_exUrl = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";

    // Check if there is a url in the text
    if (preg_match($reg_exUrl, $string)) {
        // make the urls hyper links; $0 stands for each matched URL in turn
        echo preg_replace($reg_exUrl, '<a target="_blank" href="$0">$0</a>', $string);
    } else {
        // if no urls in the text just return the text
        echo $string;
    }
}

score 0 · Answer 7 · answered Apr 29 '14 at 14:30

As an alternative to regexp, you can use this library.

It works very well, but not for very complex code.

foreach($html->find('a') as $element) 
       echo $element->href . '<br>';

And it is easy to use. No regular expression skills required :-)
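
For input that really is HTML, a similar result can be sketched without any third-party library by using PHP's bundled DOM extension instead. This is a different technique from the library call shown above, and the sample markup and URLs are made up for illustration.

```php
<?php
// Sample HTML input; the markup and URLs are made up for illustration.
$html = '<p>See <a href="http://example.com/a">first</a> and '
      . '<a href="https://example.org/b">second</a>.</p>';

$doc = new DOMDocument();
// Collect libxml warnings internally instead of printing them,
// since real-world markup is often imperfect.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$links = [];
foreach ($doc->getElementsByTagName('a') as $anchor) {
    // getAttribute() returns an empty string when href is missing.
    $href = $anchor->getAttribute('href');
    if ($href !== '') {
        $links[] = $href;
    }
}

print_r($links);
```

This only finds links that are already inside a tags, so it does not help with plain text like the sample in the question.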

answered Apr 29 '14 at 14:30 by David

There is no html code, so there's no `a` tags to parse – HamZa Apr 29 '14 at 14:31

score 0 · Answer 8 · answered Aug 25 '20 at 01:45

Not regexp, but it finds all the links and makes sure that they are not already wrapped in a tag. It also checks that the link isn't enclosed in (), [], "" or anything else with an opening and a closing character.

$txt = "Test text http://hello.world Test text 
http://google.com/file.jpg Test text https://hell.o.wor.ld/test?qwe=qwe Test text 
test text http://test.test/test <a href=\"http://example.com\">I am already linked up</a>
It was also done in 1927 (http://test.com/reference) Also check this out:http://test/index&t=27";
$holder = explode("http",$txt);
for($i = 1; $i < (count($holder));$i++) {
    if (substr($holder[$i-1],-6) != 'href="') { // this means that the link is not already in an a tag.
        if (strpos($holder[$i]," ")!==false) //if the link is not the last item in the text block, stop at the first space
            $href = substr($holder[$i],0,strpos($holder[$i]," "));
        else                                //else it is the last item, take it
            $href = $holder[$i];
        if (ctype_punct(substr($holder[$i-1],strlen($holder[$i-1])-1)) && ctype_punct(substr($holder[$i],strlen($holder[$i])-1)))
            $href = substr($href,0,-1);     //if both the front and back of the link are enclosed in punctuation, truncate the link by one
        $holder[$i] = implode("$href\" target=\"_blank\" class=\"link\">http$href</a>",explode($href,$holder[$i]));
        $holder[$i-1] .= "<a href=\"";
    }
}
$txt = implode("http",$holder);

echo $txt;

Output:

Test text <a href="http://hello.world" target="_blank" class="link">http://hello.world</a> Test text 
<a href="http://google.com/file.jpg" target="_blank" class="link">http://google.com/file.jpg</a> Test text <a href="https://hell.o.wor.ld/test?qwe=qwe" target="_blank" class="link">https://hell.o.wor.ld/test?qwe=qwe</a> Test text 
test text <a href="http://test.test/test" target="_blank" class="link">http://test.test/test</a> <a href="http://example.com">I am already linked up</a>
It was also done in 1927 (<a href="http://test.com/reference" target="_blank" class="link">http://test.com/reference</a>) Also check this out:<a href="http://test/index&t=27" target="_blank" class="link">http://test/index&t=27</a>

score 0 · Answer 9 · answered May 04 '21 at 18:46

For converting URLs to tags, and recognizing URLs without http/https, try the below. It uses preg_replace_callback to avoid the issue in one of the other answers with the same URL appearing multiple times:

  private function convertUrls($string) {
    $url_pattern = '/(((http|https)\:\/\/)|(www\.))[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,}(\:[0-9]+)?(\/\S*)?/';
    return preg_replace_callback($url_pattern,
      function($matches) {
        $match = $matches[0];
        if (strstr($match, ":") === false) {
          $url = "https://$match";
        } else {
          $url = $match;
        }
        return '<a href="' . $url .'" target="_blank">' . $url . '</a>';
      },
      $string);
  }
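
As a usage illustration, here is a standalone sketch of the same idea that can be run outside a class. The function name convertUrlsDemo and the sample text are assumptions. The point it shows is that preg_replace_callback handles each occurrence exactly once, even when the same URL appears twice.

```php
<?php
// Standalone variant of the method above (hypothetical function name),
// so it can be called outside a class.
function convertUrlsDemo(string $string): string
{
    $url_pattern = '/(((http|https)\:\/\/)|(www\.))[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,}(\:[0-9]+)?(\/\S*)?/';
    return preg_replace_callback(
        $url_pattern,
        function (array $matches): string {
            $match = $matches[0];
            // Prepend a scheme only when the match has none (the www. case).
            $url = (strstr($match, ':') === false) ? "https://$match" : $match;
            return '<a href="' . $url . '" target="_blank">' . $url . '</a>';
        },
        $string
    );
}

// The same URL appears twice; each occurrence is replaced exactly once.
echo convertUrlsDemo('Visit www.example.com today, then www.example.com again.');
```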

score -1 · Answer 10 · edited Feb 13 '19 at 22:15

I use this function:

  <?php
    // Converts only the first link in the text.
    function deteli($string){
        $pos = strpos($string, 'http');
        if ($pos === false) {
            // no link found, return the text unchanged
            return $string;
        }
        $spos = strpos($string, ' ', $pos);
        if ($spos === false) {
            // the link runs to the end of the text
            $spos = strlen($string);
        }
        $bef  = substr($string, 0, $pos);
        $link = substr($string, $pos, $spos - $pos);
        $aft  = substr($string, $spos);
        return $bef . "<a href='" . $link . "' class='link' target='_blank'>link</a>" . $aft;
    }?>

edited Feb 13 '19 at 22:15

answered Feb 13 '19 at 20:58 by Ismaeel Akach

Welcome to Stack Overflow! Answers [should consist of](https://stackoverflow.com/help/how-to-answer) more than a mere code-dump. If you think the question is poorly asked, you can flag it or post a comment. – RaminS Feb 13 '19 at 21:12
