7

I want to find all links in the text like this:

Test text http://hello.world Test text 
http://google.com/file.jpg Test text https://hell.o.wor.ld/test?qwe=qwe Test text 
test text http://test.test/test

I know I need to use preg_match_all, but I have only one idea so far: start the match at http|https|ftp and end it at a space or at the end of the text. That's all I really need for all links to be found properly.

Can anyone help me with a PHP regexp pattern?

I think I need to use assertions at the end of the pattern, but I don't understand how to use them properly yet.

Any ideas? Thanks!

Andy Lester
swamprunner7
  • Does [this regex](http://stackoverflow.com/questions/287144/need-a-good-regex-to-convert-urls-to-links-but-leave-existing-links-alone/10500178#10500178) that I provided before fit the bill for you? – Matt Apr 29 '14 at 14:04
  • 1
    If you say that it needs to start with either http, https or ftp and end with a space, you could simply use `(?:https?|ftp)://\S+`, note that `\S+` means match a non-whitespace character one or more times. – HamZa Apr 29 '14 at 14:04
  • @HamZa, what is ?: in your pattern? – swamprunner7 Apr 29 '14 at 14:49
  • 1
    @swamprunner7 `(?:)` is a non-capturing group, [check this out](http://stackoverflow.com/questions/3512471/non-capturing-group) and bookmark/add to favorites [this reference](http://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean) – HamZa Apr 29 '14 at 17:41
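
The pattern suggested in the comments above can be tried against the sample text from the question. This is only a minimal sketch: the variable names and the `~` delimiter are assumptions chosen for illustration, not part of the original comments.

```php
<?php
// Sample text taken from the question.
$text = "Test text http://hello.world Test text
http://google.com/file.jpg Test text https://hell.o.wor.ld/test?qwe=qwe Test text
test text http://test.test/test";

// (?:https?|ftp) is a non-capturing group: it groups the alternatives
// without adding an extra entry to $matches. "~" is used as the delimiter
// so the slashes in "://" need no escaping.
$pattern = '~(?:https?|ftp)://\S+~i';

preg_match_all($pattern, $text, $matches);

// Only $matches[0] exists, because the pattern has no capturing groups.
print_r($matches[0]);
```

With a capturing group `(https?|ftp)` instead, `$matches[1]` would additionally hold the scheme of every match.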

10 Answers

23

I'd go with something simple like ~[a-z]+://\S+~i

  • starts with protocol [a-z]+://
  • followed by \S+, one or more non-whitespace characters, where \S is shorthand for [^ \t\r\n\f]
  • uses the modifier i (PCRE_CASELESS), which is possibly not really necessary

So it could look like this:

$pattern = '~[a-z]+://\S+~i';

$str = 'Test text http://hello.world Test text 
http://google.com/file.jpg Test text https://hell.o.wor.ld/test?qwe=qwe Test text 
test text http://test.test/test';

if($num_found = preg_match_all($pattern, $str, $out))
{
  echo "FOUND ".$num_found." LINKS:\n";
  print_r($out[0]);
}

outputs:

FOUND 4 LINKS:
Array
(
    [0] => http://hello.world
    [1] => http://google.com/file.jpg
    [2] => https://hell.o.wor.ld/test?qwe=qwe
    [3] => http://test.test/test
)

Test on eval.in
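
One caveat with `\S+` is that it also consumes punctuation that directly follows a URL, such as a sentence-ending period or a closing parenthesis. The following is a minimal sketch of one way to handle that, assuming it is acceptable to trim a small, hypothetical set of trailing characters; the sample sentence is made up for illustration.

```php
<?php
// Hypothetical sample sentence; not from the original question.
$str = 'See http://example.com/page. Also (http://example.org/a?b=c), ok';

preg_match_all('~[a-z]+://\S+~i', $str, $out);

// rtrim() with a character list strips any of these characters from the right.
// The list is an assumption; it would also strip a ")" that legitimately
// belongs to a URL.
$links = array_map(function ($url) {
    return rtrim($url, '.,;:!?)');
}, $out[0]);

print_r($links);
```

Whether trimming is appropriate depends on the input; URLs that really end in one of these characters would be shortened.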

HamZa
Jonny 5
4
function turnUrlIntoHyperlink($string){
    //The Regular Expression filter
    $reg_exUrl = "/(?i)\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))/";

    // Check if there is a url in the text
    if(preg_match_all($reg_exUrl, $string, $url)) {

        // Loop through all matches
        foreach($url[0] as $newLinks){
            if(strstr( $newLinks, ":" ) === false){
                $link = 'http://'.$newLinks;
            }else{
                $link = $newLinks;
            }

            // Create Search and Replace strings
            $search  = $newLinks;
            $replace = '<a href="'.$link.'" title="'.$newLinks.'" target="_blank">'.$link.'</a>';
            $string = str_replace($search, $replace, $string);
        }
    }

    //Return result
    return $string;
}
Youssef NAIT
  • It's nice that this method captures URLs without http/https at the beginning, but I believe this method will not work properly if the same URL appears multiple times in the text. – smoyth May 04 '21 at 18:36
2
<?php

// The Regular Expression filter
$reg_exUrl = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";

// The Text you want to filter for urls
$text = "The text you want to filter goes here. http://google.com";

// Check if there is a url in the text
if(preg_match($reg_exUrl, $text, $url)) {

       // make the urls hyper links
       echo preg_replace($reg_exUrl, "<a href=\"{$url[0]}\">{$url[0]}</a> ", $text);

} else {

       // if no urls in the text just return the text
       echo $text;

}
?>

Reference: http://css-tricks.com/snippets/php/find-urls-in-text-make-links/

hellosheikh
  • There's a comment on this article showing some errors in the above code which are easily fixed. The code works fantastic as long as there is only one URL in the text. But if you add another, it simply keeps repeating the first URL over. – wordman Nov 03 '18 at 16:34
  • Works perfectly. thanks for the code – Eben Watts Mar 14 '21 at 10:28
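
The repetition described in the first comment comes from inserting the result of a single `preg_match` call into every replacement. A minimal sketch of one possible fix, assuming the same pattern is kept, is to use the `$0` backreference, which `preg_replace` fills with the text matched at each position. The sample text is hypothetical.

```php
<?php
$reg_exUrl = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";

// Hypothetical sample text containing two different URLs.
$text = "First http://google.com then http://example.org/page end";

// $0 stands for the whole match at each position, so no prior preg_match()
// call is needed. Single quotes keep the "$" literal for preg_replace().
$linked = preg_replace($reg_exUrl, '<a href="$0">$0</a>', $text);

echo $linked;
```

Text without any URL is returned unchanged by `preg_replace`, so the separate `else` branch is not required in this variant.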
2

Works like a charm. Use this:

$str= "Test text http://hello.world";
preg_match_all('/\b(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)[-A-Z0-9+&@#\/%=~_|$?!:,.]*[A-Z0-9+&@#\/%=~_|$]/i', $str, $result, PREG_PATTERN_ORDER);
print_r($result[0]);
Vinoth Sd
1

The suggested answers are great, but one of them misses the www. case and the other misses http://

So, let's combine all of those:

$text = 'Test text http://hello.world Test text 
http://google.com/file.jpg Test text https://hell.o.wor.ld/test?qwe=qwe Test text 
test text http://test.test/test';

preg_match_all('/(((http|https|ftp|ftps)\:\/\/)|(www\.))[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,}(\:[0-9]+)?(\/\S*)?/', $text, $results, PREG_PATTERN_ORDER);

print_r($results[0]);

The return value for PREG_PATTERN_ORDER will be Array of Arrays (results) so that $results[0] is an array of full pattern matches, $results[1] is an array of strings matched by the first parenthesized subpattern, and so on.
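
To make the difference in result layout concrete, here is a minimal sketch comparing `PREG_PATTERN_ORDER` with `PREG_SET_ORDER`. The shortened pattern with two capturing groups (scheme and host) and the sample text are assumptions made for illustration, not the pattern from this answer.

```php
<?php
// Hypothetical sample text.
$text = 'Visit http://hello.world and https://google.com/file.jpg today';

// Group 1 captures the scheme, group 2 the host part.
$pattern = '~(https?|ftp)://([^/\s]+)\S*~';

// PREG_PATTERN_ORDER (the default): one array per group, indexed by match.
preg_match_all($pattern, $text, $byPattern, PREG_PATTERN_ORDER);
print_r($byPattern[0]); // all full matches
print_r($byPattern[1]); // all schemes

// PREG_SET_ORDER: one array per match, indexed by group.
preg_match_all($pattern, $text, $bySet, PREG_SET_ORDER);
foreach ($bySet as $match) {
    echo $match[0], ' uses scheme ', $match[1], "\n";
}
```

`PREG_SET_ORDER` tends to be more convenient when each match is processed together with its own groups, while `PREG_PATTERN_ORDER` is handy when only the list of full matches is needed.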

Johnny
1

function turnUrlIntoHyperlink($string) {
    // The Regular Expression filter
    $reg_exUrl = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";

    // Check if there is a url in the text
    if (preg_match($reg_exUrl, $string, $url)) {
        // make the urls hyper links
        echo preg_replace($reg_exUrl, "<a target='_blank' href='{$url[0]}'>{$url[0]}</a>", $string);
    } else {
        // if no urls in the text just return the text
        echo $string;
    }
}

Eben Watts
0

An alternative to regexp is to use this library.

It works very well, but not for very complex code.

foreach($html->find('a') as $element) 
       echo $element->href . '<br>';

It is also easy to use. No regular expression skills required :-)

David
0

This is not a regexp, but it finds all the links and makes sure that they are not already wrapped in a tag. It also checks whether the link is enclosed in (), [], "" or any other pair of opening and closing punctuation.

$txt = "Test text http://hello.world Test text 
http://google.com/file.jpg Test text https://hell.o.wor.ld/test?qwe=qwe Test text 
test text http://test.test/test <a href=\"http://example.com\">I am already linked up</a>
It was also done in 1927 (http://test.com/reference) Also check this out:http://test/index&t=27";
$holder = explode("http",$txt);
for($i = 1; $i < (count($holder));$i++) {
    if (substr($holder[$i-1],-6) != 'href="') { // this means that the link is not already in an a tag.
        if (strpos($holder[$i]," ")!==false) //if the link is not the last item in the text block, stop at the first space
            $href = substr($holder[$i],0,strpos($holder[$i]," "));
        else                                //else it is the last item, take it
            $href = $holder[$i];
        if (ctype_punct(substr($holder[$i-1],strlen($holder[$i-1])-1)) && ctype_punct(substr($holder[$i],strlen($holder[$i])-1)))
            $href = substr($href,0,-1);     //if both the front and back of the link are enclosed in punctuation, truncate the link by one
        $holder[$i] = implode("$href\" target=\"_blank\" class=\"link\">http$href</a>",explode($href,$holder[$i]));
        $holder[$i-1] .= "<a href=\"";
    }
}
$txt = implode("http",$holder);

echo $txt;

Output:

Test text <a href="http://hello.world" target="_blank" class="link">http://hello.world</a> Test text 
<a href="http://google.com/file.jpg" target="_blank" class="link">http://google.com/file.jpg</a> Test text <a href="https://hell.o.wor.ld/test?qwe=qwe" target="_blank" class="link">https://hell.o.wor.ld/test?qwe=qwe</a> Test text 
test text <a href="http://test.test/test" target="_blank" class="link">http://test.test/test</a> <a href="http://example.com">I am already linked up</a>
It was also done in 1927 (<a href="http://test.com/reference" target="_blank" class="link">http://test.com/reference</a>) Also check this out:<a href="http://test/index&amp;t=27" target="_blank" class="link">http://test/index&amp;t=27</a>
0

For converting URLs to tags, and recognizing URLs without http/https, try the below. It uses preg_replace_callback to avoid the issue in one of the other answers with the same URL appearing multiple times:

  private function convertUrls($string) {
    $url_pattern = '/(((http|https)\:\/\/)|(www\.))[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,}(\:[0-9]+)?(\/\S*)?/';
    return preg_replace_callback($url_pattern,
      function($matches) {
        $match = $matches[0];
        if (strstr($match, ":") === false) {
          $url = "https://$match";
        } else {
          $url = $match;
        }
        return '<a href="' . $url .'" target="_blank">' . $url . '</a>';
      },
      $string);
  }
smoyth
-1

I use this function:

  <?php
    function deteli($string){
        $pos = strpos($string, 'http');
        // strpos() returns false when nothing is found, so compare strictly
        if ($pos === false) {
            return $string;
        }
        $spos = strpos($string, ' ', $pos);
        // if no space follows, the link runs to the end of the string
        if ($spos === false) {
            $spos = strlen($string);
        }
        $bef  = substr($string, 0, $pos);
        $link = substr($string, $pos, $spos - $pos);
        $aft  = substr($string, $spos);
        return $bef . "<a href='" . $link . "' class='link' target='_blank'>link</a>" . $aft;
    }?>
  • 1
    Welcome to Stack Overflow! Answers [should consist of](https://stackoverflow.com/help/how-to-answer) more than a mere code-dump. If you think the question is poorly asked, you can flag it or post a comment. – RaminS Feb 13 '19 at 21:12