
Here is a pretty presentable example of what I want to do dynamically.

Suppose someone enters a string in a textarea like this

"The best search engine is www.google.com."

or maybe

"The best search engine is https://www.google.co.in/?gfe_rd=cr&ei=FLB1U4HHG6aJ8Qfc1YHIBA."

Then I want to highlight the link the way Stack Overflow does. I also want to use file_get_contents to fetch one image, a short description, and the title of the page.

Most likely I need to check whether the string contains a URL twice:

  • On keyup of the textarea, using jQuery, and then calling file_get_contents on the server
  • When the string is received by PHP.

How can I do this?

UPDATE

function parseHyperlinks($text) {
// The Regular Expression filter
$reg_exUrl1 = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";
$reg_exUrl2 = "/[\w\d\.]+\.(com|org|ca|net|uk)/";
// The Text you want to filter for urls

// Check if there is a url in the text
if(preg_match($reg_exUrl1, $text, $url)) {

       // make the urls hyper links
       return preg_replace($reg_exUrl1, "<a class=\"content-link link\" href=\"{$url[0]}\">{$url[0]}</a> ", $text);

} else if(preg_match($reg_exUrl2, $text, $url)){

       return preg_replace($reg_exUrl2, "<a class=\"content-link link\" href=\"{$url[0]}\">{$url[0]}</a> ", $text);

}else{

       // if no urls in the text just return the text
       return $text;

}
}
  • This works only if the text contains a single URL, e.g. $str='www.google.com is the best' or $str='http://www.google.com is best', but not if it contains both forms, e.g. $str='http://stackoverflow.com/ and www.google.com is the best'
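
The two-URL failure follows from the code itself: preg_match captures only the first match, preg_replace then writes that same {$url[0]} over every match, and the else-if means the second pattern never runs once the first one matches. Below is a minimal sketch of the per-match alternative, written in JavaScript so it can also serve the keyup check. `linkifyAll` and its pattern are hypothetical, not a complete URL grammar, and the sketch assumes the text has already been HTML-escaped. In PHP the analogous change is to reference the current match in the replacement with `$0` instead of `{$url[0]}`.

```javascript
// Hypothetical helper: wraps every URL in `text` in an anchor tag.
// A single pattern covers both forms, and each match is replaced by itself.
function linkifyAll(text) {
  // Scheme-prefixed URLs, or bare hosts starting with "www."
  var urlPattern = /\b(?:(?:https?|ftps?):\/\/|www\.)[^\s<>"']+/gi;
  return text.replace(urlPattern, function (match) {
    // Keep trailing sentence punctuation outside the link.
    var trailing = (match.match(/[.,!?;:]+$/) || [''])[0];
    var url = match.slice(0, match.length - trailing.length);
    // Bare "www." hosts need a scheme to be usable as an href.
    var href = /^www\./i.test(url) ? 'http://' + url : url;
    return '<a class="content-link link" href="' + href + '">' + url + '</a>' + trailing;
  });
}
```

Because the replacement is computed from each match, 'http://stackoverflow.com/ and www.google.com is the best' gets two separate links, each pointing at its own URL.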
HackerManiac
  • Please make an effort to solve the problem yourself, and show us what you've tried, before asking for help here. StackOverflow is not a service for writing free code. – Blazemonger May 16 '14 at 18:14
  • updated @Blazemonger plz take a look – HackerManiac May 16 '14 at 18:20
  • http://stackoverflow.com/questions/161738/what-is-the-best-regular-expression-to-check-if-a-string-is-a-valid-url – Blazemonger May 16 '14 at 18:40
  • And how do I use `file_get_contents` to get one image, a short description, and the title of the page? @Blazemonger – HackerManiac May 16 '14 at 18:47
  • http://php.net/function.file-get-contents -- if you're asking how to parse the resulting document and extract certain elements, that's a much broader problem. – Blazemonger May 16 '14 at 18:49
  • That's a different question than your original one, and should be posted separately (after you take some time to research it yourself, of course). – Blazemonger May 16 '14 at 18:52

1 Answer


First create the HTML, then make an AJAX request to the server. Consider this sample code:

HTML/jQuery:

<!-- instead of textarea, you could use an editable div for styling highlights, or if you want, just use a plugin -->
<div id="textarea" 
    style="
    font-family: monospace;
    white-space: pre;
    width: 300px;
    height: 200px;
    border: 1px solid #ccc;
    padding: 5px;">For more tech stuff, check out http://www.tomshardware.com/ for news and updates.</div><br/>
<button type="button" id="scrape_site">Scrape</button><br/><br/>
<!-- I just used a button to trigger the scraping; you could bind it to keyup/keydown instead. -->

<div id="site_output" style="width: 500px;">
    <label>Site: <p id="site" style="background-color: gray;"></p></label>
    <label>Title: <p id="title" style="background-color: gray;"></p></label>
    <label>Description: <p id="description" style="background-color: gray;"></p></label>
    <label>Image: <div id="site_image"></div></label>
</div>

<script type="text/javascript" src="jquery.min.js"></script>
<script type="text/javascript">
$(document).ready(function(){

    $('#scrape_site').on('click', function(){
        var value = $.trim($('#textarea').text());
        $('#site, #title, #description').text('');
        $('#site_image').empty();
        $.ajax({
            url: 'index.php', // or whichever PHP script will process the text
            type: 'POST',
            data: {scrape: true, text: value},
            dataType: 'JSON',
            success: function(response) {
                $('#site').text(response.url);
                $('#title').text(response.title);
                $('#description').text(response.description);
                // no id on the img: "site_image" is already the id of the container div
                $('#site_image').html('<img src="'+response.src+'" />');
            }
        });
    });

    // you can use an editable div so that it can be styled;
    // there's too much code in this answer already, so consider a highlighter plugin for the highlighting
    $('#textarea').each(function(){
        this.contentEditable = true;
    });

});
</script>
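
Binding to keyup instead of the button could look roughly like the following. This is a minimal sketch, not part of the sample above: `findFirstUrl` is a hypothetical helper, the 500 ms pause is an arbitrary choice, and the wiring assumes the `#textarea` and `#scrape_site` elements from the markup. It only triggers the existing click handler once typing pauses and the text contains something that looks like a URL.

```javascript
// Hypothetical helper: returns the first URL-like token in `text`, or null.
function findFirstUrl(text) {
  var match = /\b(?:https?:\/\/|www\.)[^\s<>"']+/i.exec(text);
  if (!match) { return null; }
  // Drop trailing sentence punctuation such as the final "." of a sentence.
  return match[0].replace(/[.,!?;:]+$/, '');
}

// Browser-only wiring; skipped when jQuery is not loaded.
if (typeof jQuery !== 'undefined') {
  jQuery(function ($) {
    var timer = null;
    $('#textarea').on('keyup', function () {
      var text = $(this).text();
      clearTimeout(timer);
      // Wait for a pause in typing so the server is not hit on every keystroke.
      timer = setTimeout(function () {
        if (findFirstUrl(text) !== null) {
          $('#scrape_site').trigger('click');
        }
      }, 500);
    });
  });
}
```

Waiting for a pause matters here because each request makes the server fetch a remote page, which is slow compared with a keystroke.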

And in the PHP script that processes the request (index.php in this case):

<?php
if(isset($_POST['scrape'])) {

    $text = $_POST['text'];

    // EXTRACT URL
    $reg_exurl = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";
    preg_match_all($reg_exurl, $text, $matches);
    $usedPatterns = array();
    $url = '';
    foreach($matches[0] as $pattern){
        if(!array_key_exists($pattern, $usedPatterns)){
            $usedPatterns[$pattern] = true;
            $url = $pattern;
        }
    }

    // EXTRACT VALUES (scraping of title and descriptions)

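    if ($url === '') {
        // no URL in the text: return empty fields instead of loading an empty path
        echo json_encode(array('url' => '', 'title' => '', 'description' => '', 'src' => ''));
        exit;
    }

    // real-world HTML is often malformed; collect parser warnings instead of printing them
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTMLFile($url);
    $xpath = new DOMXPath($doc);
    // the page may have no <title>, so check before reading nodeValue
    $titleNode = $xpath->query('//title')->item(0);
    $title = $titleNode ? trim($titleNode->nodeValue) : '';
    $description = $xpath->query('/html/head/meta[@name="description"]/@content');
    if ($description->length == 0) {
        $description = "No description meta tag :(";
    } else {
        // Found one or more descriptions, use the first one
        $description = $description->item(0)->nodeValue;
    }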

    $data['description'] = $description;
    $data['title'] = $title;
    $data['url'] = $url;

    // SCRAPING OF IMAGE (the weirdest part)
    $image_found = false;
    $data['src'] = '';
    $images = array();

    // get all possible images and this is a little BIT TOUGH
    // check for og:image (facebook), some sites have this, so first lets take a look on this meta
    $facebook_ogimage = $xpath->query("/html/head/meta[@property='og:image']/@content");
    foreach($facebook_ogimage as $ogimage) {
        $data['src'] = $ogimage->nodeValue;
        $image_found = true;
    }

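    // desperation search (get images); the 'ad' check below is a crude ad filter
    // that also skips any path merely containing "ad", such as "uploads" or "header"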
    if(!$image_found) {
        $image_list = $xpath->query("//img[@src]");
        for($i=0;$i<$image_list->length; $i++){
            if(strpos($image_list->item($i)->getAttribute("src"), 'ad') === false) {
                $images[] = $image_list->item($i)->getAttribute("src");
            }
        }

        if(count($images) > 0) {
            // if at least one, get it
            $data['src'] = $images[0];
        }
    }

    echo json_encode($data);
    exit;

}
?>

Note: This is not perfect, but you can use it as a reference, improve on it, and make it more dynamic.
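
One gap worth improving on: the src values collected by //img[@src] are copied as written in the page, so they may be relative (for example /img/logo.png) and will not load when dropped into your own page. A minimal client-side sketch, assuming a browser or runtime that provides the standard `URL` constructor. `resolveImageUrl` is a hypothetical helper, not part of the code above.

```javascript
// Hypothetical helper: resolves a scraped src against the page it came from.
function resolveImageUrl(src, pageUrl) {
  if (!src) { return ''; }
  try {
    // Absolute src values pass through; relative ones are resolved against pageUrl.
    return new URL(src, pageUrl).href;
  } catch (e) {
    // Malformed input: return an empty string rather than a broken link.
    return '';
  }
}
```

In the success callback this would be called as resolveImageUrl(response.src, response.url) before building the img tag.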

user1978142