0

I'm trying to get a string between two strings with preg_match.

The string looks something like this (this is just an example):

<source src='http://website.com/384238/dsjfjsd.jpg' type='image/jpg' data-res='43543' lang='English'/>

I want the link. The "data-res=" value is the part that varies between tags, so I'm doing something like this:

preg_match("<source src='(.*)' type='image/jpg' data-res='43543",$input,$output);

I also tried it this way:

$output = trim(cut_str($input, '<source src='', ' type='image/jpg' data-res='43543'));

I think the problem is that I don't know how to represent the spaces or special characters. I'd also like advice on which function is best suited to this.

Ricardo

3 Answers

1

While you can do this with a regular expression, I would encourage you to use DOMDocument.

From there it would be simple to grab all source tags using getElementsByTagName():

$dom = new DOMDocument;
$dom->loadHTML($html);
$source_tags = $dom->getElementsByTagName('source');
foreach ($source_tags as $source_tag) {
    echo 'Link: ' . $source_tag->attributes->getNamedItem('src')->nodeValue;
}

This question might also help if you are interested in source tags with the data-res attribute.
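
As a rough sketch of that idea, `DOMXPath` can restrict the match to `source` tags that carry a `data-res` attribute. The second sample tag and the `$srcByRes` variable name are assumptions made for illustration only.

```php
<?php
// Sample markup; the second tag (without data-res) is made up for illustration.
$html = "<source src='http://website.com/384238/dsjfjsd.jpg' type='image/jpg' data-res='43543' lang='English'/>"
      . "<source src='http://website.com/384238/other.jpg' type='image/jpg' lang='English'/>";

$dom = new DOMDocument;
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Map each data-res value to the src of the tag that carries it.
$srcByRes = [];
foreach ($xpath->query('//source[@data-res]') as $source_tag) {
    $srcByRes[$source_tag->getAttribute('data-res')] = $source_tag->getAttribute('src');
}

print_r($srcByRes);
```

Tags without a `data-res` attribute are skipped by the XPath predicate, so only the first sample tag ends up in the array.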

Jason McCreary
  • I really wanted a simpler, universal solution to grab anything between 2 arbitrary strings. I think "trim(cut_str)" was the simplest function I've found. I think some of the apostrophes or other characters in the string are breaking the code, but I'm not sure – Ricardo Feb 09 '15 at 20:16
  • You can write a custom function to wrap this if you want to make it *universal*, but parsing HTML with `DOMDocument` is a *better* way to go. – Jason McCreary Feb 09 '15 at 20:27
0

Why not parse it like this? It's faster than a regex and easier to use.

$dom = new DOMDocument;
$dom->loadHTML('<source src="http://website.com/384238/dsjfjsd.jpg" type="image/jpg" data-res="43543" lang="English" />');

//  We read it
$dataSource = $dom->getElementsByTagName('source');

//  We loop on it
$dataRes = FALSE;
foreach($dataSource as $data){
    #   We read the wanted field
    if(($dataAttr = $data->getAttribute('data-res')) === "43543"){
        #   We assign it (plain assignment; "&=" is a bitwise AND and would yield 0)
        $dataRes = $dataAttr;

        #   Done - We end the loop here
        break;
    }
}

#   We found it ?
if($dataRes !== FALSE){
    #   Yes
    var_dump($dataRes);
} else {
    #   No
    exit('Failed');
}

Warning: I didn't test this code, but it should work.
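
Since the goal raised in the comments is the `src` of the tag whose `data-res` matches, a minimal sketch along these lines may help. The function name `findSrcByDataRes` and the first sample tag are hypothetical, introduced only for illustration.

```php
<?php
/**
 * Return the src of the first <source> tag whose data-res equals $res,
 * or null when no tag matches. (Hypothetical helper name.)
 */
function findSrcByDataRes(string $html, string $res): ?string
{
    $dom = new DOMDocument;
    libxml_use_internal_errors(true);   // keep parser warnings out of the output
    $dom->loadHTML($html);
    libxml_clear_errors();

    foreach ($dom->getElementsByTagName('source') as $tag) {
        // getAttribute() returns '' when the attribute is missing
        if ($tag->getAttribute('data-res') === $res) {
            return $tag->getAttribute('src');
        }
    }
    return null;
}

// Two tags; the first one is made up so that there is something to skip.
$html = "<source src='http://website.com/384238/first.jpg' type='image/jpg' data-res='11111' lang='English'/>"
      . "<source src='http://website.com/384238/dsjfjsd.jpg' type='image/jpg' data-res='43543' lang='English'/>";

var_dump(findSrcByDataRes($html, '43543'));
```

Returning `null` for "not found" keeps the result distinguishable from an empty `src` attribute.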

David Bélanger
  • Made it work, thanks, but the thing is I want the "src=" of the tag that specifically has data-res='43543'. I replaced your "data-res" with "src", but of course I just get the first one because there are many in the string – Ricardo Feb 09 '15 at 20:49
  • OK, thanks, but this still misses one point: I want the "src" link value that belongs to the same tag as data-res='43543' – Ricardo Feb 09 '15 at 21:04
  • @Ricardo In the IF at the bottom, read `$dataRes` and extract the SRC value: `$dataRes->attributes->getNamedItem('src')->nodeValue` – David Bélanger Feb 09 '15 at 21:16
  • Big thanks. I did it in a slightly different way, with multiple "ifs" inside the foreach, based on how it was before you edited – Ricardo Feb 10 '15 at 14:42
  • @Ricardo My pleasure! – David Bélanger Feb 10 '15 at 15:02
0

Here is some code you could try:

// The regular expression: capture the quoted URL that follows src=
$reg_exSRC = "~src='(https?://[^']+)'~";

// The text you want to search for URLs
$text = "<source src='http://website.com/384238/dsjfjsd.jpg' type='image/jpg' data-res='43543' lang='English'/>";

// Apply the expression to the text; $url[1] holds the captured URL
if (preg_match($reg_exSRC, $text, $url)) {
    echo $url[1];
}
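
On the original question of how to represent spaces and special characters: `preg_quote()` escapes the fixed parts of a pattern, and the pattern itself needs delimiters. The following is a rough sketch of a generic "between two strings" helper built on that; the name `string_between` is made up for illustration and is not a PHP built-in.

```php
<?php
/**
 * Return the text between $start and $end in $input, or null if not found.
 * (Hypothetical helper; not part of PHP.)
 */
function string_between(string $input, string $start, string $end): ?string
{
    // preg_quote() escapes regex metacharacters and, via its second
    // argument, the '~' delimiter used below.
    $pattern = '~' . preg_quote($start, '~') . '(.*?)' . preg_quote($end, '~') . '~s';

    // (.*?) is lazy, so the match stops at the first occurrence of $end.
    return preg_match($pattern, $input, $m) === 1 ? $m[1] : null;
}

$input = "<source src='http://website.com/384238/dsjfjsd.jpg' type='image/jpg' data-res='43543' lang='English'/>";

echo string_between($input, "<source src='", "' type='image/jpg' data-res='43543'");
```

One caveat with this approach: when the input holds several `source` tags, the match starts at the first `<source src='`, so it can span more than one tag before it reaches the `data-res` value. That is one reason a DOM-based lookup tends to be more reliable for HTML.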