Here is what I am trying to achieve: retrieve all products on a page and put them into an array. Here is the code I am using:

$page2 = curl_exec($ch);
$doc = new DOMDocument();
@$doc->loadHTML($page2);
$nodes = $doc->getElementsByTagName('title');
$noders = $doc->getElementsByClassName('productImage');
$title = $nodes->item(0)->nodeValue;
$product = $noders->item(0)->imageObject.src;

It works for $title but not for the product. For reference, the img tag in the HTML looks like this:

<img alt="" class="productImage" data-altimages="" src="xxxx">

I have been looking at this (PHP DOMDocument how to get element?) but I still don't understand how to make it work.

PS: I get this error:

Call to undefined method DOMDocument::getElementsByclassName()

justberare

3 Answers

I finally used the following solution:

    $classname="blockProduct";
    $finder = new DomXPath($doc);
    $spaner = $finder->query("//*[contains(@class, '$classname')]");
justberare
    A more correct XPath variant is presented as an answer to a duplicate question: [Call to undefined method DOMDocument::getElementsByClassName()](http://stackoverflow.com/a/33446305/367456) – hakre Oct 31 '15 at 16:13
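
To illustrate the comment above: a commonly used XPath 1.0 idiom pads the normalized class attribute with spaces, so that only whole class tokens match rather than substrings. The following is a minimal sketch of that idiom, not necessarily the exact query from the linked answer; the sample markup and the variable `$srcs` are assumptions made up for illustration. It also reads the `src` value with `getAttribute()`, which is what the question was ultimately after.

```php
<?php
// Hypothetical sample markup; the class name mirrors the one in the question.
$html = '<div>'
      . '<img class="productImage" src="a.png">'
      . '<img class="big productImage" src="b.png">'
      . '<img class="productImageLarge" src="c.png">'
      . '</div>';

$doc = new DOMDocument();
// Collect libxml warnings internally instead of suppressing them with @.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$classname = 'productImage';
$finder = new DOMXPath($doc);

// Pad the normalized class attribute with spaces so only whole tokens match.
$query = "//img[contains(concat(' ', normalize-space(@class), ' '), ' $classname ')]";

$srcs = [];
foreach ($finder->query($query) as $img) {
    // Attributes are read with getAttribute(); DOMElement has no ->src property.
    $srcs[] = $img->getAttribute('src');
}

print_r($srcs);
```

With this sample markup, `$srcs` ends up holding `a.png` and `b.png`; the `productImageLarge` image is skipped, whereas a plain `contains(@class, ...)` would have matched it too.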

https://stackoverflow.com/a/31616848/3068233

Linking this answer as it helped me the most with this problem.

function getElementsByClass(&$parentNode, $tagName, $className) {
    $nodes=array();

    $childNodeList = $parentNode->getElementsByTagName($tagName);
    for ($i = 0; $i < $childNodeList->length; $i++) {
        $temp = $childNodeList->item($i);
        if (stripos($temp->getAttribute('class'), $className) !== false) {
            $nodes[]=$temp;
        }
    }

    return $nodes;
}

That's the code, and here's the usage:

$dom = new DOMDocument('1.0', 'utf-8');
$dom->loadHTML($html);
$content_node=$dom->getElementById("content_node");

$div_a_class_nodes=getElementsByClass($content_node, 'div', 'a');
Ulad Kasach
    Watch out: the stripos check in that function can produce false positives. If an element has a class like FormRowHeader, it would still match a search for FormRow. – Robert Sinclair May 10 '19 at 19:58
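
One way to avoid the false positive described in the comment above is to split the class attribute on whitespace and compare whole tokens. This is only a sketch under assumptions: `getElementsByClassToken` is a hypothetical name, the sample markup is invented, and unlike the `stripos` version the comparison here is case-sensitive.

```php
<?php
// Hypothetical helper: matches $className only as a whole class token.
// $parentNode is expected to be a DOMDocument or DOMElement.
function getElementsByClassToken($parentNode, $tagName, $className)
{
    $nodes = [];
    foreach ($parentNode->getElementsByTagName($tagName) as $node) {
        // Split the class attribute on runs of whitespace, dropping empty pieces.
        $classes = preg_split('/\s+/', $node->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);
        // Strict comparison against each token, so FormRowHeader != FormRow.
        if (in_array($className, $classes, true)) {
            $nodes[] = $node;
        }
    }
    return $nodes;
}

// Invented sample markup reusing the class names from the comment.
$dom = new DOMDocument();
$dom->loadHTML(
    '<div class="FormRow">a</div>'
    . '<div class="FormRowHeader">b</div>'
    . '<div class="x FormRow">c</div>'
);

$matches = getElementsByClassToken($dom, 'div', 'FormRow');
echo count($matches), "\n";
```

Here the first and third `div` match, while the `FormRowHeader` one does not. The same token comparison also handles elements carrying several classes, which a plain equality test on the whole attribute would miss.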
function getElementsByClassName($dom, $ClassName, $tagName=null) {
    if($tagName){
        $Elements = $dom->getElementsByTagName($tagName);
    }else {
        $Elements = $dom->getElementsByTagName("*");
    }
    $Matched = array();
    for($i=0;$i<$Elements->length;$i++) {
        if($Elements->item($i)->attributes->getNamedItem('class')){
            if($Elements->item($i)->attributes->getNamedItem('class')->nodeValue == $ClassName) {
                $Matched[]=$Elements->item($i);
            }
        }
    }
    return $Matched;
}

// usage

    $dom = new \DOMDocument('1.0'); 
    @$dom->loadHTML($html);
    $elementsByClass = getElementsByClassName($dom, $className, 'h1');
Mahbub Alam