I want to extract the items marked with a comment from an external website. I cannot edit the website.
The website looks like this. I edited a lot of things out, but this is the path from the body:
<body>
  <div class="js-standby-status tn-relative">
    <div class="tn-container serp-mobile-container">
      <div class="tn-row tn-row-sm-spacing search-page">
        <section class="span-d-9 section-main-content-SERP">
          <div id="js-products">
            <section class="js-results-wrapper">
              <section>
                <ul class="product-list main-product-list-wrapper">
                  <li class="product-list-item product-list-item-first standby-status">
                    <article>
                      <div class="sl-search-result mobile-search-result">
                        <a class="sl-search-result-link" href="$url"></a> <!-- Link is needed -->
                        <div class="search-result-body">
                          <a class="top-item-title_wrapper">
                            <h2 class="search-result-name"> Dunlop Winter Sport 5 </h2>
                            <span itemprop="mpn"> 5452000470454 </span>
                          </a>
                          <div class="tn-row">
                            <div class="span-d-6">
                              <div class="product-description">
                                <ul class="search-result-desc-list">
                                  <li class="search-result-desc-list-item" title="205/55 R16 91H"> 205/55 R16 91H </li>
                                </ul>
                              </div>
                            </div>
                          </div>
                        </div>
                      </div>
                    </article>
                  </li>
                </ul>
              </section>
            </section>
          </div>
        </section>
      </div>
    </div>
  </div>
</body>
</html>
I am using PHP Simple HTML DOM Parser and PHP 7.3.
I am currently using this code to get the information from the website:
$html = file_get_html($url);
if(!empty($html)){
    $content_url = $html->find(".product-list-item", 0)->find('.sl-search-result', 0)
        ->find('.sl-search-result-link', 0)->getAttribute('href', 0);
    $content_naam = $html->find(".product-list-item", 0)->find('.sl-search-result', 0)
        ->find('.sl-search-result-link')->find('.search-result-body', 0)
        ->find('.top-item-title_wrapper', 0)->find('.search-result-name', 0)->plaintext;
    $content_ean = $html->find(".product-list-item", 0)->find('.sl-search-result', 0)
        ->find('.sl-search-result-link')->find('.search-result-body', 0)
        ->find('.top-item-title_wrapper', 0)->find("span[itemprop='mpn")->plaintext;
    $content_maat = $html->find(".product-list-item", 0)->find('.sl-search-result', 0)
        ->find('.sl-search-result-link')->find('.search-result-body', 0)
        ->find('.tn-row', 0)->find('span-d-6', 0)->find('.product-description')
        ->find('.search-result-desc-list')->find('.search-result-desc-list-item')->plaintext;
    if(!empty($content_url)){
        if(!empty($content_naam)){
            if(!empty($content_ean)){
                if(!empty($content_maat)){
                    echo $item . ". <a href='" . $content_url . "'>EAN: " . $content_ean . " Product naam: " . $content_naam ."</a><br/>";
                }else{
                    echo "Content maat is empty.";
                }
            }else{
                echo "Content ean is empty";
            }
        }else{
            echo "Content naam is empty";
        }
    }else{
        echo "Content URL is empty";
    }
}else{
    echo "No HTML found!";
}
}
The script fails with an error. It is not shown on the page itself, but it appears in the logs of my Apache2 server:
Uncaught Error: Call to a member function find() on null in /var/www/html/scraper/bandenNL.php:30
Stack trace:
#0 {main}
  thrown in /var/www/html/scraper/bandenNL.php on line 30
If you need more information, just leave a comment.
Answer
If I understand you correctly, something along these lines should get you close enough to what you are looking for:
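// This uses PHP's built-in DOM extension, so simple_html_dom.php is not required.
// $url is the page address from the question.
libxml_use_internal_errors(true); // real-world HTML is rarely valid; keep parser warnings quiet
$htmlDoc = new DOMDocument();
$htmlDoc->loadHTML(file_get_contents($url)); // loadHTML tolerates markup that loadXML would reject
$xpath = new DOMXPath($htmlDoc);

// Each query returns a DOMNodeList; item(0) is the first match.
$link = $xpath->query('//a[@class="sl-search-result-link"]/@href');
$name = $xpath->query('//h2[@class="search-result-name"]/text()');
$mpn = $xpath->query('//span[@itemprop="mpn"]/text()');
$title_attr = $xpath->query('//li[@class="search-result-desc-list-item"]/@title');
$title = $xpath->query('//li[@class="search-result-desc-list-item"]/text()');

// trim() removes the whitespace that surrounds the text in the markup.
echo "Link: ". trim($link->item(0)->textContent) . "<br>";
echo "Name: ". trim($name->item(0)->textContent) . "<br>";
echo "MPN: ". trim($mpn->item(0)->textContent) . "<br>";
echo "Title attribute: ". trim($title_attr->item(0)->textContent) . "<br>";
echo "Title: ". trim($title->item(0)->textContent) . "<br>";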
Output:
Link: my.url
Name: Dunlop Winter Sport 5
MPN: 5452000470454
Title attribute: 205/55 R16 91H
Title: 205/55 R16 91H
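The error in the question ("Call to a member function find() on null") means that one lookup in a chain matched nothing and the next call was made on the empty result. The `$item` counter in the question also suggests that more than one product per page is expected. With that in mind, a more defensive variant of the XPath approach might look like the sketch below. The helper names `hasClass()` and `firstValue()`, the `$products` array, and the inline sample markup (trimmed from the question, with `my.url` standing in for the real link) are illustrative assumptions, not part of the original code.

```php
<?php
// Sample markup trimmed from the question. In practice this string would be
// the downloaded page; "my.url" is a placeholder link.
$html = <<<'HTML'
<html><body>
<ul class="product-list main-product-list-wrapper">
  <li class="product-list-item product-list-item-first standby-status">
    <article>
      <div class="sl-search-result mobile-search-result">
        <a class="sl-search-result-link" href="my.url"></a>
        <div class="search-result-body">
          <a class="top-item-title_wrapper">
            <h2 class="search-result-name"> Dunlop Winter Sport 5 </h2>
            <span itemprop="mpn"> 5452000470454 </span>
          </a>
          <ul class="search-result-desc-list">
            <li class="search-result-desc-list-item" title="205/55 R16 91H"> 205/55 R16 91H </li>
          </ul>
        </div>
      </div>
    </article>
  </li>
</ul>
</body></html>
HTML;

// Hypothetical helper: XPath predicate that matches one class name even when
// the class attribute lists several classes.
function hasClass(string $class): string
{
    return 'contains(concat(" ", normalize-space(@class), " "), " ' . $class . ' ")';
}

// Hypothetical helper: trimmed text of the first match, or null when nothing
// matches, so a missing element never triggers a call on null.
function firstValue(DOMXPath $xpath, string $query, DOMNode $context): ?string
{
    $nodes = $xpath->query($query, $context);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

libxml_use_internal_errors(true); // tolerate markup the parser does not know
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Collect one record per product <li>; all queries are relative to that <li>.
$products = [];
foreach ($xpath->query('//li[' . hasClass('product-list-item') . ']') as $li) {
    $products[] = [
        'url'  => firstValue($xpath, './/a[' . hasClass('sl-search-result-link') . ']/@href', $li),
        'naam' => firstValue($xpath, './/h2[' . hasClass('search-result-name') . ']', $li),
        'ean'  => firstValue($xpath, './/span[@itemprop="mpn"]', $li),
        'maat' => firstValue($xpath, './/li[' . hasClass('search-result-desc-list-item') . ']/@title', $li),
    ];
}

foreach ($products as $i => $p) {
    if (in_array(null, $p, true)) {
        echo ($i + 1) . ". At least one field is missing.<br/>\n";
        continue;
    }
    echo ($i + 1) . ". <a href='" . htmlspecialchars($p['url'], ENT_QUOTES) . "'>EAN: "
        . htmlspecialchars($p['ean']) . " Product naam: " . htmlspecialchars($p['naam'])
        . " Maat: " . htmlspecialchars($p['maat']) . "</a><br/>\n";
}
```

Because every field is either a trimmed string or `null`, a missing element produces a readable message rather than a fatal error. For a live page, `$html` would presumably come from something like `file_get_contents($url)` instead of the inline sample.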