
Grabbing specific elements inside DIV from external page

I need to scrape the following elements inside each div with class="product-grid-item" (the page contains several of them), but I have no clue how to do it, so I need help before I pull my hair out.

1 – The link and image inside the div with class="product-element-top":

<a href="https://...this_link" class="product-image-link"> (just need the link)

<img width="300" height="300" src="https://...this_image_url... (just need this image URL)

2 – The title inside the h3 tag, as follows:

<h3 class="wd-entities-title"><a href="https://...linkhere">The title goes here (just the title)

3 – Last but not least, I need to grab the price inside this:

<span class="price"><span class="woocommerce-Price-amount amount"><bdi><span class="woocommerce-Price-currencySymbol">€</span>20,00</bdi></span></span> (just the “€20,00”)

Here’s the full HTML:


One of my clumsy attempts:



Answer

Assuming the HTML is fetched correctly before any DOM processing is attempted, it is fairly straightforward to construct some basic XPath expressions that find the indicated content.

As per the comment “page contains several of them”, there are 2 product-grid-item divs, as you’ll note in the output.
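
The idea described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code from this answer: the `SAMPLE` markup, the `example.com` URLs, the product titles and the helper names `has_class`, `first` and `scrape` are all hypothetical. The standard library's `xml.etree.ElementTree` only parses well-formed XML and supports a limited XPath subset (no `contains()`), so the sketch matches class tokens in Python rather than in the path expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the page markup (not the asker's HTML).
SAMPLE = """
<html><body>
  <div class="product-grid-item product">
    <div class="product-element-top">
      <a href="https://example.com/p/1" class="product-image-link">
        <img width="300" height="300" src="https://example.com/img/1.jpg" />
      </a>
    </div>
    <h3 class="wd-entities-title"><a href="https://example.com/p/1">First product</a></h3>
    <span class="price"><span class="woocommerce-Price-amount amount"><bdi><span class="woocommerce-Price-currencySymbol">€</span>20,00</bdi></span></span>
  </div>
  <div class="product-grid-item product">
    <div class="product-element-top">
      <a href="https://example.com/p/2" class="product-image-link">
        <img width="300" height="300" src="https://example.com/img/2.jpg" />
      </a>
    </div>
    <h3 class="wd-entities-title"><a href="https://example.com/p/2">Second product</a></h3>
    <span class="price"><span class="woocommerce-Price-amount amount"><bdi><span class="woocommerce-Price-currencySymbol">€</span>35,50</bdi></span></span>
  </div>
</body></html>
"""


def has_class(el, name):
    """True if `name` is one of the space-separated tokens in el's class attribute."""
    return name in el.get("class", "").split()


def first(parent, tag, cls):
    """Return the first descendant <tag> carrying class `cls`, or None."""
    for el in parent.iter(tag):
        if has_class(el, cls):
            return el
    return None


def text_of(el):
    """Concatenate all nested text of an element, or None if it is missing."""
    return "".join(el.itertext()).strip() if el is not None else None


def scrape(markup):
    """Collect link, image URL, title and price for every product-grid-item div."""
    root = ET.fromstring(markup.strip())
    products = []
    for item in root.iter("div"):
        if not has_class(item, "product-grid-item"):
            continue
        link = first(item, "a", "product-image-link")
        img = link.find(".//img") if link is not None else None
        products.append({
            "link": link.get("href") if link is not None else None,
            "image": img.get("src") if img is not None else None,
            "title": text_of(first(item, "h3", "wd-entities-title")),
            "price": text_of(first(item, "span", "price")),
        })
    return products


if __name__ == "__main__":
    for product in scrape(SAMPLE):
        print(product)
```

Real-world HTML is rarely well-formed XML (unclosed `img` tags, for example), so on a live page the same queries would need a lenient HTML parser with fuller XPath support, such as `lxml.html`.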


To process the downloaded HTML:


Which yields:

User contributions licensed under: CC BY-SA