XPath does not retrieve some content

I'm a newbie trying to write a crawler to gather some statistics from a forum.

Here is my code:

JavaScript

As you can see, the code does not retrieve some of the content. For example, I'm trying to retrieve the content of this thread: http://m.jeuxvideo.com/forums/42-51-61913988-1-0-1-0-je-code-un-bot-pour-le-forom-je-vous-le-montre-en-action.htm

The second post is a picture, and my code does not work on it.

Also, I suspect I made some mistakes; I find my code ugly.

Can you help me, please?

Answer

You could simply select the posts first, then grab each piece of data from every post separately, using:

Code:

JavaScript
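As a rough illustration of "select the posts first, then grab each field", here is a minimal sketch. It is not the code from this answer. The class names (`post`, `author`, `date`, `message`) are placeholders, not the forum's real markup, and `extractPosts`, `browserSelect` and the injected `select(expr, node)` evaluator are hypothetical names. The point is that every per-post XPath starts with `.` so it is evaluated relative to one post, and a missing field yields an empty value instead of shifting the results of other posts.

```javascript
// Placeholder XPath expressions: the real class names must be read from the page.
const POSTS = '//div[@class="post"]';
const FIELDS = {
  author: './/*[@class="author"]',
  date: './/*[@class="date"]',
  message: './/*[@class="message"]',
};
const IMAGES = './/*[@class="message"]//img/@src';

// Hypothetical evaluator for a browser DOM: returns every node matching
// `expr`, evaluated with `node` as the XPath context node.
function browserSelect(expr, node) {
  const doc = node.ownerDocument || node;
  const snapshot = doc.evaluate(
    expr, node, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null
  );
  const nodes = [];
  for (let i = 0; i < snapshot.snapshotLength; i++) {
    nodes.push(snapshot.snapshotItem(i));
  }
  return nodes;
}

// Select the post nodes once, then query each post with relative XPaths.
// `select(expr, node)` must return an array of matching nodes.
function extractPosts(root, select) {
  return select(POSTS, root).map((post) => {
    // Trimmed text of the first match inside this post, or "" if none.
    const text = (expr) => {
      const hits = select(expr, post);
      return hits.length > 0 ? hits[0].textContent.trim() : "";
    };
    return {
      author: text(FIELDS.author),
      date: text(FIELDS.date),
      message: text(FIELDS.message),
      // A picture-only post has an empty message but a non-empty image list.
      images: select(IMAGES, post).map((attr) => attr.value),
    };
  });
}
```

In a browser this would be called as `extractPosts(document, browserSelect)`. Outside a browser, any XPath library that can evaluate an expression against a context node could be wrapped to provide the same `select(expr, node)` shape.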

Unrelated note: scraping a website’s HTML is not illegal in itself, but you should refrain from displaying their data on your own app or website without their consent. Also, this might break at any time if they decide to alter their HTML structure or CSS class names.

User contributions licensed under: CC BY-SA