
Goutte extract data from every node

Hi, I want to extract data from every node, but I don’t know how to do it. I’d really appreciate it if someone could give me some guidance.

<table>
    <tr>
        <td>item1</td>
        <td>item2</td>
    </tr>
    <tr>
        <td>item3</td>
        <td>item4</td>
    </tr>
</table>

And here is my PHP code:

$client = new Client();
$crawler = $client->request('GET', 'https://www.socom');

$crawler->filter('.tr')->each(function ($node) {
    print $node->filter('.td')->text()."\n";
});


Answer

You’re on the right track. The problem is that the selector .tr matches elements that have the class tr, and your HTML has none, so the filter matches nothing and you get no output.

You can access each of your tr elements and get the text inside it this way:

$crawler->filter('tr')->each(function($node) {
  print_r($node->text());
});

Note that $node is a Crawler object rather than a string, so you can’t echo it directly; call text() on it to get a string you can print. The selector here is just tr, without the dot, so it refers to the element itself.

You can also do this, which is probably closer to what you wanted:

$crawler->filter('tr')->each(function($node) {
  $node->filter('td')->each(function($nested_node) {
    echo $nested_node->text() . "\n";
  });
});

This gets every tr, then for each tr gets its td elements, and then prints the text inside each td.

And that’s it. Here is the complete code:

<?php

require __DIR__ . '/vendor/autoload.php';

use Goutte\Client;

$client = new Client();

$crawler = $client->request('GET', 'your_url');

$crawler->filter('tr')->each(function($node) {
  print_r($node->text());
});

$crawler->filter('tr')->each(function($node) {
  $node->filter('td')->each(function($nested_node) {
    echo $nested_node->text() . "\n";
  });
});
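
If you would rather keep the values than print them, one option is to use the fact that each() returns an array of whatever the callback returns. The following is only a sketch: it parses the table from the question as an inline string with Symfony's DomCrawler (the Crawler class that Goutte's request() returns) instead of fetching a URL, and it assumes symfony/dom-crawler and symfony/css-selector are installed through Composer, which they are when Goutte is. The variable names $html, $rows, $row and $cell are just illustrative.

```php
<?php

require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Assumption: the table from the question, inlined so no HTTP request is needed.
$html = <<<HTML
<table>
    <tr>
        <td>item1</td>
        <td>item2</td>
    </tr>
    <tr>
        <td>item3</td>
        <td>item4</td>
    </tr>
</table>
HTML;

$crawler = new Crawler($html);

// each() collects the callback's return values into an array,
// so nesting two calls gives one array of cell texts per row.
$rows = $crawler->filter('tr')->each(function (Crawler $row) {
    return $row->filter('td')->each(function (Crawler $cell) {
        return trim($cell->text());
    });
});

print_r($rows);
// $rows is [['item1', 'item2'], ['item3', 'item4']]
```

With Goutte you would replace new Crawler($html) with the $crawler returned by $client->request(), and the rest should work the same way.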

Hope it helps.

User contributions licensed under: CC BY-SA