Simple HTML DOM returning underscores

I am using Simple HTML DOM (https://simplehtmldom.sourceforge.io/) to scrape some data from a page.

The data I want are the options of this select element:

<select name="element" class="selectbox">
    <option value="1">O18FB-B1</option>
    <option value="2">O18FB-B2</option>
    <option value="3">O18FB-D1</option>
    <option value="4">O18FB-D2</option>
    <option value="5">O19BA-B</option>
    <option value="6">O19BA-C1</option>
    <option value="7">O19BA-C2</option>
    <option value="8">O19FAMA</option>
    <option value="9">O20BA-D1</option>
    <option value="10">O20BA-D2</option>
    <option value="11">O20BA-E1</option>
    <option value="12">O20BA-E2</option>
    <option value="13">O20FAMA1</option>
    <option value="14">O20FAMA2</option>
    <option value="15">O20FAMB1</option>
    <option value="16">O20FAMB2</option>
    <option value="17">O18AO-A</option>
    <option value="18">O18IB-A</option>
    <option value="19">O19AO-A</option>
    <option value="20">O19IB-A</option>
    <option value="21">O19MBIA</option>
    <option value="22">O20AMTA</option>
    <option value="23">O20EITA</option>
    <option value="24">O20EITB</option>
    <option value="25">O20SD-A</option>
    <option value="26">O20SD-B</option>
    <option value="27">O20SD-C</option>
    <option value="28">O18JD-A</option>
    <option value="29">O18JD-B</option>
    <option value="30">O18JD-C</option>
    <option value="31">O19JD-A</option>
    <option value="32">O19JD-B</option>
    <option value="33">O19JD-C</option>
    <option value="34">O20JD-A</option>
    <option value="35">O20JD-B</option>
    <option value="36">O20JD-C</option>
    <option value="37">O18MMCA</option>
    <option value="38">O18MMCB</option>
    <option value="39">O19MFAA</option>
    <option value="40">O19MMCA</option>
    <option value="41">O19MMCB</option>
    <option value="42">O19MSRA</option>
    <option value="43">O20MMCA</option>
    <option value="44">O20MMCB</option>
    <option value="45">O20MMCC</option>
    <option value="46">O20OABA</option>
    <option value="47">O20OABB</option>
    <option value="48">O18OMSA</option>
    <option value="49">O19OMSA</option>
    <option value="50">O20OMSA</option>
    <option value="51">O20OMSB</option>
    <option value="52">O18CT</option>
    <option value="53">O19CT</option>
    <option value="54">O20CT</option>
</select>

I am using this code to get it:

protected $html = 'https://rooster.horizoncollege.nl/rstr/ECO/AMR/400-ECO/Roosters/frames/navbar.htm';

public function getClasses()
{
    $data = file_get_html($this->html);
    // The index is zero-based, so this takes the third <select> on the page.
    $classes = $data->find('select', 2)->children();
    return $classes;
}

And then this code to display it on my page:

$options = new GetOptions();
$classes = $options->getClasses();
foreach ($classes as $class) {
    $optionFormat = '<option value="%s">%s</option>';
    $optionValue = $class->value;
    $optionText = $class->plaintext;
    echo sprintf($optionFormat, $optionValue, $optionText);
}

But when I look at my page, this is what I get (copied verbatim from Chrome's Inspect Element):

<select name="class" class="selector w-full block border border-gray-400 rounded-lg p-3 outline-none">
                        ________________________                               </select>

Does anyone know what I am doing wrong? I have looked at the page, and there is no select element with this content in it… Thanks in advance.

Answer

Your code is correct, but the data is not there.

Look at the raw source of your page, not the inspector view: the HTML exactly as it first arrives in your browser. In Chrome you can open it with Ctrl+U on Windows (View Source). You will see that the page you are requesting does not contain any values in the select element when it reaches the browser. The values are populated later by JavaScript functions. Simple HTML DOM does not run JavaScript, so scraping them with this library is not possible.
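To confirm this from PHP rather than from the browser, you can count the option elements present in the static markup. The following is a minimal sketch that uses PHP's built-in DOM extension. The function name `countStaticOptions` and both markup strings are made-up stand-ins for illustration, not the real page source.

```php
<?php
// Count the <option> elements inside a named <select> in static HTML,
// i.e. what a parser that does not run JavaScript is able to see.
// Note: $selectName is inserted into the XPath as-is, so keep it a plain name.
function countStaticOptions(string $html, string $selectName): int
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of printing them,
    // since real-world markup is rarely strictly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $query = sprintf('//select[@name="%s"]/option', $selectName);

    return $xpath->query($query)->length;
}

// Hypothetical markup as the inspector shows it, after scripts have run.
$rendered = '<select name="element">'
    . '<option value="1">O18FB-B1</option>'
    . '<option value="2">O18FB-B2</option>'
    . '</select>';

// Hypothetical markup as the server sends it, before any script runs.
$raw = '<select name="element"></select>'
    . '<script>/* options are added here at runtime */</script>';

$renderedCount = countStaticOptions($rendered, 'element');
$rawCount = countStaticOptions($raw, 'element');
```

With the real page you would pass in the string returned by `file_get_contents()` for the URL. If the count is zero, or far smaller than what the inspector shows, the options are being added by a script.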

You need a tool that can run JavaScript; a headless browser is probably the best option. If you need to stick with PHP, you can start with https://github.com/symfony/panther or https://github.com/php-webdriver/php-webdriver
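As a rough sketch of the Panther route, the code below loads the page in headless Chrome, waits for the options to appear, and reads them. It assumes `symfony/panther` is installed through Composer, that the autoloader is already loaded, and that Chrome with a matching ChromeDriver is available. The helper names `fetchRenderedOptions` and `renderOptions` and the default selector are illustrative assumptions based on the markup in the question.

```php
<?php
// Assumes Composer's autoloader is already loaded (e.g. by your framework)
// and that symfony/panther plus a local Chrome/ChromeDriver are installed.
use Symfony\Component\Panther\Client;

/**
 * Load the page in headless Chrome, wait until the selector matches,
 * and return the options as [['value' => ..., 'text' => ...], ...].
 * The default selector is an assumption taken from the question's markup.
 */
function fetchRenderedOptions(
    string $url,
    string $selector = 'select[name="element"] option'
): array {
    $client = Client::createChromeClient();

    try {
        $client->request('GET', $url);

        // Blocks until at least one matching element exists in the DOM.
        $crawler = $client->waitFor($selector);

        return $crawler->filter($selector)->each(
            fn ($node) => [
                'value' => $node->attr('value'),
                'text'  => $node->text(),
            ]
        );
    } finally {
        // Always shut the browser down, even if the wait times out.
        $client->quit();
    }
}

/** Build the <option> tags, escaping values so they cannot break the markup. */
function renderOptions(array $options): string
{
    $html = '';
    foreach ($options as $option) {
        $html .= sprintf(
            '<option value="%s">%s</option>',
            htmlspecialchars((string) $option['value'], ENT_QUOTES),
            htmlspecialchars((string) $option['text'], ENT_QUOTES)
        );
    }

    return $html;
}
```

In the question's class, `getClasses()` could return `fetchRenderedOptions($this->html)` and the template could echo `renderOptions($classes)`. If the static markup already contains a placeholder option, the wait may return before the real options are loaded, so a more specific selector may be needed.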

User contributions licensed under: CC BY-SA