How to parse the DOM of an HTML page from a URL with PHP?

Question

This is the URL that I want to parse: http://www.tsetmc.com/Loader.aspx?ParTree=151313&Flow=0

I use simple_html_dom.php, but it can't read the HTML because the response is encoded. So I think I need to fetch the page source online and parse that. Is there any way that I can parse this website?

Accepted Answer

The issue, as you pointed out, is the encoding: the response is gzip-encoded. You can set the CURLOPT_ENCODING option in cURL to work around that. Here is what it does, according to the php-curl documentation:

    The contents of the "Accept-Encoding: " header. This enables decoding of the response. Supported encodings are "identity", "deflate", and "gzip". If an empty string, "", is set, a header containing all supported encoding types is sent.

Use the following php-curl code to get the response HTML:

```php
<?php
$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => "http://www.tsetmc.com/Loader.aspx?ParTree=151313&Flow=0",
  CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
  CURLOPT_ENCODING => "gzip",       // send Accept-Encoding: gzip and decode the response
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,             // 0 means no timeout
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => "GET",
));

// curl_exec() returns the decoded HTML as a string, or false on failure.
$response = curl_exec($curl);
curl_close($curl);

echo $response;
?>
```

Then you can pass the response HTML in $response directly to simple_html_dom.php (for example, with str_get_html($response)) to parse the DOM tree.

Here's a working version of the code: http://phpfiddle.org/main/code/gb66-3kzq
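To see why the parser chokes on the raw body, here is a minimal sketch that stays off the network. It assumes the zlib and dom extensions are enabled, and the sample markup and the "price" class are made up for illustration. It compresses a small HTML string the way a gzip-encoding server would, checks for the gzip magic bytes, then decodes and parses it. It uses PHP's built-in DOMDocument rather than simple_html_dom.php so that it runs without extra files; with simple_html_dom, str_get_html($decoded) would take the place of the DOMDocument lines.

```php
<?php
// Hypothetical sample page; it stands in for the real response body.
$html = '<html><body><table><tr><td class="price">1200</td></tr></table></body></html>';

// Roughly what a server sends when it gzip-encodes the response.
$compressed = gzencode($html);

// gzip data starts with the magic bytes 0x1f 0x8b, so it is not readable HTML.
$isGzip = substr($compressed, 0, 2) === "\x1f\x8b";

// Decoding by hand; CURLOPT_ENCODING makes cURL do this step for you.
$decoded = gzdecode($compressed);

// Parse the decoded markup with the built-in DOM extension.
libxml_use_internal_errors(true); // keep warnings about imperfect HTML quiet
$dom = new DOMDocument();
$dom->loadHTML($decoded);
libxml_clear_errors();

// Pull out the cell with the (made-up) class name "price".
$xpath = new DOMXPath($dom);
$cells = $xpath->query('//td[@class="price"]');
$price = $cells->item(0)->textContent;

echo $isGzip ? "gzip detected\n" : "plain text\n";
echo $price, "\n";
```

If you fetch a body without CURLOPT_ENCODING and it begins with those two magic bytes, passing it through gzdecode() before parsing should give the same result as letting cURL decode it.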
