encoding language fails

The PHP code below fails to return readable characters:

echo $html = file_get_contents("http://www.tsetmc.com/tsev2/data/instinfofast.aspx?i=65883838195688438&c=34+");

The result is:

����%PKJDA��ۈ�0�o’�z��W�”�7o�E��J:�%�+�=o�h@Ĥ�T�Jv�L�$��IT��1҈IY �B L�g�Mt����� �S]>>�����������j#�Tu97������@”jD��C�3×0�����I”(“D�W��Bd��9������J�^ȑ���T��[e��K����r�ZB����r�Z޼#�w��4G� � �C�b�%8��PR�/���ع���a=�o��s���H�G�


Answer

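This happens because the server sends the response gzip-compressed (see the Content-Encoding: gzip header in the output below), and file_get_contents does not decompress it for you. You need to unzip the data before using it: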

D:\Temp>curl -v "http://www.tsetmc.com/tsev2/data/instinfofast.aspx?i=65883838195688438&c=34+" -o output.data
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 79.175.151.173...
* TCP_NODELAY set
* Connected to www.tsetmc.com (79.175.151.173) port 80 (#0)
> GET /tsev2/data/instinfofast.aspx?i=65883838195688438&c=34+ HTTP/1.1
> Host: www.tsetmc.com
> User-Agent: curl/7.55.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Cache-Control: public, max-age=1
< Content-Type: text/html; charset=utf-8
< Content-Encoding: gzip
< Expires: Sat, 21 Dec 2019 09:43:48 GMT
< Last-Modified: Sat, 21 Dec 2019 09:43:47 GMT
< Vary: *
< Server: Microsoft-IIS/10.0
< X-Powered-By: ASP.NET
< X-Powered-By: ARR/3.0
< X-Powered-By: ASP.NET
< Date: Sat, 21 Dec 2019 09:42:59 GMT
< Content-Length: 155
<
{ [155 bytes data]
100   155  100   155    0     0    155      0  0:00:01 --:--:--  0:00:01   662
* Connection #0 to host www.tsetmc.com left intact

D:\Temp>
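
The same check can be made from PHP instead of curl. After a file_get_contents call on an http:// URL, PHP's HTTP wrapper exposes the response header lines in $http_response_header. The sketch below is only an illustration: hasGzipEncoding is a hypothetical helper name, not part of the original answer, and it only looks for a plain "gzip" value.

```php
<?php
// Hypothetical helper (not from the original answer): returns true when any
// response header line declares "Content-Encoding: gzip".
function hasGzipEncoding(array $headers): bool
{
    foreach ($headers as $line) {
        // Header names are case-insensitive, so match with the /i flag.
        if (preg_match('/^Content-Encoding:\s*gzip\b/i', $line) === 1) {
            return true;
        }
    }
    return false;
}

// Possible usage, right after file_get_contents($url) on an http:// URL:
// var_dump(hasGzipEncoding($http_response_header));
```

Passing the header lines in as an array keeps the helper independent of where the headers come from.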

Unzipping (on Windows):

D:\Temp>"c:\Program Files\7-Zip\7z.exe" x output.data output

7-Zip 18.05 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2018-04-30

Scanning the drive for archives:
1 file, 155 bytes (1 KiB)

Extracting archive: output.data
--
Path = output.data
Type = gzip
Headers Size = 10

Everything is Ok

Size:       239
Compressed: 155

D:\Temp>type output
12:29:59,A ,9055,9098,9131,9072,9217,9000,3582,17432646,158598409673,0,20191221,122959;;2@100400@9055@9055@20091@1,2@60000@9050@9058@554@1,1@1000@9040@9059@993@2,;66660,417193,674167;13450748,3981898,0,13913408,3519238,1255,9,0,899,11;;;1; 
D:\Temp>
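
In PHP itself, the unzip step can be done with gzdecode (provided by the zlib extension) instead of 7-Zip. The following is a minimal sketch, assuming the zlib extension is enabled. decodeIfGzipped and fetchDecoded are hypothetical helper names introduced here for illustration. The first one checks for the gzip magic bytes so that an uncompressed response passes through unchanged.

```php
<?php
// Hypothetical helper: decompress $body only if it starts with the gzip
// magic bytes (0x1f 0x8b); otherwise return it untouched.
function decodeIfGzipped(string $body): string
{
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = gzdecode($body); // returns false on corrupt data
        if ($decoded !== false) {
            return $decoded;
        }
    }
    return $body;
}

// Hypothetical wrapper: fetch a URL and return the decoded body.
function fetchDecoded(string $url): string
{
    $raw = file_get_contents($url);
    if ($raw === false) {
        throw new RuntimeException("Request failed: $url");
    }
    return decodeIfGzipped($raw);
}

// Possible usage with the URL from the question:
// echo fetchDecoded("http://www.tsetmc.com/tsev2/data/instinfofast.aspx?i=65883838195688438&c=34+");
```

An alternative, if the cURL extension is available, is to let cURL handle the decoding by setting CURLOPT_ENCODING to an empty string, which asks for any supported encoding and decompresses the response automatically.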
User contributions licensed under: CC BY-SA