
How to disguise your PHP script as a browser?

We’ve been using information from a site for a while now (the site allows this as long as you credit the source, which we do), and so far we’ve been copying the information by hand. As you can imagine, this gets tedious pretty fast, so I’ve been trying to automate the process by fetching the information with a PHP script.

The URL I’m trying to fetch is:

http://mediaforest.ro/weeklycharts/viewchart.aspx?r=WeeklyChartRadioLocal&y=2010&w=46 08-11-10 14-11-10

If I enter it in a browser, it works. If I try file_get_contents(), I get Bad Request.

I figured that they check whether the client is a browser, so I rolled a cURL-based solution:

$ch = curl_init();

$header=array(
  'User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12',
  'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
  'Accept-Language: en-us,en;q=0.5',
  'Accept-Encoding: gzip,deflate',
  'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7',
  'Keep-Alive: 115',
  'Connection: keep-alive',
);

curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch,CURLOPT_COOKIEFILE,'cookies.txt');
curl_setopt($ch,CURLOPT_COOKIEJAR,'cookies.txt');
curl_setopt($ch,CURLOPT_HTTPHEADER,$header);
$result=curl_exec($ch);

curl_close($ch);

I’ve checked, and the headers are identical to my browser’s headers, yet I still get Bad Request.

So I tried another solution:

http://www.php.net/manual/en/function.curl-setopt.php#78046

Unfortunately this doesn’t work either and I’m out of ideas. What am I missing?


Answer

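Try escaping your URL; it works for me that way. The w parameter contains raw spaces, which are not valid in an HTTP request line. A browser encodes them as %20 before sending, but file_get_contents() and cURL do not do that for you, so the server rejects the request. With the spaces encoded, the URL looks like this: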

http://mediaforest.ro/weeklycharts/viewchart.aspx?r=WeeklyChartRadioLocal&y=2010&w=46%2008-11-10%2014-11-10
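
Rather than escaping the URL by hand, the query string can be built from its parts so that the encoding happens automatically. The following is a minimal sketch, not a definitive implementation: the helper name buildChartUrl is hypothetical, and the parameter names r, y and w are simply taken from the URL in the question.

```php
<?php
// Hypothetical helper (not from the original post): builds a URL whose
// query string is percent-encoded according to RFC 3986.
function buildChartUrl(string $base, array $params): string
{
    // PHP_QUERY_RFC3986 encodes spaces as %20; the default mode would use "+".
    return $base . '?' . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
}

// Parameter names and values are copied from the URL in the question.
$url = buildChartUrl('http://mediaforest.ro/weeklycharts/viewchart.aspx', [
    'r' => 'WeeklyChartRadioLocal',
    'y' => 2010,
    'w' => '46 08-11-10 14-11-10',
]);

echo $url, PHP_EOL;
```

The resulting $url can then be passed to curl_setopt($ch, CURLOPT_URL, $url) or to file_get_contents($url) in place of the unescaped string.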
User contributions licensed under: CC BY-SA