I am writing a scraper for an office task. My Goutte Client code looks like this:
$cokie = "JSESSIONID=0000H_WHw_eFPKVUDGxUei7v3PH:1db7cfi4s";

$client = new Client(HttpClient::create(array(
    'headers' => array(
        'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language' => 'en-US,en;q=0.5',
        'Connection' => 'keep-alive',
        'Host' => 'verification.nadra.gov.pk',
        "Cookie" => $cokie,
        'User-Agent' => 'Mozilla/5.0 (Windows NT x.y; Win64; x64; rv:10.0) Gecko/20100101 Firefox/10.0'
    ),
)));

$cookie = new Cookie("JSESSIONID", $cokie, null, "/service", "https://example.com/", true, true);
$client->getCookieJar()->set($cookie);
$client->setServerParameter('HTTP_USER_AGENT', 'Mozilla/5.0 (Windows NT x.y; Win64; x64; rv:10.0) Gecko/20100101 Firefox/10.0');
$client->followRedirects(true);

$crawler = $client->request('GET', 'https://example.com/service/botdetectcaptcha?get=image&c=exampleCaptcha&t=508c5eaf74fd4858b0c9debafc319d67');
I have to send the request with the cookie to get the proper content:
https://example.com/service/botdetectcaptcha?get=image&c=exampleCaptcha&t=508c5eaf74fd4858b0c9debafc319d67
The result is a page that just points to the same URL again:
<html>
<head>
    <title>botdetectcaptcha (JPEG Image, 250 × 40 pixels)</title>
</head>
<body>
    <img src="https://example.com/service/botdetectcaptcha?get=image&c=exampleCaptcha&t=508c5eaf74fd4858b0c9debafc319d67" alt="https://example.com/service/botdetectcaptcha?get=image&c=exampleCaptcha&t=508c5eaf74fd4858b0c9debafc319d67">
</body>
</html>
In the browser it works fine. The issue is that when I fetch the image from this URL, the request goes out without the cookie, so the server generates a new image. That is why it does not work.
I have tried the below:
base64_encode(file_get_contents("https://example.com/service/botdetectcaptcha?get=image&c=exampleCaptcha&t=508c5eaf74fd4858b0c9debafc319d67"));
The above sends the GET request without the cookie, which is why the received image does not work for me.
Answer
I got it working with file_get_contents by sending the same client information (headers and cookie) that I send in the Goutte Client:
$url = "https://example.com/service/botdetectcaptcha?get=image&c=exampleCaptcha&t=9d15db63ddc449f1850aad6e3183ce2e";

// $cookie_value must be in name=value form, the same string sent in the Goutte Client's Cookie header
$options = array(
    'http' => array(
        'method' => "GET",
        // Header lines are separated by "\r\n"; check function.stream-context-create on php.net
        'header' => "Accept-language: en\r\n" .
            "Cookie: " . $cookie_value . "\r\n" .
            // The Host header takes the bare host name, not a full URL
            "Host: example.com\r\n" .
            "User-Agent: Mozilla/5.0 (Windows NT 6.2; Win64; x64; rv:80.0) Gecko/20100101 Firefox/80.0\r\n"
    )
);

$context = stream_context_create($options);
$img_base64 = base64_encode(file_get_contents($url, false, $context));
file_put_contents('img/img_9d15db63ddc449f1850aad6e3183ce2e.png', base64_decode($img_base64));
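Concatenating the header string by hand makes it easy to lose a "\r\n" separator. As a rough sketch only, the same options array could be built from an associative array of headers. The function name build_http_options and the placeholder session value are my own assumptions and not part of the code above; the block makes no network request.

```php
<?php
// Hypothetical helper (not from the answer above): builds the options array
// expected by stream_context_create() from an associative array of headers.
function build_http_options(array $headers, string $method = 'GET'): array
{
    $lines = [];
    foreach ($headers as $name => $value) {
        // Each header becomes one "Name: value" line.
        $lines[] = $name . ': ' . $value;
    }

    return [
        'http' => [
            'method' => $method,
            // Lines are joined with CRLF, with a trailing CRLF after the last one.
            'header' => implode("\r\n", $lines) . "\r\n",
        ],
    ];
}

// Placeholder cookie value (assumption): substitute the real JSESSIONID
// taken from the session that rendered the captcha.
$options = build_http_options([
    'Accept-Language' => 'en',
    'Cookie'          => 'JSESSIONID=placeholder-session-id',
    'User-Agent'      => 'Mozilla/5.0 (Windows NT 6.2; Win64; x64; rv:80.0) Gecko/20100101 Firefox/80.0',
]);

$context = stream_context_create($options);
```

The resulting $context can then be passed as the third argument of file_get_contents($url, false, $context), exactly as in the answer's code.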