I'm scraping data from a website. To scrape further, I need to solve a captcha, which I'm thinking of handing to the user to solve. The site is written in PHP, and after some digging I found that the captcha is generated with PHP-GD. I need to fetch it as an image, but the URL returns data that I don't know how to process.
JavaScript
The URL looks something like: <img src="www.some.urk/captcha.php" />
I followed the URL in the img tag, but I don't understand how to construct an image from the data it returns.
Here is the data I receive from the URL:
HEADER:
JavaScript
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9
Cache-Control: max-age=0
Connection: keep-alive
Cookie: PHPSESSID=8o58tnqgupo4h5si8499nij5m6
BODY:
JavaScript
[binary] JFIF [binary] CREATOR: gd-jpeg v1.0 (using IJG JPEG v90), quality = 80
[remaining binary JPEG data, displayed as unreadable characters, omitted]
I tried this, but did not get the expected result; you can see it here: link 1
Answer
After searching through some libraries, I found the following solution using the JIMP library:
JavaScript
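const request = require('request');
const cheerio = require('cheerio');
const Jimp = require('jimp');

// Placeholder: address of the page that contains the captcha image.
const url = 'http://www.some.urk/';

function hello() {
  request(url, function (error, response, html) {
    if (error) {
      console.log(error);
      return;
    }
    const $ = cheerio.load(html);
    // Resolve the captcha's src attribute against the page URL.
    const imgSrc = $('img.control-label').attr('src');
    const imgUrl = new URL(imgSrc, url).href;
    // Send back the session cookie set by the page; the captcha belongs to that PHPSESSID.
    const cookie = (response.headers['set-cookie'] || [])
      .map(c => c.split(';')[0])
      .join('; ');
    console.log(imgUrl);

    Jimp.read({
      url: imgUrl, // Required!
      headers: { Cookie: cookie },
    })
      .then(image => {
        // Do stuff with the image.
        console.log('success: ' + image);
        image.write('path.png');
      })
      .catch(err => {
        console.log(err);
      });
  });
}

hello();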
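As a side note, the body shown in the question appears to be the JPEG file itself (the JFIF marker and the gd-jpeg comment are part of the file), so nothing needs to be reconstructed as long as the bytes are kept as binary and not decoded as text. With the request library, passing `encoding: null` returns the body as a Buffer. Below is a minimal sketch of the two supporting steps, assuming the body has already been fetched as a Buffer; the helper names `cookieHeader`, `isJpeg` and `saveCaptcha` are made up for illustration and are not part of any library.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Turn Set-Cookie response values into a single Cookie request header,
// keeping only the name=value part of each cookie.
function cookieHeader(setCookie) {
  return (setCookie || []).map((c) => c.split(";")[0].trim()).join("; ");
}

// JPEG data always starts with the bytes FF D8 FF.
function isJpeg(buf) {
  return buf.length >= 3 && buf[0] === 0xff && buf[1] === 0xd8 && buf[2] === 0xff;
}

// Write the raw bytes to disk unchanged; throws if they are not JPEG.
function saveCaptcha(buf, file) {
  if (!isJpeg(buf)) throw new Error("response body is not a JPEG image");
  fs.writeFileSync(file, buf);
  return file;
}

// Demo with stand-in bytes (a JPEG signature followed by "JFIF"),
// not a real captcha response.
const sample = Buffer.from([0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10, 0x4a, 0x46, 0x49, 0x46]);
const out = saveCaptcha(sample, path.join(os.tmpdir(), "captcha.jpg"));
console.log(cookieHeader(["PHPSESSID=abc123; path=/"]), out);
```

Checking the signature first helps catch the case where the server returns an HTML error page (for example because the session cookie was missing) in place of the image.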