I'm scraping data from a website. To scrape further, I need to get past a captcha, which I plan to show to the user to solve. The site is written in PHP and, after some digging, I found that it generates the captcha with PHP-GD. I need to fetch the captcha as an image, but the URL returns data that I don't know how to handle.
The URL looks something like: <img src="www.some.urk/captcha.php" />
I followed the URL from the img tag, but I don't understand the response or how to construct an image from it.
Here is the data I receive from the URL:
HEADER:
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9
Cache-Control: max-age=0
Connection: keep-alive
Cookie: PHPSESSID=8o58tnqgupo4h5si8499nij5m6
BODY:
����JFIF��;CREATOR: gd-jpeg v1.0 (using IJG JPEG v90), quality = 80 �� [remaining binary JPEG data omitted]
I tried this, but did not get the expected result; you can see it here: link 1
Answer
After trying a few libraries, I found the following solution using the Jimp library:
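const request = require('request');
const cheerio = require('cheerio');
const Jimp = require('jimp');

// Placeholder: the page that contains the captcha <img> tag.
const url = 'http://www.some.urk/';

function hello() {
  request(url, function (error, response, html) {
    if (error) {
      console.log(error);
      return;
    }
    const $ = cheerio.load(html);
    const img_url = $('img.control-label').attr('src');
    console.log(url + img_url);

    // Send the session cookie back so the captcha matches this session.
    // Set-Cookie is an array; keep only the name=value part of each entry.
    const cookie = (response.headers['set-cookie'] || [])
      .map(c => c.split(';')[0])
      .join('; ');

    Jimp.read({
      url: url + img_url, // Required!
      headers: { Cookie: cookie },
    })
      .then(image => {
        // Do stuff with the image.
        console.log('success: ' + image);
        image.write('path.png');
      })
      .catch(err => {
        console.log(err);
      });
  });
}

hello();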
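For anyone wondering why the body in the question looks like garbage: it appears to be the raw bytes of a JPEG (note the JFIF and gd-jpeg markers) printed as text. Below is a minimal sketch of how one might check and save such a response without an image library. The helper names (looksLikeJpeg, toCookieHeader, saveCaptcha) are hypothetical and are not part of Jimp, request, or cheerio; the sketch assumes the response body is already available as a Node.js Buffer.

```javascript
const fs = require('fs');

// A JPEG stream starts with the bytes FF D8 FF; the "JFIF" text
// seen in the question follows a few bytes later.
function looksLikeJpeg(buffer) {
  return Buffer.isBuffer(buffer) &&
    buffer.length >= 3 &&
    buffer[0] === 0xff &&
    buffer[1] === 0xd8 &&
    buffer[2] === 0xff;
}

// Build a Cookie request header value from an array of Set-Cookie
// response headers, keeping only the name=value part of each one.
function toCookieHeader(setCookie) {
  return (setCookie || [])
    .map(function (cookie) { return cookie.split(';')[0].trim(); })
    .join('; ');
}

// Write the bytes to disk unchanged. Decoding them as a string first
// would corrupt the image, which is what the garbled output suggests.
function saveCaptcha(buffer, filePath) {
  if (!looksLikeJpeg(buffer)) {
    throw new Error('Response body does not look like a JPEG');
  }
  fs.writeFileSync(filePath, buffer);
}
```

With the request library, passing encoding: null in the options makes the body arrive as a Buffer instead of a string, so it can be handed to saveCaptcha directly. The value returned by toCookieHeader(response.headers['set-cookie']) can be sent as the Cookie header on the captcha request so that the image belongs to the same PHP session.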