Tag: web-scraping

In Symfony Panther, waitFor() throws an exception when it times out while scraping; I need it to continue if the item is not found

I have a database of clinics and a URL for each clinic. All clinic pages are the same in terms of HTML/CSS, with different content to scrape. However, some clinics have no content on their page, and this causes trouble for me. I have: If .facility is not present, waitFor() throws an exception because of the timeout. I need to

How to crawl a page in PHP?

I get the error: "error code: 1020". The page I'm trying to crawl for form data is: https://v2.gcchmc.org/medical-status-search/. This is my code: $initial = file_get_contents('https://v2.gcchmc.org/medical-status-search/'); $check = preg_replace('/.+?input type="hidden" name="csrfmiddlewaretoken" value="(.+?)".*/sim', '$1'. $initial); print $check; Can you help me find what's wrong with this code?

How to get price value with regular expressions

I am trying to write a crawler for an online store, and now I need to get the price value from the webpage. Here is my attempt: Basically, $html holds the source code of the webpage, and the price value is stored in the document like this: <div class="c-product__seller-price-pure js-price-value">10,699,000</div> But when I run this I get this as the result:
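
For the markup quoted in this excerpt, one possible approach is sketched below. It is only an illustration under stated assumptions: the asker's own code is not shown here, the helper name extractPrice() is hypothetical, and the sample $html string simply repeats the single div from the question. The pattern assumes the price is written with ASCII digits and comma separators.

```php
<?php
// Assumed sample input: the markup quoted in the excerpt above.
$html = '<div class="c-product__seller-price-pure js-price-value">10,699,000</div>';

/**
 * Return the price found in the first element whose class attribute
 * contains "js-price-value", as an integer, or null if nothing matches.
 * extractPrice() is an illustrative helper, not part of any library.
 */
function extractPrice(string $html): ?int
{
    // Find a class attribute containing js-price-value, skip to the end of
    // the opening tag, then capture the run of digits and commas inside it.
    $pattern = '/class="[^"]*\bjs-price-value\b[^"]*"[^>]*>\s*([\d,]+)\s*</';

    if (preg_match($pattern, $html, $matches) !== 1) {
        return null;
    }

    // Remove the thousands separators before converting to an integer.
    return (int) str_replace(',', '', $matches[1]);
}

var_dump(extractPrice($html));
```

Matching on the js-price-value class rather than the full class string keeps the pattern working if the other class names change. For anything beyond a single, well-known element, an HTML parser such as PHP's built-in DOMDocument is usually a sturdier choice than a regular expression.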

Extract links from a list of urls

I am trying to extract all the links from a set list of URLs in a text file and save the extracted links in another text file. I am trying to use the script below, which was originally meant to extract emails: I changed the email extraction part to extract links like this: Here is the full code:

Getting a nested element with PHP HTML Simple Dom

I want the commented items from an external website. I cannot edit the website. The website looks like this; I edited a lot of things out, but this is the path from the body: I am using PHP HTML Simple Dom and PHP 7.3. I am currently using this code to get the information from the website: I get an