Can a PDF file have 0 pages defined, or otherwise result in a page size of 0?

I have a PHP script using Imagick, but there is a risk of a NAN error if a PDF file provided by a user contains no pages, or has a page with no height or no width. I am not sure whether this is possible in a PDF structure. Also, making a JPEG from a page number larger than the total number of pages will cause an error. Is it generally possible that a valid PDF file wrapper is sent without any actual page content?

The core question: How can we count and measure pages for a proper error capture before entering the conversion from PDF to JPEG?

In the function below I assume it might be possible to have 0 height or 0 width, and I use the code if($imH==0){$imH=1;} to guard against it, but having code based on an assumption doesn’t feel right.

Parts of the function were adapted from a gist by umidjons: https://gist.github.com/umidjons/11037635

PHP code:


Call the function like this, for example:



Answer

Sure. A PDF file is a container format that can hold pretty much anything, including nothing but metadata and 0 pages. Even with a well-formed document, though, this code makes it quite possible to request a thumbnail for page 21 of a document that contains only 5 pages.

If that happens, the problem will occur on the line that loads the requested page from the PDF.

This will throw an exception if the provided page does not exist. You can catch that exception and handle it however you want:

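As a minimal sketch of that pattern, assuming the page is selected with ImageMagick's filename[index] syntax and that $page is zero-based: the function name pdfPageToJpeg and its parameters are hypothetical and not taken from the question's code.

```php
<?php
/**
 * Hypothetical helper: render one PDF page to a JPEG file.
 * Returns true on success, false if the page could not be read or written.
 */
function pdfPageToJpeg(string $source, int $page, string $target): bool
{
    $im = new Imagick();
    try {
        // "[n]" selects a single zero-based page; a missing page or an
        // unreadable file makes Imagick throw an ImagickException.
        $im->readImage($source . '[' . $page . ']');
        $im->setImageFormat('jpeg');
        return $im->writeImage($target);
    } catch (ImagickException $e) {
        // Handle the failure however you want; here it is only logged.
        error_log('PDF page ' . $page . ' not readable: ' . $e->getMessage());
        return false;
    } finally {
        // Runs on both paths and releases the resources held by the object.
        $im->clear();
    }
}
```

Returning false instead of rethrowing is just one option; the caller could equally receive the exception message or a custom error code.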

If you want to read the number of pages beforehand, you can try to let Imagick parse the document first:

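A rough sketch of such a page count might look as follows. The helper name countPdfPages is hypothetical, and using pingImage() rather than readImage() is an assumption made here to avoid loading full pixel data; either call should populate the object so that getNumberImages() can be queried.

```php
<?php
/**
 * Hypothetical helper: number of pages Imagick sees in a PDF.
 * Lets an ImagickException propagate if the file cannot be parsed.
 */
function countPdfPages(string $source): int
{
    $im = new Imagick();
    // pingImage() collects basic attributes only; readImage() could be
    // used instead at the cost of rasterizing every page.
    $im->pingImage($source);
    $count = $im->getNumberImages(); // for a PDF: pages, not embedded images
    $im->clear();
    return $count;
}
```

With a zero-based $page, the request is out of range whenever $page < 0 or $page >= countPdfPages($source).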

The function name (Imagick::getNumberImages()) is a bit misleading, see this comment in the PHP manual:

“For PDFs this function indicates the number of pages on the PDF, NOT images that might be embedded within the PDF.”

Here as well, if the PDF document is invalid in some way, this can throw an exception so you might want to catch that and handle it:

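One way to combine these checks before starting the conversion is sketched below. It assumes zero-based page numbers; checkPdfPage and isPageInRange are hypothetical names, and the width and height test is only a guess at how a degenerate page would show up, so treat it as a starting point rather than a verified behavior.

```php
<?php
/**
 * Hypothetical pre-check: is zero-based page $page of $source usable?
 * Returns null when everything looks fine, otherwise an error message.
 */
function checkPdfPage(string $source, int $page): ?string
{
    $im = new Imagick();
    try {
        $im->pingImage($source);          // throws on an unreadable document
        $pages = $im->getNumberImages();
        if (!isPageInRange($page, $pages)) {
            return "page $page requested, document has $pages page(s)";
        }
        $im->setIteratorIndex($page);     // move to the requested page
        if ($im->getImageWidth() < 1 || $im->getImageHeight() < 1) {
            return "page $page has no usable width or height";
        }
        return null;
    } catch (ImagickException $e) {
        return 'unreadable PDF: ' . $e->getMessage();
    } finally {
        $im->clear();
    }
}

/** Pure range test, kept separate so it can be checked without Imagick. */
function isPageInRange(int $page, int $pageCount): bool
{
    return $page >= 0 && $page < $pageCount;
}
```

Because the result is an explicit message, the caller can reject the upload up front instead of patching a zero height to 1 and hoping the later arithmetic holds.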
User contributions licensed under: CC BY-SA