I would like to detect corrupted PDFs using PHP. I have determined that an uncorrupted PDF has the marker "%%EOF" at the end of the file. I also checked for this marker in corrupted files, and it does not appear.
My idea is to automatically check the validity of each PDF file before uploading it to my server.
<?php
$file = file('good.pdf');
$endfile = $file[count($file) - 1];
echo gettype($endfile), "\n";
echo $endfile, "\n";
?>
I get this result
string
%%EOF
For now, everything seems to be fine, but I have an issue when comparing the results.
I tested this code
<?php
$file = file('good.pdf');
$endfile = $file[count($file) - 1];
$n = "%%EOF";
echo $endfile;
echo $n;
if ($endfile === $n) {
    echo "good";
} else {
    echo "corrupted";
}
?>
I get this result
%%EOF %%EOF corrupted
I know that $endfile and $n are both strings, but when I compare them they never match. I also tried ==, but the result is the same.
I also tried it like this:
<?php
$file = file('good.pdf');
$endfile = $file[count($file) - 1];
$var1val = $endfile;
$var2val = "%%EOF";
echo $var2val;
echo $var1val;
$n = strcmp($var1val, $var2val); // 0 means that they are the same
echo $n;
if ($n == 0) {
    echo "good";
} else {
    echo "corrupted";
}
?>
but I get this result:
%%EOF %%EOF 1 corrupted
It gave me the same result with ===.
I have only tested with a working, uncorrupted PDF. Do you know why this is not working? Are there other ways in PHP to check that a PDF is not corrupted before I automatically upload it to my server?
Answer
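From the manual page for file() (http://php.net/manual/en/function.file.php):
"Returns the file in an array. Each element of the array corresponds to a line in the file, with the newline still attached."
So the last element is most likely "%%EOF" followed by a newline character, which is why it never compares equal to "%%EOF" even though the two look identical when echoed. You need to remove the trailing newline before comparing, for example with trim():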
<?php
$file = file('good.pdf');
// trim() strips the trailing newline that file() leaves on each line
$endfile = trim($file[count($file) - 1]);
$n = "%%EOF";
if ($endfile === $n) {
    echo "good";
} else {
    echo "corrupted";
}
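As for other ways to check the file: file() loads every "line" of what is really a binary file into memory, and some PDFs have extra whitespace or bytes after the marker. A minimal sketch of an alternative, assuming it is enough to confirm that the file starts with the "%PDF-" header and has "%%EOF" somewhere near the end, could look like this. The function name looksLikeCompletePdf and the 1024-byte tail window are assumptions made for this example, and the check is only a heuristic: it can catch truncated uploads, but it does not prove that the PDF is otherwise valid.

```php
<?php
// Heuristic sketch, not a full PDF validator. The function name and the
// default 1024-byte tail window are assumptions made for this example.
function looksLikeCompletePdf(string $path, int $tailBytes = 1024): bool
{
    $size = @filesize($path);
    if ($size === false || $size < 8) {
        return false; // missing, unreadable, or too short to be a PDF
    }

    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }

    // A PDF begins with a header such as "%PDF-1.4".
    $head = fread($handle, 5);

    // Read only the end of the file instead of loading every line.
    fseek($handle, -min($size, $tailBytes), SEEK_END);
    $tail = stream_get_contents($handle);
    fclose($handle);

    // Searching the tail tolerates newlines or padding after the marker.
    return $head === '%PDF-' && strpos($tail, '%%EOF') !== false;
}
```

It could then be called as looksLikeCompletePdf('good.pdf') before the upload step, treating a false result as "corrupted".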