wkhtmltopdf without creating a file in php

I have the wkhtmltopdf module on my Drupal site, which generates the PDF file by running the command 'wkhtmltopdf --options URL filename.pdf' through the shell_exec function.

The generated file is fine, but I don’t want to store the PDF in the file system. I just want to show the output in the browser so the user can choose whether or not to download it.

As far as I could find, there is no way to get the output in a buffer rather than storing it in a PDF file. Is it possible to generate a PDF with wkhtmltopdf without creating a file?

Answer

Here is an over-engineered piece of code I wrote just for you 🙂
It includes everything from the function to the demo form you can test out.

You are free to check this code out and modify it for your own use, but I can’t guarantee its stability or security.

Read the documentation for functions such as shell_exec, and about why using them is considered bad practice because of the potential security risks.
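
To make that risk concrete, here is a minimal sketch of quoting user input before it reaches the shell. The helper name build_pdf_command is hypothetical (it is not part of wkhtmltopdf or Drupal), and the quoting shown in the comment assumes a Unix-like host, since escapeshellarg behaves differently on Windows.

```php
<?php

/**
 * Hypothetical helper: builds a shell command in which every
 * user-supplied value is quoted by escapeshellarg().
 */
function build_pdf_command(string $url, string $outfile): string {
    return "wkhtmltopdf " . escapeshellarg($url) . " " . escapeshellarg($outfile);
}

// Without quoting, the "; id" part would run as a second command.
echo build_pdf_command("https://example.com/; id", "/tmp/out.pdf"), PHP_EOL;
// On a Unix-like host this prints:
// wkhtmltopdf 'https://example.com/; id' '/tmp/out.pdf'
```

Quoting only keeps the shell from interpreting the value. It does not decide whether the URL is one you want your server to fetch.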

My recommendation is to write a PHP extension in C++, load it, and use it from PHP.

I am not sure whether one exists for wkhtmltopdf; someone in the comments, please correct me if I’m wrong.


Update 1

I tested this script on http://ifconfig.me and it returns a malformed PDF document.
So you have perhaps three choices: write a PHP library in C++, wait for someone to come up with a better solution, or just write the file into /tmp, read it using PHP, and then delete it.
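
Another possibility, offered as a sketch rather than a tested fix: exec() hands back the output split into lines with trailing whitespace stripped, which can corrupt binary data such as a PDF, so that may be why the document came out malformed. Reading the process's stdout as a raw stream avoids the line splitting. The helper name capture_stdout is an invention for illustration, and the array form of proc_open requires PHP 7.4 or later. The usage comment assumes wkhtmltopdf accepts `-` as the output file (as the over-engineered code does) along with its `--quiet` flag.

```php
<?php

/**
 * Hypothetical helper: run a command without a shell and return
 * its raw stdout, or null if it could not start or failed.
 */
function capture_stdout(array $cmd): ?string {
    // Only stdout is piped; stderr is inherited from the PHP process,
    // so a chatty child cannot block on a full stderr pipe.
    $spec = [1 => ["pipe", "w"]];

    $proc = proc_open($cmd, $spec, $pipes);
    if (!is_resource($proc)) {
        return null;
    }

    // Read every byte exactly as written (no line splitting).
    $out = stream_get_contents($pipes[1]);
    fclose($pipes[1]);

    $code = proc_close($proc);

    return ($code === 0 && $out !== false) ? $out : null;
}

// Usage (assumes wkhtmltopdf is installed and on PATH):
// $pdf = capture_stdout(["wkhtmltopdf", "--quiet", $url, "-"]);
// if ($pdf !== null) {
//     header("Content-Type: application/pdf");
//     header("Content-Disposition: inline");
//     header("Content-Length: " . strlen($pdf));
//     echo $pdf;
// }
```

Because the command is passed as an array, no shell is involved and the URL needs no shell escaping, although it should still be validated.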

Code (Simple)

<?php

/**
 * --- DO NOT REMOVE THIS DOCBLOCK ---
 * @WebCrawlTrackingId cf9e8c67.3cb7269c.60b1d84b.5b2e5450
 */

/**
 * @file
 * Code for ni_wkhtmltopdf_simple function.
 * Includes a demonstration at the end.
 */

/**
 * Function that saves a PDF file
 * to a temporary directory and
 * returns it.
 * All of this by using wkhtmltopdf.
 *
 * @author t3ap0t@stackoverflow.com
 *
 * @param string $url
 *     URL to convert
 *
 * @param string|false $download
 *     Decide whether to download the
 *     file by specifying a filename
 *     or don't specify anything to
 *     display it in the browser's
 *     built-in PDF viewer.
 *
 * @return int|file
 *     Return (int) -1 if URL is empty
 *     Return (int) -2 if URL is not a string
 *     Return (int) -3 if URL is not a URL
 */
function ni_wkhtmltopdf_simple($url = "", $download = false) {
    // URL can't be empty
    if ($url == "") {
        return -1;
    }

    // URL must be a string
    if (gettype($url) !== "string") {
        return -2;
    }

    // Remove whitespace
    $url = trim($url);

    // Explode URL by ':' to Array
    $urla = explode(":", $url, 2);

    // URL must be an actual URL (the count check avoids reading a missing index)
    if (count($urla) < 2 || strtolower(substr($urla[0], 0, 4)) !== "http" || substr($urla[1], 0, 2) !== "//") {
        return -3;
    }
    
    // Escape Shell Arguments
    $url = escapeshellarg($url);

    // Random file name
    $fname = "/tmp/" . bin2hex(random_bytes(10)) . ".pdf";

    // Generate a PDF file ($url was already quoted by escapeshellarg)
    shell_exec("wkhtmltopdf $url " . escapeshellarg($fname));

    // Load file
    $buffer = file_get_contents("$fname");
    
    // Delete the file after loading
    unlink("$fname");

    $buffsz = strlen($buffer);

    // Prepare headers
    header("Content-Type:application/pdf");

    if ($download) {
        $download = trim($download);
        header("Content-Disposition:attachment;filename=\"$download\"");
    } else {
        header("Content-Disposition:inline");
    }

    header("Content-Length:" . $buffsz);

    exit($buffer);
}

// Demonstrate ni_wkhtmltopdf_simple

// Are we getting the URL parameter?
if (isset($_GET["url"])) {
    // Convert array to string
    if (is_array($_GET["url"])) {
        $_GET["url"] = $_GET["url"][0];
    }
    
    // Remove whitespace
    $url = trim($_GET["url"]);

    // URL is empty so unset it
    if ($url == "") {
        unset($_GET["url"], $url);
        header("Location:" . basename(__FILE__));
    }

    // Get PDF output
    if (isset($url)) {
        ni_wkhtmltopdf_simple($url);
    }
} else {
?>
<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width">
    <title>PHP wkhtmltopdf_simple Demo (t3ap0t@stackoverflow.com)</title>
    <style>
        *{outline:0}
        html,body{
            zoom:1.25
        }
    </style>
</head>
<body>
    <form action="<?= basename(__FILE__) ?>" method="GET">
        <label for="url">URL:</label>
        <input id="url" name="url" type="text" value="https://" minlength="8" required autofocus />
        <button id="btn" type="submit">Generate PDF</button>

        <script type="text/javascript">
            function urlhandler(e) {
                // URL Value must begin with https://
                if (url.value.trim() == "") {
                    url.value = "https://" + url.value;
                }

                // Prevent removal of https://
                if (e.keyCode == 8 && url.value == "https://") {
                    e.preventDefault();
                }

                // Prevent Delete key
                if (e.keyCode == 46) {
                    e.preventDefault();
                }

                // Add https:// if it was removed during Paste operation
                if (url.value.substr(0, 8).toLowerCase() !== "https://") {
                    url.value = "https://" + url.value;
                }
            }

            function btnhandler(e) {
                if (url.value.substr(0, 8).toLowerCase() !== "https://") {
                    url.value = "https://" + url.value;
                }

                // Prevent submission of the form
                e.preventDefault();

                // Make sure we've provided a URL
                if (8 >= url.value.trim().length ||
                    url.value.trim()[9] == ".") {
                    alert("You must provide a URL.");
                    return;
                }
                
                // Automatically guess top-level domain
                if (url.value.trim().substr(-4, 1) !== "." &&
                    url.value.trim().substr(-3, 1) !== ".") {
                    url.value += ".com";
                }

                url.parentNode.submit();
            }
            
            // Event listeners
            url.addEventListener("keydown", function(e) {
                urlhandler(e);
            });
            
            url.addEventListener("paste", function(e) {
                urlhandler(e);
            });
            
            btn.addEventListener("click", function(e) {
               btnhandler(e);
            });
        </script>
    </form>
</body>
</html>
<?php
}
?>
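
The prefix check in ni_wkhtmltopdf_simple also lets through schemes that merely start with "http", such as "httpx". A stricter check can be sketched with parse_url; the function name is_http_url is hypothetical and only illustrates the idea.

```php
<?php

/**
 * Hypothetical helper: true only for URLs whose scheme is exactly
 * http or https and that contain a host.
 */
function is_http_url(string $url): bool {
    $parts = parse_url(trim($url));

    // parse_url returns false for seriously malformed input.
    if ($parts === false || !isset($parts["scheme"], $parts["host"])) {
        return false;
    }

    return in_array(strtolower($parts["scheme"]), ["http", "https"], true);
}

var_dump(is_http_url("https://example.com/page")); // bool(true)
var_dump(is_http_url("httpx://example.com"));      // bool(false)
var_dump(is_http_url("example.com"));              // bool(false)
```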

Code (Over-engineered)

<?php

/**
 * --- DO NOT REMOVE THIS DOCBLOCK ---
 * @WebCrawlTrackingId fcc5094e.ccc3a1df.5eb4dbfa.6c3772e1
 */

/**
 * @file
 * Code for ni_wkhtmltopdf function.
 * Includes a demonstration at the end.
 */

/**
 * Function that returns a PDF file
 * from a URL using wkhtmltopdf.
 *
 * @author t3ap0t@stackoverflow.com
 *
 * @param string $url
 *     URL to convert
 *
 * @param bool $https
 *     Ensures we're giving it HTTPS
 *
 * @param string|false $download
 *     Decide whether to download the
 *     file by specifying a filename
 *     or don't specify anything to
 *     display it in the browser's
 *     built-in PDF viewer.
 *
 * @param bool $checkcmd
 *     Ensure we have all commands
 *     required to fulfil the operation.
 *
 *     * On Windows hosts these commands
 *     can be acquired using `scoop`.
 *
 * @param bool $checkos
 *     Make sure we're running Linux.
 *
 *     * Optional if we have both commands
 *     available on a Windows host.
 *
 *
 * @return int|file
 *     Return (int) -1 if URL is empty
 *     Return (int) -2 if URL is not a string
 *     Return (int) -3 if URL is not a URL
 *     Return (int) -4 if protocol is not HTTPS
 *     Return (int) -5 if OS is not Linux
 *     Return (int) -6 if command wkhtmltopdf not found
 *     Return (int) -7 if command cat not found
 *     Return (int) -8 if wkhtmltopdf returned nothing
 */
function ni_wkhtmltopdf($url = "", $https = false, $download = false, $checkcmd = true, $checkos = false) {
    // URL can't be empty
    if ($url == "") {
        return -1;
    }

    // URL must be a string
    if (gettype($url) !== "string") {
        return -2;
    }

    // Remove whitespace
    $url = trim($url);

    // Explode URL by ':' to Array
    $urla = explode(":", $url, 2);

    // URL must be an actual URL (the count check avoids reading a missing index)
    if (count($urla) < 2 || strtolower(substr($urla[0], 0, 4)) !== "http" || substr($urla[1], 0, 2) !== "//") {
        return -3;
    }

    // Optional: Make sure the URL is HTTPS (Secure)
    if ($https && strtolower(substr($url, 0, 8)) !== "https://") {
        return -4;
    }

    // Optional: Check operating system
    if ($checkos && strtolower(PHP_OS) !== "linux") {
        return -5;
    }

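    // Optional: (Linux) Make sure the `wkhtmltopdf` command exists
    // (`which` prints the path, or nothing when the command is missing)
    if ($checkcmd && trim((string) shell_exec("which wkhtmltopdf")) === "") {
        return -6;
    }

    // Optional: (Linux) Make sure the `cat` command exists
    if ($checkcmd && trim((string) shell_exec("which cat")) === "") {
        return -7;
    }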

    // Clear URL to (hopefully) prevent RCE
    $rep = array(
        " "      => "%20",
        "%20%20" => "",
        "`"      => "%60",
        ";"      => "%3B",
        ":"      => "%3A",
        ">"      => "%3E",
        "<"      => "%3C",
        "["      => "%5B",
        "]"      => "%5D",
        "{"      => "%7B",
        "}"      => "%7D",
        "("      => "%28",
        ")"      => "%29",
        "|"      => "%7C",
        "$"      => "%24",
        "&&"     => "%26%26",
        '"'      => "%22",
        "\\"     => "%5C"
    );

    // Replace $a with $b inside URL
    foreach ($rep as $a => $b) {
        $url = str_replace($a, $b, $url);
    }

    unset($rep);

    // Generate a PDF file ("-" makes wkhtmltopdf write to stdout)
    exec("wkhtmltopdf \"$url\" - | cat", $buffer);

    $buffer = implode("\n", $buffer);

    $buffsz = strlen($buffer);

    // Is buffer empty?
    if (0 >= $buffsz) {
        return -8;
    }

    // Prepare headers
    header("Content-Type:application/pdf");

    if ($download) {
        $download = trim($download);
        header("Content-Disposition:attachment;filename=\"$download\"");
    } else {
        header("Content-Disposition:inline");
    }

    header("Content-Length:" . $buffsz);

    exit($buffer);
}

// Demonstrate ni_wkhtmltopdf

// Are we getting the URL parameter?
if (isset($_GET["url"])) {
    // Convert array to string
    if (is_array($_GET["url"])) {
        $_GET["url"] = $_GET["url"][0];
    }
    
    // Remove whitespace
    $url = trim($_GET["url"]);

    // URL is empty so unset it
    if ($url == "") {
        unset($_GET["url"], $url);
        header("Location:" . basename(__FILE__));
    }

    // Get PDF output
    if (isset($url)) {
        ni_wkhtmltopdf($url);
    }
} else {
?>
<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width">
    <title>PHP wkhtmltopdf Demo (t3ap0t@stackoverflow.com)</title>
    <style>
        *{outline:0}
        html,body{
            zoom:1.25
        }
    </style>
</head>
<body>
    <form action="<?= basename(__FILE__) ?>" method="GET">
        <label for="url">URL:</label>
        <input id="url" name="url" type="text" value="https://" minlength="8" required autofocus />
        <button id="btn" type="submit">Generate PDF</button>

        <script type="text/javascript">
            function urlhandler(e) {
                // URL Value must begin with https://
                if (url.value.trim() == "") {
                    url.value = "https://" + url.value;
                }

                // Prevent removal of https://
                if (e.keyCode == 8 && url.value == "https://") {
                    e.preventDefault();
                }

                // Prevent Delete key
                if (e.keyCode == 46) {
                    e.preventDefault();
                }

                // Add https:// if it was removed during Paste operation
                if (url.value.substr(0, 8).toLowerCase() !== "https://") {
                    url.value = "https://" + url.value;
                }
            }

            function btnhandler(e) {
                if (url.value.substr(0, 8).toLowerCase() !== "https://") {
                    url.value = "https://" + url.value;
                }

                // Prevent submission of the form
                e.preventDefault();

                // Make sure we've provided a URL
                if (8 >= url.value.trim().length ||
                    url.value.trim()[9] == ".") {
                    alert("You must provide a URL.");
                    return;
                }
                
                // Automatically guess top-level domain
                if (url.value.trim().substr(-4, 1) !== "." &&
                    url.value.trim().substr(-3, 1) !== ".") {
                    url.value += ".com";
                }

                url.parentNode.submit();
            }
            
            // Event listeners
            url.addEventListener("keydown", function(e) {
                urlhandler(e);
            });
            
            url.addEventListener("paste", function(e) {
                urlhandler(e);
            });
            
            btn.addEventListener("click", function(e) {
               btnhandler(e);
            });
        </script>
    </form>
</body>
</html>
<?php
}
?>
User contributions licensed under: CC BY-SA