I’m using a regular expression to parse the URL and find the locale on my website. This is my code:
<?php
$app_conf = require_once __DIR__ . '/../config/app.php';
function extract_lang($avail)
{
    $uri_lang = [];
    if (preg_match('/^(\/)+([a-z]{2})(\/+.*)?/', $_SERVER['REQUEST_URI'], $uri_lang)) {
        if (in_array($uri_lang[2], $avail)) {
            $_SERVER['REQUEST_URI'] = isset($uri_lang[3]) ? $uri_lang[3] : "/";
            $_SERVER['HTTP_LANG'] = $uri_lang[2];
        }
    }
}
if ($app_conf['extract_from_uri']) {
    extract_lang($app_conf['locales']);
}
It works most of the time, but it has a bug. If the given URL path starts with ‘en’, it treats that as a locale and breaks my application’s logic. An example route that triggers the bug:
https://m2.test/environmental_projects
I need to update my regular expression somehow and I’m struggling with it, please help. In the locales config I have this array:
'locales' => ['en', 'ru']
A valid route should look like this:
https://m2.test/en/environmental_projects
Answer
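The current pattern matches /en at the start of /environmental_projects because nothing forces the match to stop after the two letters. You could match a single forward slash, capture 2 chars a-z in the first group, and then make group 2 optional, matching a forward slash followed by any chars except a newline. Ending the pattern with the anchor $ means the two letters must be followed by either a slash or the end of the string.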
Note that there are now 2 capturing groups instead of 3, so the locale is in group 1 and the rest of the path is in group 2. If you change the delimiter to a char other than /, for example ~, you don’t have to escape the forward slash.
^/([a-z]{2})(/.*)?$
For example
if (preg_match('~^/([a-z]{2})(/.*)?$~', $_SERVER['REQUEST_URI'], $uri_lang)) {
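Because the group numbers shift with this pattern, the rest of the function needs adjusting too. Here is a minimal sketch of how that might look. The helper name split_locale and its return shape are assumptions made for illustration, and it takes the URI as a parameter instead of writing to $_SERVER, which makes it easier to test:

```php
<?php
// Hypothetical helper (not from the original code): returns
// [locale, remaining path], or null when the URI does not start
// with one of the supported two-letter locale segments.
function split_locale(string $uri, array $avail): ?array
{
    // ^/([a-z]{2})  one slash, then exactly two lowercase letters (group 1)
    // (/.*)?$       then either the end of the string, or a slash plus the rest (group 2)
    if (preg_match('~^/([a-z]{2})(/.*)?$~', $uri, $m) && in_array($m[1], $avail, true)) {
        // Group 2 is absent from $m when the URI is just "/en", so fall back to "/".
        return [$m[1], $m[2] ?? '/'];
    }
    return null;
}

$locales = ['en', 'ru'];
$ok  = split_locale('/en/environmental_projects', $locales); // ['en', '/environmental_projects']
$bad = split_locale('/environmental_projects', $locales);    // null, "en" is not a full segment
```

In extract_lang you would then read the locale from $uri_lang[1] and the remaining path from $uri_lang[2] instead of indexes 2 and 3.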