I'm using a regular expression to parse the URL and find the locale on my website. This is my code:
<?php
$app_conf = require_once __DIR__ . '/../config/app.php';

function extract_lang($avail)
{
    $uri_lang = [];
    if (preg_match('/^(\/)+([a-z]{2})(\/+.*)?/', $_SERVER['REQUEST_URI'], $uri_lang)) {
        if (in_array($uri_lang[2], $avail)) {
            $_SERVER['REQUEST_URI'] = isset($uri_lang[3]) ? $uri_lang[3] : "/";
            $_SERVER['HTTP_LANG'] = $uri_lang[2];
        }
    }
}

if ($app_conf['extract_from_uri']) {
    extract_lang($app_conf['locales']);
}
It works most of the time, but it has a bug: if the URL path starts with 'en', it is treated as a locale, which breaks my application's logic. An example route that triggers the bug:
https://m2.test/environmental_projects
I need to update my regular expression somehow, and I'm struggling with it. Please help. In the locales config I have this array:
'locales' => ['en', 'ru']
A valid route should look like this:
https://m2.test/en/environmental_projects
Answer
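You could match a single forward slash, capture two chars a-z in the first group, and then make the second group optional, matching a forward slash followed by any chars except a newline, ending with the anchor $. Because the two letters must now be followed by either a slash or the end of the string, /environmental_projects no longer matches.
Note that there are now 2 capturing groups instead of 3, so the locale is in group 1 and the rest of the path is in group 2. If you change the delimiter to a char other than /, for example ~, you don't have to escape the forward slash.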
^/([a-z]{2})(/.*)?$
See a regex demo
For example
if (preg_match('~^/([a-z]{2})(/.*)?$~', $_SERVER['REQUEST_URI'], $uri_lang)) {
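Putting that line back into the function from the question, the group indices shift down by one. The following is only a sketch of how that could look, not a drop-in replacement: split_locale is a hypothetical helper introduced here so the matching can be checked without touching $_SERVER, and it assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not in the original post): returns [locale, remaining path],
// or null when the URI does not start with a supported locale segment.
function split_locale(string $uri, array $avail): ?array
{
    $m = [];
    // Group 1: two lowercase letters. Group 2 (optional): a slash plus the rest.
    if (preg_match('~^/([a-z]{2})(/.*)?$~', $uri, $m) && in_array($m[1], $avail, true)) {
        // Group 2 is missing from $m when the URI is just "/en", so fall back to "/".
        return [$m[1], $m[2] ?? '/'];
    }
    return null;
}

// Same side effects as the question's extract_lang(), using groups 1 and 2.
function extract_lang(array $avail): void
{
    $parts = split_locale($_SERVER['REQUEST_URI'] ?? '', $avail);
    if ($parts !== null) {
        $_SERVER['HTTP_LANG'] = $parts[0];
        $_SERVER['REQUEST_URI'] = $parts[1];
    }
}
```

With 'locales' => ['en', 'ru'], /en/environmental_projects is split into en and /environmental_projects, while /environmental_projects is left untouched.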