
PHP find the charset set by setlocale

All my websites use UTF-8 as their charset, but I run into problems when I call setlocale and read localized month and weekday names from strftime.

  1. issue: Localizing month names does not work on the server.
  2. issue: How can I tell whether the locale strings are in UTF-8 or ISO-8859-1?

Concretely, I set the locale like this:

$locales = ['de_DE.utf-8/utf-8', 'de_DE@euro/utf-8', 'de_DE', 'de-DE', 'german', 'de', 'ge'];
$locale = setlocale(LC_ALL, $locales);

On my development system (Windows 10, XAMPP) it finds and sets $locale = ‘de-DE’.
On my server (Linux, Apache) it finds and sets $locale = ‘de_DE’.

Because PHP's built-in DateTime class doesn't support localized names, I created a class that extends DateTime:

class DateTimeExt extends DateTime {

    private function isUTF8() : bool {
        $locale = setlocale(LC_ALL, null);
        if (!$locale) return false;
        $locale = strtoupper($locale);
        return strpos($locale, 'UTF8') !== false || strpos($locale, 'UTF-8') !== false;
    }

    public function weekdayName() : string {
        $weekday = strftime('%A', $this->getTimestamp());
        if (!$weekday) return 'unknown';
        return $this->isUTF8() ? $weekday : utf8_encode($weekday);
    }

    public function monthName() : string {
        $month = strftime('%B', $this->getTimestamp());
        if (!$month) return 'unknown';
        return $this->isUTF8() ? $month : utf8_encode($month);
    }
}

The test with:

$date = new DateTimeExt('2021-03-02');
echo $date->weekdayName().'<br />';
echo $date->monthName().'<br />';

results on the dev environment:

// Dienstag
// März

Both are correct. Without the UTF-8 encoding it would give back:

// Dienstag
// M�rz

But on the server it results in:

// Dienstag
// March

WTF??? The server can localize the weekday but not the month, and falls back to English? Why is that? This is issue No. 1.

Issue No. 2 is detecting whether the current locale uses UTF-8. My function isUTF8 just gets the current locale and searches it for the patterns 'UTF8' and 'UTF-8'. If one of them matches, I assume UTF-8; otherwise I assume ISO-8859-1 and convert the string to UTF-8. I don't think this is the smartest way to do it, and it is probably quite error-prone. Is there a better way?


Answer

Locales are system-dependent. Make sure that the locales you want show up in the output of locale -a.
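The same check can be made from inside PHP. The following is only a sketch: probeLocale is a hypothetical helper name, and it assumes nl_langinfo() is available, which is not the case on Windows. It reports which candidate setlocale() accepts and which codeset the C library then reports. One detail worth checking against the PHP manual: setlocale() documents the string "0" as the way to query the current locale without changing it, while an empty string (which is what null is converted to) sets the locale from the environment. That may be relevant to the isUTF8() method in the question.

```php
<?php
// Hypothetical helper, not part of the original answer.
// Returns the first accepted locale name and the codeset it reports (or nulls).
function probeLocale(array $candidates): array
{
    // "0" only queries the current locale; it does not change it.
    $previous = setlocale(LC_ALL, '0');

    // setlocale() returns the first candidate the system accepts, or false.
    $chosen = setlocale(LC_ALL, $candidates);

    // nl_langinfo() is not implemented on Windows, so guard the call.
    $codeset = null;
    if ($chosen !== false && function_exists('nl_langinfo')) {
        $codeset = nl_langinfo(CODESET) ?: null;
    }

    // setlocale() is process-wide, so put the previous setting back.
    if ($previous !== false) {
        setlocale(LC_ALL, $previous);
    }

    return [
        'locale'  => $chosen === false ? null : $chosen,
        'codeset' => $codeset,
    ];
}

print_r(probeLocale(['de_DE.UTF-8', 'de_DE.utf8', 'de_DE']));
```

If 'locale' comes back as null, none of the candidates is installed. If 'codeset' is something like ISO-8859-1, the strings from strftime would need converting before they are used on a UTF-8 page.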

setlocale() changes process-wide state, so mucking about with it can have unintended side effects. I would suggest IntlDateFormatter instead:

$d = new DateTime('2021-03-15');
$mon_formatter = new IntlDateFormatter('de_DE', IntlDateFormatter::NONE, IntlDateFormatter::NONE);
$mon_formatter->setPattern('MMMM');

$day_formatter = new IntlDateFormatter('de_DE', IntlDateFormatter::NONE, IntlDateFormatter::NONE);
$day_formatter->setPattern('EEEE');

var_dump(
    $mon_formatter->format($d),
    $day_formatter->format($d)
);

Output

string(5) "März"
string(6) "Montag"

Or, more likely, format the entire date using a pattern you define.
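As a rough sketch of how this could replace the class from the question, the subclass below wraps IntlDateFormatter. It assumes the intl extension is installed; the class name LocalizedDateTime, the method formatLocalized, and the pattern 'EEEE, d. MMMM y' are illustrative choices, not something from the original posts.

```php
<?php
// Sketch only: requires the intl extension. Names are illustrative.
class LocalizedDateTime extends DateTime
{
    // Format this date with an ICU pattern in the given locale.
    public function formatLocalized(string $pattern, string $locale = 'de_DE'): string
    {
        $formatter = new IntlDateFormatter(
            $locale,
            IntlDateFormatter::NONE,
            IntlDateFormatter::NONE,
            $this->getTimezone() ?: null // use the object's own time zone
        );
        $formatter->setPattern($pattern);

        $result = $formatter->format($this);
        return $result === false ? 'unknown' : $result;
    }

    // Full weekday name, e.g. "Dienstag".
    public function weekdayName(string $locale = 'de_DE'): string
    {
        return $this->formatLocalized('EEEE', $locale);
    }

    // Full month name, e.g. "März".
    public function monthName(string $locale = 'de_DE'): string
    {
        return $this->formatLocalized('MMMM', $locale);
    }
}

$date = new LocalizedDateTime('2021-03-02');
echo $date->weekdayName(), "\n";                      // Dienstag
echo $date->monthName(), "\n";                        // März
echo $date->formatLocalized('EEEE, d. MMMM y'), "\n"; // Dienstag, 2. März 2021
```

Because ICU always returns UTF-8 strings to PHP, no charset detection or utf8_encode call is needed, and the process-wide locale is never touched.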

User contributions licensed under: CC BY-SA