PHP Unicode to character conversion

I receive country names like this from a library: "\u00c3\u0096sterreich".

How do I convert this to Österreich?

Using PHP 7.3

Answer

This one is a lot trickier than it seems, but the code below appears to work.

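First we match each escape sequence with the standard regex for Unicode escapes, then pack the four hex digits into a binary string, convert that from UCS-2BE to UTF-8, and finally decode the result with utf8_decode. The last step is needed because the escapes hold the individual UTF-8 bytes of the name (0xC3 0x96 is the UTF-8 encoding of Ö) rather than real code points. I cannot promise this is the best way to do this, but it appears to work correctly as far as I can tell.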

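$str = '\u00c3\u0096sterreich';

// Match a literal backslash, "u", and four hex digits.
$str = preg_replace_callback('/\\\\u([0-9a-fA-F]{4})/', function ($match) {
    // Hex digits -> UCS-2BE bytes -> UTF-8 -> single ISO-8859-1 byte.
    return utf8_decode(mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE'));
}, $str);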
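
For readers who want to see the individual steps, here is a minimal sketch of the same pipeline split into named stages. It assumes the library's string contains literal backslash-u escapes (\u00c3\u0096sterreich) and that every escaped value is at most 0x00ff, since each one stands for a single byte. The function name decodeEscapedBytes is hypothetical, and the last step swaps utf8_decode for mb_convert_encoding, because utf8_decode is deprecated as of PHP 8.2. Treat it as an illustration, not a tested replacement.

```php
<?php
// Hypothetical helper for illustration; not part of the original answer.
// Requires the mbstring extension, like the original snippet.
function decodeEscapedBytes(string $input): string
{
    // Match a literal backslash, "u", and four hex digits.
    return preg_replace_callback('/\\\\u([0-9a-fA-F]{4})/', function (array $match): string {
        // Step 1: turn the four hex digits into two raw bytes (one UCS-2BE code unit).
        $ucs2 = pack('H*', $match[1]);
        // Step 2: convert that code unit to UTF-8.
        $utf8 = mb_convert_encoding($ucs2, 'UTF-8', 'UCS-2BE');
        // Step 3: map the code point back to a single ISO-8859-1 byte
        // (assumption: the code point is <= 0xFF).
        return mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
    }, $input);
}

$decoded = decodeEscapedBytes('\u00c3\u0096sterreich');

// The first two bytes are now the UTF-8 encoding of Ö.
echo bin2hex(substr($decoded, 0, 2)), PHP_EOL; // prints "c396"
echo $decoded, PHP_EOL;                        // prints "Österreich" on a UTF-8 terminal
```

Text without escape sequences passes through unchanged, so the helper can be applied to every country name the library returns.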

User contributions licensed under: CC BY-SA