I am getting some strange characters from an LDAP server when I search for user info. If a value contains Turkish characters such as ‘ç’, they are replaced with ‘�’. In this situation I convert the string to UTF-8 and then use str_replace to fix it. My function is:
function utf8char($str) {
    $search = array('Ý', 'ý', 'þ', 'Þ', 'ð', 'Ð');
    $replace = array('İ', 'ı', 'ş', 'Ş', 'ğ', 'Ğ');
    return str_replace($search, $replace, $str);
}
But sometimes that causes problems, so I need to detect whether the string contains the ‘�’ character before fixing it. strpos does not work. Can anyone say something about this? And what exactly is this ‘�’ character? I would be happy if anyone could explain.
Edit: Here is my code snippet:
$name = $ldapHandler->get_user_info('username')['name'];
echo $name;
echo utf8_decode($name);
echo mb_convert_encoding($name, 'utf-8');
echo utf8char(mb_convert_encoding($name, 'utf-8'));
And the output of this code:
Bilgi ��lem Daire Ba�kanl���
Bilgi ?lem Daire Ba?kanl??
Bilgi Ýþlem Daire Baþkanlýðý
Bilgi İşlem Daire Başkanlığı (this is the correct string)
Answer
It has been a long time, but I decided to share my solution for anyone who faces the same problem.
This function worked for me:
function repair($value) {
    // Drop any bytes that are not valid UTF-8; @ suppresses the iconv notice.
    $res = @iconv("UTF-8", "UTF-8//IGNORE", $value);
    // A length mismatch means iconv dropped bytes (or failed), so the input was not UTF-8.
    if (strlen($value) != strlen($res)) {
        return w1250_to_utf8($value);
    }
    return $res;
}

function w1250_to_utf8($text) {
    // map based on:
    // http://konfiguracja.c0.pl/iso02vscp1250en.html
    // http://konfiguracja.c0.pl/webpl/index_en.html#examp
    // http://www.htmlentities.com/html/entities/
    // Values are written as HTML entities so they pass through the byte-wise
    // conversion untouched and are decoded by html_entity_decode() at the end.
    $map = array(
        chr(0x8A) => chr(0xA9),
        chr(0x8C) => chr(0xA6),
        chr(0x8D) => chr(0xAB),
        chr(0x8E) => chr(0xAE),
        chr(0x8F) => chr(0xAC),
        chr(0x9C) => chr(0xB6),
        chr(0x9D) => chr(0xBB),
        chr(0xA1) => chr(0xB7),
        chr(0xA5) => chr(0xA1),
        chr(0xBC) => chr(0xA5),
        chr(0x9F) => chr(0xBC),
        chr(0xB9) => chr(0xB1),
        chr(0x9A) => chr(0xB9),
        chr(0xBE) => chr(0xB5),
        chr(0x9E) => chr(0xBE),
        chr(0x80) => '&euro;',
        chr(0x82) => '&sbquo;',
        chr(0x84) => '&bdquo;',
        chr(0x85) => '&hellip;',
        chr(0x86) => '&dagger;',
        chr(0x87) => '&Dagger;',
        chr(0x89) => '&permil;',
        chr(0x8B) => '&lsaquo;',
        chr(0x91) => '&lsquo;',
        chr(0x92) => '&rsquo;',
        chr(0x93) => '&ldquo;',
        chr(0x94) => '&rdquo;',
        chr(0x95) => '&bull;',
        chr(0x96) => '&ndash;',
        chr(0x97) => '&mdash;',
        chr(0x99) => '&trade;',
        chr(0x9B) => '&rsquo;',
        chr(0xA6) => '&brvbar;',
        chr(0xA9) => '&copy;',
        chr(0xAB) => '&laquo;',
        chr(0xAE) => '&reg;',
        chr(0xB1) => '&plusmn;',
        chr(0xB5) => '&micro;',
        chr(0xB6) => '&para;',
        chr(0xB7) => '&middot;',
        chr(0xBB) => '&raquo;',
    );
    // Latin-1 letters that stand in for the Turkish letters at the same byte values.
    $search = array('Ý', 'ý', 'þ', 'Þ', 'ð', 'Ð');
    $replace = array('İ', 'ı', 'ş', 'Ş', 'ğ', 'Ğ');
    // The source encoding is passed explicitly instead of changing mb_internal_encoding() globally.
    $utf8 = mb_convert_encoding(strtr($text, $map), 'UTF-8', 'ISO-8859-1');
    return str_replace($search, $replace, html_entity_decode($utf8, ENT_QUOTES, 'UTF-8'));
}
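On the original question of what ‘�’ is: it is U+FFFD, the replacement character that a browser or terminal draws when it meets bytes that are not valid UTF-8. It is normally not present in the string itself, which is why strpos cannot find it. The pairs in the $search/$replace arrays above are exactly the six positions where ISO-8859-9 (Latin-5, Turkish) differs from ISO-8859-1, so a shorter route may be to test for valid UTF-8 and otherwise convert from ISO-8859-9 directly. The following is only a sketch under that assumption: the LDAP server's real encoding is a guess (it could also be Windows-1254), and to_utf8() is a hypothetical helper name, not part of any library.

```php
<?php
// Assumption: any value that is not valid UTF-8 is ISO-8859-9 (Latin-5, Turkish).
// to_utf8() is a hypothetical helper name used only for this example.
function to_utf8(string $value, string $fallback = 'ISO-8859-9'): string
{
    // mb_check_encoding() returns true when $value is already well-formed UTF-8.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    // Otherwise reinterpret the raw bytes in the fallback encoding.
    return mb_convert_encoding($value, 'UTF-8', $fallback);
}

// "Bilgi İşlem Daire Başkanlığı" as ISO-8859-9 bytes (İ=DD, ş=FE, ı=FD, ğ=F0).
$raw = "Bilgi \xDD\xFElem Daire Ba\xFEkanl\xFD\xF0\xFD";
echo to_utf8($raw), PHP_EOL; // prints: Bilgi İşlem Daire Başkanlığı
```

The check is a heuristic: a short Latin-5 string could happen to be valid UTF-8 and would then be returned unchanged, so if the server's encoding is known and fixed, converting unconditionally is the safer choice.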