Tag: utf-8

How to handle user input of invalid UTF-8 characters

I’m looking for a general strategy/advice on how to handle invalid UTF-8 input from users. Even though my web application uses UTF-8, somehow some users enter invalid characters. This causes errors in PHP’s json_encode() and overall seems like a bad idea to have around. W3C I18N FAQ: Multilingual …

How to iterate UTF-8 string in PHP?

php utf-8

How to iterate a UTF-8 string character by character using indexing? When you access a UTF-8 string with the bracket operator $str[0], a UTF-8-encoded character can span 2 or more elements. For example: but I would like to have: It is possible with mb_substr but this is extremely slow, i.e. Is there another …

Dealing with eacute and other special characters using Oracle, PHP and Oci8

character-encoding oci8 oracle php utf-8

Hi, I am trying to store names into an Oracle database and fetch them back using PHP and oci8. However, if I insert the é directly into the Oracle database and use oci8 to fetch it back, I just receive an e. Do I have to encode all special characters (including é) into HTML entities (i.e. &eacute;) before i…

Detect file encoding in PHP

character-encoding php utf-8

I have a script which combines a number of files into one, and it breaks when one of the files has UTF-8 encoding. I figure that I should be using the utf8_decode() function when reading the files, but I don’t know how to tell which need decoding. My code is basically: Currently, at the start of a UTF-8 f…

Cyrillic characters in PHP’s json_encode

json php utf-8

I’m trying to encode an array of Cyrillic UTF-8 strings to a JSON string using PHP’s json_encode function. The sample code looks like this: