
Tag: utf-8

How to handle user input of invalid UTF-8 characters

I’m looking for a general strategy/advice on how to handle invalid UTF-8 input from users. Even though my web application uses UTF-8, somehow some users enter invalid characters. This causes errors in PHP’s json_encode() and overall seems like a bad idea to have around. W3C I18N FAQ: Multilingual Forms says “If non-UTF-8 data is received, an error message should be
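
One possible approach to the question above, sketched under assumptions: validate incoming strings with mb_check_encoding() and either reject them or scrub them before they reach json_encode(). The helper names is_valid_utf8() and scrub_utf8() are hypothetical, and mb_scrub() assumes PHP 7.2 or later with the mbstring extension loaded.

```php
<?php
// Hypothetical helper: true when $input is well-formed UTF-8.
function is_valid_utf8(string $input): bool
{
    return mb_check_encoding($input, 'UTF-8');
}

// Hypothetical helper: replaces malformed byte sequences with
// mbstring's substitute character so the result is valid UTF-8.
// Assumes PHP >= 7.2 (mb_scrub) and the mbstring extension.
function scrub_utf8(string $input): string
{
    return mb_scrub($input, 'UTF-8');
}
```

Whether to reject or scrub is a design choice: rejecting with an error message matches the W3C advice quoted in the question, while scrubbing keeps the request going at the cost of silently altering the data.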

How to iterate UTF-8 string in PHP?

How can I iterate over a UTF-8 string character by character using indexing? When you access a UTF-8 string with the bracket operator, $str[0], each multi-byte character spans 2 or more elements. For example: but I would like to have: This is possible with mb_substr, but that is extremely slow, i.e. Is there another way to iterate the string character by
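
As a rough sketch of one alternative to repeated mb_substr() calls, the string can be split into characters once with preg_split() and the /u modifier, then indexed like a normal array. The function name utf8_chars() is hypothetical, and the code assumes the input is valid UTF-8 (preg_split() returns false otherwise, which is turned into an exception here).

```php
<?php
// Hypothetical helper: splits a UTF-8 string into an array of
// characters in a single pass, so each one can be reached by index.
function utf8_chars(string $str): array
{
    // The empty pattern with /u matches between every code point;
    // PREG_SPLIT_NO_EMPTY drops the empty leading/trailing pieces.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $chars;
}
```

With this, `utf8_chars($str)[0]` yields the first character rather than the first byte. On PHP 7.4 or later, mb_str_split() is a built-in that serves the same purpose.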

Detect file encoding in PHP

I have a script which combines a number of files into one, and it breaks when one of the files has UTF8 encoding. I figure that I should be using the utf8_decode() function when reading the files, but I don’t know how to tell which need decoding. My code is basically: Currently, at the start of a UTF8 file, it
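
A minimal sketch of the "which files need decoding" check, assuming the goal is ISO-8859-1 output as utf8_decode() would produce: test whether the bytes are well-formed UTF-8 and convert only then. The function name to_latin1_if_utf8() is hypothetical. This is a heuristic, since some ISO-8859-1 byte sequences also happen to be valid UTF-8. mb_convert_encoding() is used because utf8_decode() is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper: converts $bytes to ISO-8859-1 only when they
// form valid UTF-8; anything else is assumed to be ISO-8859-1 already
// and is returned untouched.
function to_latin1_if_utf8(string $bytes): string
{
    // Pure ASCII passes this check too; converting it is a no-op.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return mb_convert_encoding($bytes, 'ISO-8859-1', 'UTF-8');
    }
    return $bytes;
}
```

In a combining script, this could be applied to the result of file_get_contents() for each file before appending it to the output.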
