Cut an Arabic string

I have a string in Arabic like:

JavaScript

Now I need to cut this string and output it like:

JavaScript

I tried this function:

JavaScript

The problem is that sometimes it displays a symbol like this at the end of the string:

JavaScript

Why does this happen?

Answer

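The symbol appears because substr() counts bytes, not characters. In UTF-8 each Arabic letter takes two bytes, so the cut can land in the middle of a letter and leave an invalid byte sequence, which is displayed as a replacement symbol.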
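
To make the byte-level cut visible, here is a minimal sketch. The sample string and the variable names `$text` and `$broken` are assumptions chosen for illustration, not the asker's original code; `mb_check_encoding()` is used only to show whether the result is still valid UTF-8.

```php
<?php
// Assumed sample text; each Arabic letter here occupies 2 bytes in UTF-8.
$text = 'على احمد يوسف';

// Byte offset 3 falls in the middle of the second letter.
$broken = substr($text, 0, 3);

var_dump(mb_check_encoding($broken, 'UTF-8'));              // bool(false)

// Byte offset 4 happens to fall on a character boundary.
var_dump(mb_check_encoding(substr($text, 0, 4), 'UTF-8')); // bool(true)
```

Whether a byte-based cut breaks the string therefore depends on where the offset lands, which is why the symbol only shows up sometimes.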

To handle Arabic strings, use the Multibyte String functions, such as mb_strlen() and mb_substr(), which count characters instead of bytes.
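
A character-safe cut could look like the sketch below. The helper name `truncate_chars`, its parameters, and the "..." suffix are hypothetical; the asker's original function is not shown here, so this is one possible shape rather than a fix of that code. The encoding is passed explicitly so the result does not depend on the global setting.

```php
<?php
// Hypothetical helper: keep at most $limit characters and append a suffix
// only when something was actually cut off.
function truncate_chars(string $text, int $limit, string $suffix = '...'): string
{
    // Count characters, not bytes.
    if (mb_strlen($text, 'UTF-8') <= $limit) {
        return $text;
    }

    // mb_substr() cuts on character boundaries, so no letter is split.
    return mb_substr($text, 0, $limit, 'UTF-8') . $suffix;
}

echo truncate_chars('على احمد يوسف', 8), "\n"; // على احمد...
```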

You also need to make sure the internal encoding for those functions is set to UTF-8. You can set this globally at the top of your script:

mb_internal_encoding('UTF-8');

Which leads to this:

  • strlen('على احمد يوسف') returns 24, the size in bytes (octets)
  • mb_strlen('على احمد يوسف') returns 13, the size in characters

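Note that mb_strlen('على احمد يوسف') would also return 24 if the internal encoding were still set to ISO-8859-1, which was the default in older PHP versions (before 5.6). In a single-byte encoding every byte counts as one character.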
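
The three counts above can be checked with a short script. This is a sketch that assumes the source file itself is saved as UTF-8; the variable name `$text` is illustrative. The second argument of mb_strlen() overrides the internal encoding for that one call.

```php
<?php
$text = 'على احمد يوسف';

// Make UTF-8 the default for all mb_* functions in this script.
mb_internal_encoding('UTF-8');

echo strlen($text), "\n";                   // 24 (bytes)
echo mb_strlen($text), "\n";                // 13 (characters)
echo mb_strlen($text, 'ISO-8859-1'), "\n";  // 24 (every byte treated as a character)
```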

User contributions licensed under: CC BY-SA