I want to encode normal characters to html-entities like
a => &#x61; A => &#x41; b => &#x62; B => &#x42;
but
echo htmlentities("a");
doesn’t work. It outputs the normal characters (a A b B) in the HTML source code instead of the HTML entities.
How can I convert them?
Answer
You can build a function for this fairly easily using mb_ord or IntlChar::ord, either of which will give you the numeric value for a Unicode Code Point.
You can then convert that to a hexadecimal string using base_convert, and add the ‘&#x’ and ‘;’ around it to give an HTML entity:
function make_entity(string $char) {
    $codePoint = mb_ord($char, 'UTF-8'); // or IntlChar::ord($char);
    $hex = base_convert($codePoint, 10, 16);
    return '&#x' . $hex . ';';
}

echo make_entity('a');
echo make_entity('€');
echo make_entity('ð');
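To make the intermediate values concrete, here is a minimal sketch of the two steps for a single character. It assumes the mbstring extension is loaded and the source file is saved as UTF-8; the dechex call is an alternative built-in swapped in for comparison, not part of the function above.

```php
<?php
// Step 1: numeric value of the code point (U+20AC for the euro sign)
$codePoint = mb_ord('€', 'UTF-8');

// Step 2: decimal to hexadecimal, as in make_entity()
$hex = base_convert((string) $codePoint, 10, 16);

// dechex() gives the same digits directly from an integer
$sameHex = dechex($codePoint);

echo $codePoint, "\n";          // 8364
echo $hex, "\n";                // 20ac
echo '&#x' . $hex . ';', "\n";  // the finished entity
```

Both conversions produce lowercase hex digits, which is fine because HTML numeric character references are case-insensitive in their hex part.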
You then need to run that for each code point in your UTF-8 string. It is not enough to loop over the string using something like substr, because PHP’s string functions work with individual bytes, and each UTF-8 code point may be multiple bytes.
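The byte-versus-code-point difference can be seen with a short sketch. It assumes the source file is saved as UTF-8 and the mbstring extension is loaded; mb_str_split is only available from PHP 7.4.

```php
<?php
$text = 'a€ð';

// strlen() counts bytes: 1 for 'a', 3 for '€', 2 for 'ð'
$byteCount = strlen($text);

// mb_strlen() counts code points
$codePointCount = mb_strlen($text, 'UTF-8');

// substr() slices bytes, so this is only the first byte of '€',
// which is not a valid UTF-8 character on its own
$firstByteOfEuro = substr($text, 1, 1);

// mb_str_split() splits on code point boundaries instead
$codePoints = mb_str_split($text, 1, 'UTF-8');

echo $byteCount, ' bytes, ', $codePointCount, " code points\n";
echo bin2hex($firstByteOfEuro), "\n";
echo implode(' | ', $codePoints), "\n";
```

Looping over the result of mb_str_split would be another way to visit each code point, as an alternative to the regular expression approach.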
One approach would be to use a regular expression replacement with a pattern of /./u:
- The . matches each single “character”
- The /u modifier turns on Unicode mode, so that each “character” matched by the . is a whole code point
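As a quick check of what the /u modifier changes, the sketch below counts the matches with and without it. It assumes the source file is saved as UTF-8; the variable names are only illustrative.

```php
<?php
$text = 'a€ð';

// With /u, the subject is treated as UTF-8 and . matches one code point at a time
$withUnicode = preg_match_all('/./u', $text, $unicodeMatches);

// Without /u, . matches one byte at a time
$withoutUnicode = preg_match_all('/./', $text, $byteMatches);

echo $withUnicode, "\n";     // number of code points matched
echo $withoutUnicode, "\n";  // number of bytes matched
```

Note that . does not match a newline unless the /s modifier is also given, so with /./u any newlines in the input would be left unconverted.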
You can then run the above make_entity function for each match (i.e. each code point) with preg_replace_callback.
Since preg_replace_callback will pass your callback an array of matches, not just a string, you can make an arrow function which takes the array and passes element 0 to the real function:
$callback = fn($matches) => make_entity($matches[0]);
So putting it together, you have this:
echo preg_replace_callback('/./u', fn($m) => make_entity($m[0]), 'a€ð');
Arrow functions were introduced in PHP 7.4, so if you’re stuck on an older version, you can write the same thing as a regular anonymous function:
echo preg_replace_callback('/./u', function($m) { return make_entity($m[0]); }, 'a€ð');
Or of course, just a regular named function (or a method on a class or object; see the “callable” page in the manual for the different syntax options):
function make_entity_from_array_item(array $matches) {
    return make_entity($matches[0]);
}

echo preg_replace_callback('/./u', 'make_entity_from_array_item', 'a€ð');
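For reference, here is one way the pieces might be combined into a single script, with a round-trip check using html_entity_decode. This is a sketch under some assumptions: the wrapper name encode_all_entities is hypothetical, the /s modifier and the null check are additions not present in the snippets above, and mbstring must be loaded.

```php
<?php
// Same helper as in the answer, with a return type added
function make_entity(string $char): string
{
    $codePoint = mb_ord($char, 'UTF-8');
    return '&#x' . base_convert((string) $codePoint, 10, 16) . ';';
}

// Hypothetical wrapper: converts every code point in $text to a hex entity
function encode_all_entities(string $text): string
{
    // /s is added so that newlines are converted too
    $result = preg_replace_callback(
        '/./us',
        function (array $m) { return make_entity($m[0]); },
        $text
    );

    // preg_replace_callback returns null on error, e.g. invalid UTF-8 input
    if ($result === null) {
        throw new InvalidArgumentException('Input could not be processed as UTF-8');
    }

    return $result;
}

$encoded = encode_all_entities('a€ð');
echo $encoded, "\n"; // &#x61;&#x20ac;&#xf0;

// Decoding the entities should give back the original string
$decoded = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML5, 'UTF-8');
var_dump($decoded === 'a€ð');
```

The anonymous function form is used here so the sketch also runs on PHP versions before 7.4; an arrow function would work equally well on newer versions.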