Detect file encoding in PHP

I have a script which combines a number of files into one, and it breaks when one of the files has UTF8 encoding. I figure that I should be using the utf8_decode() function when reading the files, but I don’t know how to tell which need decoding.

My code is basically:

$output = '';
foreach ($files as $filename) {
    $output .= file_get_contents($filename) . "\n";
}
file_put_contents('combined.txt', $output);

Currently, at the start of a UTF8 file, it adds these characters in the output: ï»¿

Answer

Try using the mb_detect_encoding function. This function will examine your string and attempt to “guess” what its encoding is. You can then convert it as desired. As brulak suggested, however, you’re probably better off converting to UTF-8 rather than from, to preserve the data you’re transmitting.
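
As a rough sketch of that approach (not the asker's actual code), the helper below strips a leading UTF-8 byte order mark, leaves valid UTF-8 untouched, and otherwise lets mb_detect_encoding guess the source encoding before converting to UTF-8. The names toUtf8WithoutBom and combineFiles, and the default candidate list of ISO-8859-1, are assumptions made for illustration; adjust the candidates to whatever encodings your files might really use.

```php
<?php
// Hypothetical helper (not from the original post): return $contents as UTF-8 without a BOM.
// $legacyEncodings is an assumed list of encodings the non-UTF-8 files might be in.
function toUtf8WithoutBom(string $contents, array $legacyEncodings = ['ISO-8859-1']): string
{
    // Drop a leading UTF-8 byte order mark (bytes EF BB BF); left in place, it shows up
    // as stray characters in the middle of the combined output.
    if (strncmp($contents, "\xEF\xBB\xBF", 3) === 0) {
        $contents = (string) substr($contents, 3);
    }

    // Already valid UTF-8 (plain ASCII included): nothing to convert.
    if (mb_check_encoding($contents, 'UTF-8')) {
        return $contents;
    }

    // Otherwise let mb_detect_encoding guess among the assumed candidates (strict mode).
    $encoding = mb_detect_encoding($contents, $legacyEncodings, true);
    if ($encoding === false) {
        // No candidate matched; fall back to the first one rather than failing.
        $encoding = $legacyEncodings[0];
    }

    return mb_convert_encoding($contents, 'UTF-8', $encoding);
}

// Hypothetical wrapper around the loop from the question.
function combineFiles(array $files, string $target): void
{
    $output = '';
    foreach ($files as $filename) {
        $contents = file_get_contents($filename);
        if ($contents === false) {
            continue; // Skip unreadable files in this sketch.
        }
        $output .= toUtf8WithoutBom($contents) . "\n";
    }
    file_put_contents($target, $output);
}
```

Calling combineFiles($files, 'combined.txt') would then replace the loop in the question. Keep in mind that detection is only a guess: single-byte encodings such as ISO-8859-1 accept almost any byte sequence, so if you know the real source encoding, passing it explicitly is more reliable than detecting it.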

User contributions licensed under: CC BY-SA