I am creating a file using PHP's fwrite() and I know all my data is in UTF-8 (I have tested this extensively: when saving the data to the DB and outputting it on a normal web page, everything works fine and reports as UTF-8), but I am being told the file I am outputting contains non-UTF-8 data 🙁 Is there a command in bash (CentOS) to check the encoding of a file?
When using vim it shows the content as:
Donâ~@~Yt do anything …. Itâ~@~Ys a great site with everything….Weâ~@~Yve only just launched/
Any help would be appreciated: either confirming that the file is UTF-8, or showing how to write UTF-8 content to a file.
UPDATE
To clarify how I know my data is UTF-8, I have done the following:
- The DB is set to utf8.
- When saving data to the database I run this first:
$enc = mb_detect_encoding($data);
$data = mb_convert_encoding($data, "UTF-8", $enc);
- Just before I run fwrite() I check the data with the following. Note that each piece of data returns 'IS utf-8':
if (strlen($data)==mb_strlen($data, 'UTF-8')) print 'NOT UTF-8'; else print 'IS utf-8';
Thanks!
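A note on the check above: comparing strlen() with mb_strlen() only shows that the string contains multibyte sequences, not that every byte sequence is valid UTF-8. A minimal sketch of a stricter check, assuming the mbstring extension is loaded and PHP 7 or later; the helper name describe_encoding() is hypothetical and not part of the original code:

```php
<?php
// Hypothetical helper (not from the original post): classify a string
// by whether its bytes form valid UTF-8.
function describe_encoding(string $data): string
{
    // mb_check_encoding() validates the byte sequence itself.
    if (!mb_check_encoding($data, 'UTF-8')) {
        return 'invalid UTF-8';
    }
    // Valid UTF-8 whose byte length equals its character length is plain ASCII.
    return strlen($data) === mb_strlen($data, 'UTF-8')
        ? 'ASCII only'
        : 'valid UTF-8 with multibyte characters';
}

echo describe_encoding("Don\u{2019}t do anything"), PHP_EOL; // right single quote, U+2019
echo describe_encoding("plain text"), PHP_EOL;
echo describe_encoding("caf\xE9"), PHP_EOL;                  // lone Latin-1 byte

// U+2019 is encoded in UTF-8 as the three bytes E2 80 99.
echo bin2hex("\u{2019}"), PHP_EOL;
```

The bytes E2 80 99 are consistent with the vim output shown earlier: an editor that reads the file as Latin-1 shows the first byte as "â" and the two following control bytes as ~@ and ~Y, which suggests the data itself may be UTF-8 and only the display encoding is wrong.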
Answer
The only thing I had to do was add a UTF-8 BOM to the CSV. The data was correct, but the file reader (an external application) couldn't read the file properly without the BOM.
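A minimal sketch of writing the BOM with fwrite() before the CSV rows. The file path and the sample rows are hypothetical, chosen only for illustration; the BOM itself is the fixed three-byte sequence EF BB BF:

```php
<?php
// Hypothetical path and rows, for illustration only.
$path = sys_get_temp_dir() . '/export_with_bom.csv';
$rows = [
    ['name', 'comment'],
    ['Site', "We\u{2019}ve only just launched"],
];

$handle = fopen($path, 'wb');
if ($handle === false) {
    throw new RuntimeException("Cannot open $path for writing");
}

// UTF-8 BOM: bytes EF BB BF, written once at the very start of the file.
fwrite($handle, "\xEF\xBB\xBF");

foreach ($rows as $row) {
    // Delimiter, enclosure and escape are passed explicitly rather than
    // relying on defaults.
    fputcsv($handle, $row, ',', '"', '\\');
}
fclose($handle);

// Read back the first three bytes to confirm the BOM is present.
$head = file_get_contents($path, false, null, 0, 3);
echo bin2hex($head), PHP_EOL; // efbbbf
```

The BOM must be written exactly once, before any other output. Writing it again when appending to an existing file would leave stray bytes in the middle of the data.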