fgetcsv and character encoding

There’s a lot of info on the net about solving problems parsing CSVs: using iconv, utf8_encode and utf8_decode, using fgets instead of fgetcsv and parsing your file as a string, even some regex stuff for finding and replacing weird characters.

What worked for me wasn’t any of those things. I am migrating content from a Joomla website (which used a modified version of TinyMCE when storing certain fields) to a Drupal one, and I kept getting stray characters appearing next to some spaces, and other problem characters around quotes and inverted commas.

All I had to do was open the CSV with the exported content from the Joomla tables in Notepad++ and convert it to ANSI encoding (in the menu under ‘Encoding’ I clicked ‘Convert to ANSI’). Let’s just hope that non-UTF-8 encoding isn’t going to come back and bite me in the bum later.
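
If you would rather script that step than click through an editor, here is a minimal sketch of roughly the same conversion in PHP. It assumes the export is UTF-8 and that ‘ANSI’ means Windows-1252 (which is what Notepad++ uses on a Western European Windows install). The helper name read_csv_as_ansi is made up for illustration, and the sketch is untested against the Joomla export described here.

```php
<?php
// Hypothetical helper: re-encode a UTF-8 CSV as Windows-1252 ("ANSI")
// in memory, then parse the converted bytes with fgetcsv.
function read_csv_as_ansi(string $path): array
{
    $utf8 = file_get_contents($path);
    if ($utf8 === false) {
        throw new RuntimeException("Cannot read $path");
    }

    // Drop a UTF-8 byte order mark if the export has one.
    if (substr($utf8, 0, 3) === "\xEF\xBB\xBF") {
        $utf8 = substr($utf8, 3);
    }

    // Assumption: the source really is UTF-8. iconv returns false if it
    // meets a character that has no Windows-1252 equivalent.
    $ansi = iconv('UTF-8', 'Windows-1252', $utf8);
    if ($ansi === false) {
        throw new RuntimeException("Cannot convert $path to Windows-1252");
    }

    // Put the converted bytes in a temporary stream so fgetcsv can read them.
    $handle = fopen('php://temp', 'r+');
    fwrite($handle, $ansi);
    rewind($handle);

    $rows = [];
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $rows[] = $row;
    }
    fclose($handle);

    return $rows;
}
```

Calling read_csv_as_ansi('export.csv') (the file name is just a placeholder) would give back one array per CSV line, with every field as single-byte Windows-1252 text. Whatever consumes those rows then has to expect that encoding rather than UTF-8, which is the same caveat as with the manual conversion.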
