<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Correcting Corrupted Characters</title>
	<atom:link href="http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/feed/" rel="self" type="application/rss+xml" />
	<link>http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/</link>
	<description>Things that Eric A. Meyer, CSS expert, writes about on his personal Web site; it&#039;s largely Web standards and Web technology, but also various bits of culture, politics, personal observations, and other miscellaneous stuff</description>
	<lastBuildDate>Tue, 18 Jun 2013 15:30:40 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
	<item>
		<title>By: Find, Search, Replace, and Delete In A WordPress Database - WordCast</title>
		<link>http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-518310</link>
		<dc:creator>Find, Search, Replace, and Delete In A WordPress Database - WordCast</dc:creator>
		<pubDate>Sun, 28 Nov 2010 22:58:40 +0000</pubDate>
		<guid isPermaLink="false">http://meyerweb.com/eric/thoughts/?p=1214#comment-518310</guid>
		<description><![CDATA[[...] Eric&#8217;s Archived Thoughts: Correcting Corrupted Characters in WordPress [...]]]></description>
		<content:encoded><![CDATA[<p>[...] Eric&#8217;s Archived Thoughts: Correcting Corrupted Characters in WordPress [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Eran Galperin</title>
		<link>http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-491729</link>
		<dc:creator>Eran Galperin</dc:creator>
		<pubDate>Thu, 04 Feb 2010 17:54:15 +0000</pubDate>
		<guid isPermaLink="false">http://meyerweb.com/eric/thoughts/?p=1214#comment-491729</guid>
		<description><![CDATA[I&#039;m not sure if this is still relevant, but since I didn&#039;t see any mention of this in the other comments, and going by the character set details you posted, the issue is probably in the connection character set / collation.

It&#039;s a common issue that MySQL selects an inappropriate connection collation, regardless of the headers in the HTTP request (those are irrelevant, since it is the PHP script that connects to the database). You can either force the connection to UTF-8 in the MySQL configuration, or issue two queries on every connection that set it to UTF-8.

Those would be:
SET CHARACTER SET UTF8;
SET NAMES UTF8;

You can read about those in the MySQL docs:
http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html]]></description>
		<content:encoded><![CDATA[<p>I&#8217;m not sure if this is still relevant, but since I didn&#8217;t see any mention of this in the other comments, and going by the character set details you posted, the issue is probably in the connection character set / collation.</p>
<p>It&#8217;s a common issue that MySQL selects an inappropriate connection collation, regardless of the headers in the HTTP request (those are irrelevant, since it is the PHP script that connects to the database). You can either force the connection to UTF-8 in the MySQL configuration, or issue two queries on every connection that set it to UTF-8.</p>
<p>Those would be:<br />
SET CHARACTER SET UTF8;<br />
SET NAMES UTF8;</p>
<p>You can read about those in the MySQL docs:<br />
<a href="http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html" rel="nofollow">http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Johan Sand</title>
		<link>http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-489372</link>
		<dc:creator>Johan Sand</dc:creator>
		<pubDate>Thu, 07 Jan 2010 22:34:19 +0000</pubDate>
		<guid isPermaLink="false">http://meyerweb.com/eric/thoughts/?p=1214#comment-489372</guid>
		<description><![CDATA[and for Friday Fun - if you want to update the entire database including all potentially affected records in all relevant fields in all tables, then this would be a crazy kenobi option.

This time only amend db host, user, pass and name.

Again, upload to site and run through firefox as is.

ps. make sure there&#039;s enough execution time for php to wrap it up.

&lt;code&gt;
&lt;!DOCTYPE html PUBLIC &quot;-//W3C//DTD XHTML 1.0 Strict//EN&quot; &quot;http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd&quot;&gt;
&lt;html xmlns=&quot;http://www.w3.org/1999/xhtml&quot; xml:lang=&quot;en&quot; lang=&quot;en&quot;&gt;
&lt;head&gt;
	&lt;meta http-equiv=&quot;Content-Type&quot; content=&quot;text/html;charset=utf-8&quot; /&gt;
	&lt;meta name=&quot;uid&quot; content=&quot;10&quot; /&gt;
&lt;/head&gt;
&lt;body&gt;

&lt;?php

$db_host = &#039;host&#039;;
$db_user = &#039;user&#039;;
$db_pass = &#039;pass&#039;;
$db_name = &#039;name&#039;;

$DB = new mysqli($db_host, $db_user, $db_pass, $db_name);

$field_types = array(&#039;varchar&#039;,&#039;text&#039;,&#039;tinytext&#039;,&#039;longtext&#039;);

if ($res_tables = $DB-&gt;query (&quot;SHOW TABLES&quot;)) {
    while ($tables = $res_tables-&gt;fetch_array(MYSQLI_NUM) ) {

        if ($res_fields = $DB-&gt;query (&quot;SHOW COLUMNS FROM &quot;.$tables[0])) {

            if ($res_key = $DB-&gt;query (&quot;SHOW COLUMNS FROM &quot;.$tables[0].&quot; WHERE `Key` LIKE &#039;PRI&#039;&quot;)) {
                $key = $res_key-&gt;fetch_assoc();
                $unique_key = $key[&#039;Field&#039;];
            }

            while ($fields = $res_fields-&gt;fetch_array(MYSQLI_ASSOC) ) {
                if (in_array(preg_replace(&#039;/\(.*$/&#039;, &#039;&#039;, $fields[&#039;Type&#039;]), $field_types)) {

                    $DB-&gt;query(&quot;SET NAMES latin1&quot;);

                    if ($res = $DB-&gt;query (&quot;SELECT &quot;.$unique_key.&quot;, &quot;.$fields[&#039;Field&#039;].&quot; FROM &quot;.$tables[0].&quot; WHERE 1=1&quot;)) {
                        while ($data = $res-&gt;fetch_object() ) {

                            $DB-&gt;query(&quot;SET NAMES utf8;&quot;);

                            $unique_field = $data-&gt;$unique_key;
                            $fix_field = bin2hex($data-&gt;{$fields[&#039;Field&#039;]});

                            $result = $DB-&gt;query (&quot;
                                UPDATE &quot;.$tables[0].&quot;
                                SET &quot;.$fields[&#039;Field&#039;].&quot; = UNHEX(&#039;&quot;.$DB-&gt;real_escape_string($fix_field).&quot;&#039;)
                                WHERE &quot;.$unique_key.&quot; = &#039;&quot;.$unique_field.&quot;&#039;
                            &quot;);

                            unset($unique_field);
                            unset($fix_field);

                        }
                    }

                }
            }

        }
        unset($key);
        unset($unique_key);

    }
}

?&gt;
&lt;/code&gt;]]></description>
		<content:encoded><![CDATA[<p>and for Friday Fun &#8211; if you want to update the entire database including all potentially affected records in all relevant fields in all tables, then this would be a crazy kenobi option.</p>
<p>This time only amend db host, user, pass and name.</p>
<p>Again, upload to site and run through firefox as is.</p>
<p>ps. make sure there&#8217;s enough execution time for php to wrap it up.</p>
<p><code><br />
&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;<br />
&lt;html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"&gt;<br />
&lt;head&gt;<br />
	&lt;meta http-equiv="Content-Type" content="text/html;charset=utf-8" /&gt;<br />
	&lt;meta name="uid" content="10" /&gt;<br />
&lt;/head&gt;<br />
&lt;body&gt;</p>
<p>&lt;?php</p>
<p>$db_host = 'host';<br />
$db_user = 'user';<br />
$db_pass = 'pass';<br />
$db_name = 'name';</p>
<p>$DB = new mysqli($db_host, $db_user, $db_pass, $db_name);</p>
<p>$field_types = array('varchar','text','tinytext','longtext');</p>
<p>if ($res_tables = $DB-&gt;query ("SHOW TABLES")) {<br />
    while ($tables = $res_tables-&gt;fetch_array(MYSQLI_NUM) ) {</p>
<p>        if ($res_fields = $DB-&gt;query ("SHOW COLUMNS FROM ".$tables[0])) {</p>
<p>            if ($res_key = $DB-&gt;query ("SHOW COLUMNS FROM ".$tables[0]." WHERE `Key` LIKE 'PRI'")) {<br />
                $key = $res_key-&gt;fetch_assoc();<br />
                $unique_key = $key['Field'];<br />
            }</p>
<p>            while ($fields = $res_fields-&gt;fetch_array(MYSQLI_ASSOC) ) {<br />
                if (in_array(preg_replace('/\(.*$/', '', $fields['Type']), $field_types)) {</p>
<p>                    $DB-&gt;query("SET NAMES latin1");</p>
<p>                    if ($res = $DB-&gt;query ("SELECT ".$unique_key.", ".$fields['Field']." FROM ".$tables[0]." WHERE 1=1")) {<br />
                        while ($data = $res-&gt;fetch_object() ) {</p>
<p>                            $DB-&gt;query("SET NAMES utf8;");</p>
<p>                            $unique_field = $data-&gt;$unique_key;<br />
                            $fix_field = bin2hex($data-&gt;{$fields['Field']});</p>
<p>                            $result = $DB-&gt;query ("<br />
                                UPDATE ".$tables[0]."<br />
                                SET ".$fields['Field']." = UNHEX('".$DB-&gt;real_escape_string($fix_field)."')<br />
                                WHERE ".$unique_key." = '".$unique_field."'<br />
                            ");</p>
<p>                            unset($unique_field);<br />
                            unset($fix_field);</p>
<p>                        }<br />
                    }</p>
<p>                }<br />
            }</p>
<p>        }<br />
        unset($key);<br />
        unset($unique_key);</p>
<p>    }<br />
}</p>
<p>?&gt;<br />
</code></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Johan Sand</title>
		<link>http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-489324</link>
		<dc:creator>Johan Sand</dc:creator>
		<pubDate>Thu, 07 Jan 2010 08:11:30 +0000</pubDate>
		<guid isPermaLink="false">http://meyerweb.com/eric/thoughts/?p=1214#comment-489324</guid>
		<description><![CDATA[To make a very long story short - this is what you need to do:

- Amend DB specific entries (host, user, pass, db, fields and table).
- Don&#039;t fiddle with the rest of the code.
- Upload to browsable part of your web server/site.
- Call the &quot;page&quot; from firefox.
- Wait (and don&#039;t reload) until complete.

If you need any of the code explained, feel free to drop me an email.

hth, cheers.
/j.

ps. the code tag strips brackets, which is a bit annoying (converted to lt&#124;gt)...

&lt;code&gt;
&lt;!DOCTYPE html PUBLIC &quot;-//W3C//DTD XHTML 1.0 Strict//EN&quot; &quot;http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd&quot;&gt;
&lt;html xmlns=&quot;http://www.w3.org/1999/xhtml&quot; xml:lang=&quot;en&quot; lang=&quot;en&quot;&gt;
&lt;head&gt;
	&lt;meta http-equiv=&quot;Content-Type&quot; content=&quot;text/html;charset=utf-8&quot; /&gt;
	&lt;meta name=&quot;uid&quot; content=&quot;10&quot; /&gt;
&lt;/head&gt;
&lt;body&gt;

&lt;?php

$DB = new mysqli(&#039;host&#039;,&#039;user&#039;,&#039;pass&#039;,&#039;db&#039;);
$DB-&gt;query(&quot;SET NAMES latin1&quot;);

if ($res = $DB-&gt;query (&quot;SELECT unique_field, fix_field_1, fix_field_2, fix_field_3, fix_field_4 FROM fix_table WHERE 1=1&quot;)) {

  echo &#039;rows: &#039;.$res-&gt;num_rows;
  $cnt = 0;

  while ($data = $res-&gt;fetch_object() ) {

    $DB-&gt;query(&quot;SET NAMES utf8;&quot;);

    $unique_field = $data-&gt;unique_field;
    $fix_field_1 = bin2hex($data-&gt;fix_field_1);
    $fix_field_2 = bin2hex($data-&gt;fix_field_2);
    $fix_field_3 = bin2hex($data-&gt;fix_field_3);
    $fix_field_4 = bin2hex($data-&gt;fix_field_4);

    $result = $DB-&gt;query (&quot;
        UPDATE fix_table
        SET
            fix_field_1 = UNHEX(&#039;&quot;.$DB-&gt;real_escape_string($fix_field_1).&quot;&#039;),
            fix_field_2 = UNHEX(&#039;&quot;.$DB-&gt;real_escape_string($fix_field_2).&quot;&#039;),
            fix_field_3 = UNHEX(&#039;&quot;.$DB-&gt;real_escape_string($fix_field_3).&quot;&#039;),
            fix_field_4 = UNHEX(&#039;&quot;.$DB-&gt;real_escape_string($fix_field_4).&quot;&#039;)
            WHERE unique_field = &#039;&quot;.$unique_field.&quot;&#039;&quot;);

    echo $cnt.&quot; - &quot;.$unique_field.&quot;&lt;br /&gt;&quot;;

    unset($unique_field);
    unset($fix_field_1);
    unset($fix_field_2);
    unset($fix_field_3);
    unset($fix_field_4);

    echo $DB-&gt;error;

    $cnt++;
    }
}

?&gt;
&lt;/code&gt;]]></description>
		<content:encoded><![CDATA[<p>To make a very long story short &#8211; this is what you need to do:</p>
<p>- Amend DB specific entries (host, user, pass, db, fields and table).<br />
- Don&#8217;t fiddle with the rest of the code.<br />
- Upload to browsable part of your web server/site.<br />
- Call the &#8220;page&#8221; from firefox.<br />
- Wait (and don&#8217;t reload) until complete.</p>
<p>If you need any of the code explained, feel free to drop me an email.</p>
<p>hth, cheers.<br />
/j.</p>
<p>ps. the code tag strips brackets, which is a bit annoying (converted to lt|gt)&#8230;</p>
<p><code><br />
&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;<br />
&lt;html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"&gt;<br />
&lt;head&gt;<br />
	&lt;meta http-equiv="Content-Type" content="text/html;charset=utf-8" /&gt;<br />
	&lt;meta name="uid" content="10" /&gt;<br />
&lt;/head&gt;<br />
&lt;body&gt;</p>
<p>&lt;?php</p>
<p>$DB = new mysqli('host','user','pass','db');<br />
$DB-&gt;query("SET NAMES latin1");</p>
<p>if ($res = $DB-&gt;query ("SELECT unique_field, fix_field_1, fix_field_2, fix_field_3, fix_field_4 FROM fix_table WHERE 1=1")) {</p>
<p>  echo 'rows: '.$res-&gt;num_rows;<br />
  $cnt = 0;</p>
<p>  while ($data = $res-&gt;fetch_object() ) {</p>
<p>    $DB-&gt;query("SET NAMES utf8;");</p>
<p>    $unique_field = $data-&gt;unique_field;<br />
    $fix_field_1 = bin2hex($data-&gt;fix_field_1);<br />
    $fix_field_2 = bin2hex($data-&gt;fix_field_2);<br />
    $fix_field_3 = bin2hex($data-&gt;fix_field_3);<br />
    $fix_field_4 = bin2hex($data-&gt;fix_field_4);</p>
<p>    $result = $DB-&gt;query ("<br />
        UPDATE fix_table<br />
        SET<br />
            fix_field_1 = UNHEX('".$DB-&gt;real_escape_string($fix_field_1)."'),<br />
            fix_field_2 = UNHEX('".$DB-&gt;real_escape_string($fix_field_2)."'),<br />
            fix_field_3 = UNHEX('".$DB-&gt;real_escape_string($fix_field_3)."'),<br />
            fix_field_4 = UNHEX('".$DB-&gt;real_escape_string($fix_field_4)."')<br />
            WHERE unique_field = '".$unique_field."'");</p>
<p>    echo $cnt." - ".$unique_field."&lt;br /&gt;";</p>
<p>    unset($unique_field);<br />
    unset($fix_field_1);<br />
    unset($fix_field_2);<br />
    unset($fix_field_3);<br />
    unset($fix_field_4);</p>
<p>    echo $DB-&gt;error;</p>
<p>    $cnt++;<br />
    }<br />
}</p>
<p>?&gt;<br />
</code></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Josue Rodriguez</title>
		<link>http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-489287</link>
		<dc:creator>Josue Rodriguez</dc:creator>
		<pubDate>Wed, 06 Jan 2010 23:29:07 +0000</pubDate>
		<guid isPermaLink="false">http://meyerweb.com/eric/thoughts/?p=1214#comment-489287</guid>
		<description><![CDATA[This powerful but simple perl script is marvelous to convert your MySQL database charsets to UTF8 quick and easy. I use it every time.

&lt;a href=&quot;http://www.pablowe.net/convert_charset&quot; rel=&quot;nofollow&quot;&gt;http://www.pablowe.net/convert_charset&lt;/a&gt;]]></description>
		<content:encoded><![CDATA[<p>This powerful but simple perl script is marvelous to convert your MySQL database charsets to UTF8 quick and easy. I use it every time.</p>
<p><a href="http://www.pablowe.net/convert_charset" rel="nofollow">http://www.pablowe.net/convert_charset</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Andreas Lagerkvist</title>
		<link>http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-488804</link>
		<dc:creator>Andreas Lagerkvist</dc:creator>
		<pubDate>Sat, 02 Jan 2010 11:31:18 +0000</pubDate>
		<guid isPermaLink="false">http://meyerweb.com/eric/thoughts/?p=1214#comment-488804</guid>
		<description><![CDATA[I&#039;m not sure if this helps, and I know some people already pointed some of it out, but I recently converted my DB to UTF-8 and this is what I did:

1. mysqldump the whole thing to a file
2. Add a special character (like &quot;Ö&quot;) to said file that looks good in the editor
3. Open the file with Firefox and check which encoding is used when the &quot;Ö&quot; looks ok (to find out exactly what encoding the file is)
4. Run iconv on the file to actually convert it to UTF-8 (from whatever encoding Firefox said it was)
5. Manually convert bad characters to good ones (and change potential encoding=latin1-settings in the sql-file to utf8)
6. Create new database where everything is UTF-8
7. Import the new, clean, utf8 SQL

That worked for me at least, and I&#039;ve had problems with encodings for as long as I can remember.
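
Step 4 in the list above (the iconv run) can also be sketched in Python for anyone without iconv at hand. This is only an illustration: the function name, the file names and the sample row are made up, and it assumes the real encoding of the dump is already known from step 3.

```python
import os
import tempfile


def reencode_dump(src_path, dst_path, source_encoding):
    """Rewrite a text file as UTF-8, decoding it from source_encoding first."""
    # newline="" keeps the line endings exactly as they are in the dump.
    with open(src_path, "r", encoding=source_encoding, newline="") as src:
        with open(dst_path, "w", encoding="utf-8", newline="") as dst:
            for line in src:
                dst.write(line)


# Tiny self-check with a throwaway file standing in for the real dump.
workdir = tempfile.mkdtemp()
latin1_dump = os.path.join(workdir, "dump-latin1.sql")
utf8_dump = os.path.join(workdir, "dump-utf8.sql")

with open(latin1_dump, "wb") as f:
    f.write(b"INSERT INTO t VALUES ('\xd6');\n")  # 0xD6 is "Ö" in latin1

reencode_dump(latin1_dump, utf8_dump, "latin-1")

with open(utf8_dump, "rb") as f:
    converted = f.read()  # "Ö" is now the two bytes C3 96
```

If the wrong source encoding is passed, Python either raises UnicodeDecodeError (for example when decoding as UTF-8) or silently produces the wrong characters, which is the same trap as picking the wrong -f for iconv.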

I think one important bit I didn&#039;t see in the comments (although it may have been mentioned) is to not only convert the characters but also convert the actual file (which I used iconv for).]]></description>
		<content:encoded><![CDATA[<p>I&#8217;m not sure if this helps, and I know some people already pointed some of it out, but I recently converted my DB to UTF-8 and this is what I did:</p>
<p>1. mysqldump the whole thing to a file<br />
2. Add a special character (like &#8220;Ö&#8221;) to said file that looks good in the editor<br />
3. Open the file with Firefox and check which encoding is used when the &#8220;Ö&#8221; looks ok (to find out exactly what encoding the file is)<br />
4. Run iconv on the file to actually convert it to UTF-8 (from whatever encoding Firefox said it was)<br />
5. Manually convert bad characters to good ones (and change potential encoding=latin1-settings in the sql-file to utf8)<br />
6. Create new database where everything is UTF-8<br />
7. Import the new, clean, utf8 SQL</p>
<p>That worked for me at least, and I&#8217;ve had problems with encodings for as long as I can remember.</p>
<p>I think one important bit I didn&#8217;t see in the comments (although it may have been mentioned) is to not only convert the characters but also convert the actual file (which I used iconv for).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Emil Björklund</title>
		<link>http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-486762</link>
		<dc:creator>Emil Björklund</dc:creator>
		<pubDate>Sat, 12 Dec 2009 10:01:09 +0000</pubDate>
		<guid isPermaLink="false">http://meyerweb.com/eric/thoughts/?p=1214#comment-486762</guid>
		<description><![CDATA[After reading this article + comment thread, I&#039;ve decided that the easiest solution to these pesky character encoding problems is if I just change my name.

I was thinking maybe Emil Borkedchar?]]></description>
		<content:encoded><![CDATA[<p>After reading this article + comment thread, I&#8217;ve decided that the easiest solution to these pesky character encoding problems is if I just change my name.</p>
<p>I was thinking maybe Emil Borkedchar?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mike D.</title>
		<link>http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-486262</link>
		<dc:creator>Mike D.</dc:creator>
		<pubDate>Mon, 07 Dec 2009 02:15:30 +0000</pubDate>
		<guid isPermaLink="false">http://meyerweb.com/eric/thoughts/?p=1214#comment-486262</guid>
		<description><![CDATA[&quot;This whole post (and the comments) in my opinion demonstrates quite well why you should not trust your data to a database.&quot;

Funny.

Seriously though, you&#039;re probably already going to do this but please post a follow-up post with an overview of the problem and the eventual solution, when you find it. Going through all of these comments makes me feel like a total N00000000B. This has happened to me in WordPress a couple of times and each time I&#039;ve just done manual search-and-replace for the characters I know about.]]></description>
		<content:encoded><![CDATA[<p>&#8220;This whole post (and the comments) in my opinion demonstrates quite well why you should not trust your data to a database.&#8221;</p>
<p>Funny.</p>
<p>Seriously though, you&#8217;re probably already going to do this but please post a follow-up post with an overview of the problem and the eventual solution, when you find it. Going through all of these comments makes me feel like a total N00000000B. This has happened to me in WordPress a couple of times and each time I&#8217;ve just done manual search-and-replace for the characters I know about.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Aeron Glemann</title>
		<link>http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-485985</link>
		<dc:creator>Aeron Glemann</dc:creator>
		<pubDate>Fri, 04 Dec 2009 13:32:09 +0000</pubDate>
		<guid isPermaLink="false">http://meyerweb.com/eric/thoughts/?p=1214#comment-485985</guid>
		<description><![CDATA[I&#039;ve had to deal with this a bunch of times.... what I do - and it&#039;s always worked for me - is 1st do a dump. Then - assuming you&#039;re on Mac or Linux - run from the commandline:

iconv -f latin1 -t utf8 myDump.sql &gt; myDumpUTF8.sql

Reimport....]]></description>
		<content:encoded><![CDATA[<p>I&#8217;ve had to deal with this a bunch of times&#8230;. what I do &#8211; and it&#8217;s always worked for me &#8211; is 1st do a dump. Then &#8211; assuming you&#8217;re on Mac or Linux &#8211; run from the commandline:</p>
<p>iconv -f latin1 -t utf8 myDump.sql &gt; myDumpUTF8.sql</p>
<p>Reimport&#8230;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt Sharkey</title>
		<link>http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-485743</link>
		<dc:creator>Matt Sharkey</dc:creator>
		<pubDate>Wed, 02 Dec 2009 23:31:17 +0000</pubDate>
		<guid isPermaLink="false">http://meyerweb.com/eric/thoughts/?p=1214#comment-485743</guid>
		<description><![CDATA[Finally solved this problem for myself, using the method described in this post:

http://tlug.dnho.net/?q=node/276

Yes, it&#039;s another MySQL dump &amp; import procedure. Haven&#039;t checked for truncated content, but so far all my em &amp; en dashes look good.]]></description>
		<content:encoded><![CDATA[<p>Finally solved this problem for myself, using the method described in this post:</p>
<p><a href="http://tlug.dnho.net/?q=node/276" rel="nofollow">http://tlug.dnho.net/?q=node/276</a></p>
<p>Yes, it&#8217;s another MySQL dump &amp; import procedure. Haven&#8217;t checked for truncated content, but so far all my em &amp; en dashes look good.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Wade Kwon</title>
		<link>http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-485281</link>
		<dc:creator>Wade Kwon</dc:creator>
		<pubDate>Sat, 28 Nov 2009 23:55:55 +0000</pubDate>
		<guid isPermaLink="false">http://meyerweb.com/eric/thoughts/?p=1214#comment-485281</guid>
		<description><![CDATA[Eric: Just quickly commenting to say that I&#039;m having the exact same problem of late, and will read through the comments and any updates from you on a workable solution. Tired of doing find/replace.]]></description>
		<content:encoded><![CDATA[<p>Eric: Just quickly commenting to say that I&#8217;m having the exact same problem of late, and will read through the comments and any updates from you on a workable solution. Tired of doing find/replace.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ash Searle</title>
		<link>http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-484814</link>
		<dc:creator>Ash Searle</dc:creator>
		<pubDate>Tue, 24 Nov 2009 17:08:46 +0000</pubDate>
		<guid isPermaLink="false">http://meyerweb.com/eric/thoughts/?p=1214#comment-484814</guid>
		<description><![CDATA[@Eric,

I had to do this last year and blogged about it at the time.  I remember an early draft including instructions &quot;open vim and...&quot; - I quickly realised as soon as you get an editor involved you&#039;re fooked.  Fortunately, MySQL has a command-line tool for doing search-and-replace so you don&#039;t have to worry about editor settings or other random phenomena.  (the instructions are in my  &lt;a href=&quot;http://hexmen.com/blog/2008/07/mysql-latin1-utf8-wordpress-upgrade/&quot; rel=&quot;nofollow&quot;&gt;latin1 to utf8 conversion&lt;/a&gt; post)

BTW.  Using Safari 4.0.4 (the latest) on OS X, the encoding in this article looks fine, but the comments are screwed up.  Forcing the text-encoding to ISO Latin 1 fixes the comments, but borks the names of the commenters (e.g. Tantek Çelik)  I don&#039;t know how far you think you&#039;ve got fixing the issues, but it looks like there&#039;s some way to go...  (Note: using the web inspector / firebug  you can check document.characterSet for the displayed character-set - which is handy when you&#039;re checking you&#039;ve overridden the text-encoding via browser menus.)]]></description>
		<content:encoded><![CDATA[<p>@Eric,</p>
<p>I had to do this last year and blogged about it at the time.  I remember an early draft including instructions &#8220;open vim and&#8230;&#8221; &#8211; I quickly realised as soon as you get an editor involved you&#8217;re fooked.  Fortunately, MySQL has a command-line tool for doing search-and-replace so you don&#8217;t have to worry about editor settings or other random phenomena.  (the instructions are in my  <a href="http://hexmen.com/blog/2008/07/mysql-latin1-utf8-wordpress-upgrade/" rel="nofollow">latin1 to utf8 conversion</a> post)</p>
<p>BTW.  Using Safari 4.0.4 (the latest) on OS X, the encoding in this article looks fine, but the comments are screwed up.  Forcing the text-encoding to ISO Latin 1 fixes the comments, but borks the names of the commenters (e.g. Tantek Çelik)  I don&#8217;t know how far you think you&#8217;ve got fixing the issues, but it looks like there&#8217;s some way to go&#8230;  (Note: using the web inspector / firebug  you can check document.characterSet for the displayed character-set &#8211; which is handy when you&#8217;re checking you&#8217;ve overridden the text-encoding via browser menus.)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeroen Pulles</title>
		<link>http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-484616</link>
		<dc:creator>Jeroen Pulles</dc:creator>
		<pubDate>Sun, 22 Nov 2009 16:49:07 +0000</pubDate>
		<guid isPermaLink="false">http://meyerweb.com/eric/thoughts/?p=1214#comment-484616</guid>
		<description><![CDATA[I had the same or similar problem last year, with a client, where my Wordpress data got encoded to UTF-8 twice. I rolled &lt;a href=&quot;http://www.redslider.net/2009/creode/creode.py.html&quot; rel=&quot;nofollow&quot;&gt;my own script&lt;/a&gt; to &quot;double decode&quot; the binary mess in my SQL dump file back to some sane text with the script that is linked above. Perhaps that can be of any help, if you&#039;re the scripting kind of person.]]></description>
		<content:encoded><![CDATA[<p>I had the same or similar problem last year, with a client, where my WordPress data got encoded to UTF-8 twice. I rolled <a href="http://www.redslider.net/2009/creode/creode.py.html" rel="nofollow">my own script</a> to &#8220;double decode&#8221; the binary mess in my SQL dump file back to some sane text with the script that is linked above. Perhaps that can be of any help, if you&#8217;re the scripting kind of person.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kim Sullivan</title>
		<link>http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-484613</link>
		<dc:creator>Kim Sullivan</dc:creator>
		<pubDate>Sun, 22 Nov 2009 15:10:36 +0000</pubDate>
		<guid isPermaLink="false">http://meyerweb.com/eric/thoughts/?p=1214#comment-484613</guid>
		<description><![CDATA[comments: tl;dr, just that I&#039;ve had exactly zero success with dumping the database (via phpMyAdmin) and reimporting.

I&#039;ve run across this many times. The problem is that MySQL is encoding aware, and once you get bogus data in the database, no amount of recoding between different encodings or setting &quot;set names&quot; will help (in fact, the worst thing you can do is try to repair it by simply setting the correct encoding for your tables - if the target encoding doesn&#039;t have a glyph, &lt;em&gt;it gets irreversibly replaced by ?&lt;/em&gt;).
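
The same kind of loss is easy to reproduce outside MySQL. A small Python sketch, purely as an analogy: the sample word is made up, and the errors="replace" handler only stands in for the substitution described above, it is not how MySQL is implemented.

```python
text = "Žižka"  # Ž and ž exist in CP1250 but not in latin1

# A codec that has both letters round-trips the text without loss.
cp1250_bytes = text.encode("cp1250")
round_trip = cp1250_bytes.decode("cp1250")

# Forcing the text into latin1 swaps each missing letter for "?".
lossy_bytes = text.encode("latin-1", errors="replace")
recovered = lossy_bytes.decode("latin-1")
```

After the lossy step only literal question marks are stored, so nothing can tell which letters they used to be.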

There&#039;s a simple workaround that worked for me many times.

First, change the database column type from TEXT (or varchar) to BLOB (or VARBINARY). This makes MySQL &quot;forget&quot; about any encoding it thinks the data is in, and prevents any recoding of the data that goes on behind the scenes (in my case, the data was often encoded in CP1250 or UTF-8, but column encoding was set to LATIN1).

Then you have to find out what encoding the data is in, and reset the column type to text/varchar/char with the encoding that matches the physical encoding of the data (in my case often CP1250). Once the physical encoding in the database matches the &quot;logical&quot; encoding of the columns, it&#039;s possible to simply change the encoding of the columns (to UTF-8), and with the correct SET NAMES, you can have your webpage output anything you want (from UTF-8 to LATIN1).

When the data is double encoded (it originally was in UTF-8, it got reimported into the database as latin1 and then the encoding of the columns changed to UTF-8), you first have to set the encoding of the columns back to what it was when it was imported - this changes doubly encoded UTF-8 to physically singly encoded UTF-8 that the database thinks is in LATIN1 (for example), and then you go the route from TEXT (latin1) -&gt; BLOB -&gt; TEXT(UTF-8).
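
What double encoding does to the bytes, and why running it backwards repairs them, can be sketched in a few lines of Python. This is an illustration only, not the MySQL procedure itself: the sample name is taken from elsewhere in this thread, and Python's latin-1 codec is assumed as a stand-in for MySQL's latin1, which is closer to Windows-1252 for bytes 0x80 to 0x9F.

```python
text = "Çelik"

# Correctly encoded once: Ç becomes the two bytes C3 87.
utf8_once = text.encode("utf-8")

# The mistake: those bytes are read as latin1 and converted to UTF-8 again.
double_encoded = utf8_once.decode("latin-1").encode("utf-8")

# The repair runs the mistake backwards: UTF-8 to latin1 gives back the
# original UTF-8 bytes, which are then decoded as what they really are.
repaired = double_encoded.decode("utf-8").encode("latin-1").decode("utf-8")
```

Here double_encoded starts with the four bytes C3 83 C2 87 where a single Ç should be, and repaired is the original text again.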

I think I have seen some scripts that try to do this automatically (by being really smart and getting information from the data dictionary), but for smaller scale databases such as wordpress, doing everything manually might be more tedious, but I think it&#039;s safer.

A few short points:
1. It is vital to get the physical encoding to match the encoding that is set in the table column type (I&#039;m not sure if search and replace will help because it works on already encoded data)
2. The encoding that is set in the HTML pages only determines what encoding the browser sends to PHP
3. PHP doesn&#039;t know (or, unfortunately, care) what encoding you get from the browser. GIGO.
4. MySQL cares about the encoding of the data from the browser (and what encoding it sends back). Use the &quot;SET NAMES&quot; SQL command to tell the database this information (AFAIK, WP does this).
5. The database performs a lot of conversion behind the scenes - if the database thinks you send it data in latin1, but you have tables columns in UTF-8, it WILL do a conversion from &lt;em&gt;latin1 to utf-8&lt;/em&gt;, even if the data already was in UTF-8 (or worse, cp1250).
6. Changing the encoding of a column from one encoding to another performs physical recoding of the data, so you have to roundtrip it via BLOB or BINARY.
7. Once you try to convert two incompatible encodings, MySQL will insert a question mark (physically) for every character it can&#039;t convert (happens for example when changing between CP1250 and LATIN1, or importing UTF-8 data as UTF-8 data in table columns that have their encoding set to LATIN1).]]></description>
		<content:encoded><![CDATA[<p>comments: tl;dr, just that I&#8217;ve had exactly zero success with dumping the database (via phpMyAdmin) and reimporting.</p>
<p>I&#8217;ve run across this many times. The problem is that MySQL is encoding-aware, and once you get bogus data in the database, no amount of recoding between different encodings or setting &#8220;set names&#8221; will help (in fact, the worst thing you can do is try to repair it by simply setting the correct encoding for your tables &#8211; if the target encoding doesn&#8217;t have a matching character, <em>it gets irreversibly replaced by ?</em>).</p>
<p>There&#8217;s a simple workaround that worked for me many times.</p>
<p>First, change the database column type from TEXT (or varchar) to BLOB (or VARBINARY). This makes MySQL &#8220;forget&#8221; about any encoding it thinks the data is in, and prevents any recoding of the data that goes on behind the scenes (in my case, the data was often encoded in CP1250 or UTF-8, but column encoding was set to LATIN1).</p>
<p>Then you have to find out what encoding the data is in, and reset the column type to text/varchar/char with the encoding that matches the physical encoding of the data (in my case often CP1250). Once the physical encoding in the database matches the &#8220;logical&#8221; encoding of the columns, it&#8217;s possible to simply change the encoding of the columns (to UTF-8), and with the correct SET NAMES, you can have your webpage output anything you want (from UTF-8 to LATIN1).</p>
<p>When the data is double encoded (it originally was in UTF-8, it got reimported into the database as latin1, and then the encoding of the columns was changed to UTF-8), you first have to set the encoding of the columns back to what it was when the data was imported &#8211; this changes doubly encoded UTF-8 to physically singly encoded UTF-8 that the database thinks is in LATIN1 (for example). Then you go the route from TEXT (latin1) -&gt; BLOB -&gt; TEXT (UTF-8).</p>
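<p>To make the double-encoding case concrete, here is a minimal Python sketch. It is only an illustration under assumptions: Python&#8217;s codecs stand in for MySQL&#8217;s conversions, the sample word is made up, and Python&#8217;s <code>latin-1</code> is ISO-8859-1, whereas MySQL&#8217;s latin1 is closer to cp1252, so a few byte values behave differently on a real server.</p>

```python
# Illustration only: Python codecs stand in for MySQL's conversions.
original = "caf\u00e9"  # made-up sample text containing one non-ASCII letter

# The data starts out as correct UTF-8.
utf8_bytes = original.encode("utf-8")

# Import: the UTF-8 bytes are wrongly treated as latin1 text.
misread = utf8_bytes.decode("latin-1")

# Column changed to UTF-8: every misread character is recoded (double encoding).
double_encoded = misread.encode("utf-8")

# Repair step 1: set the column back to latin1, which recodes to single-encoded bytes.
single_encoded = double_encoded.decode("utf-8").encode("latin-1")

# Repair step 2: TEXT (latin1) to BLOB to TEXT (UTF-8) relabels the bytes, no recoding.
repaired = single_encoded.decode("utf-8")

print(repaired == original)  # True
```

<p>The second repair step corresponds to the BLOB round trip: the bytes stay the same and only the declared encoding changes.</p>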
<p>I think I have seen some scripts that try to do this automatically (by being really smart and getting information from the data dictionary). For smaller-scale databases such as WordPress, doing everything manually might be more tedious, but I think it&#8217;s safer.</p>
<p>A few short points:<br />
1. It is vital to get the physical encoding to match the encoding that is set in the table column type (I&#8217;m not sure if search and replace will help, because it works on already encoded data).<br />
2. The encoding that is set in the HTML pages only determines what encoding the browser sends to PHP.<br />
3. PHP doesn&#8217;t know (or, unfortunately, care) what encoding you get from the browser. GIGO.<br />
4. MySQL cares about the encoding of the data it receives from the browser (and what encoding it sends back). Use the &#8220;SET NAMES&#8221; SQL command to tell the database this information (AFAIK, WP does this).<br />
5. The database performs a lot of conversion behind the scenes &#8211; if the database thinks you are sending it data in latin1, but you have table columns in UTF-8, it WILL do a conversion from <em>latin1 to utf-8</em>, even if the data already was in UTF-8 (or worse, cp1250).<br />
6. Changing the encoding of a column from one encoding to another performs physical recoding of the data, so you have to roundtrip it via BLOB or BINARY.<br />
7. When you convert between two incompatible encodings, MySQL will insert a question mark (physically) for every character it can&#8217;t convert (happens for example when changing between CP1250 and LATIN1, or importing UTF-8 data as UTF-8 data in table columns that have their encoding set to LATIN1).</p>
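<p>Point 7 can be reproduced outside the database. The following is a rough Python sketch, not MySQL itself: the <code>errors="replace"</code> handler is an assumption standing in for what the server does with characters it cannot convert, and the sample string is made up.</p>

```python
# Illustration only: Python's "replace" error handler stands in for MySQL's behaviour.
text = "\u0159e\u010d"  # made-up sample: two letters that cp1250 has and latin1 lacks

# cp1250 can represent every character, so this round trip is lossless.
cp1250_bytes = text.encode("cp1250")
round_trip = cp1250_bytes.decode("cp1250")

# latin1 has no slot for the two accented letters, so each becomes a literal "?".
lossy_bytes = text.encode("latin-1", errors="replace")

# Decoding again cannot bring the letters back; the question marks are real data now.
recovered = lossy_bytes.decode("latin-1")

print(round_trip == text, recovered)
```

<p>Once the question marks are stored, no later change of encoding can restore the original letters, which is why the conversion has to be right the first time.</p>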
]]></content:encoded>
	</item>
	<item>
		<title>By: Eric Meyer</title>
		<link>http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-484440</link>
		<dc:creator>Eric Meyer</dc:creator>
		<pubDate>Sat, 21 Nov 2009 02:44:44 +0000</pubDate>
		<guid isPermaLink="false">http://meyerweb.com/eric/thoughts/?p=1214#comment-484440</guid>
		<description><![CDATA[Okay, now that&#039;s five recommendations to dump and re-import after I already tried that and it didn&#039;t work.  Made things much, much worse, in fact.

&lt;a href=&quot;http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-484401&quot; rel=&quot;nofollow&quot;&gt;Tantek&lt;/a&gt;, I sort of agree with you, but there are things WP does for me that hand-rolling wouldn&#039;t provide.  Like comments, for example, which I am emphatically &lt;em&gt;not&lt;/em&gt; willing to outsource to a third-party cloud service; and which simply listing inbound links does not come close to replicating.  Perhaps there are solutions now that would do all this but not rely on a database, but I don&#039;t remember seeing any back in 2004.

&lt;a href=&quot;http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-484409&quot; rel=&quot;nofollow&quot;&gt;Jeff&lt;/a&gt;, I believe that if I freshly installed WP in 2009, it would set things up using UTF-8 and there&#039;d be no issue.  I installed it almost six years ago, though.  Things have advanced a bit since then.]]></description>
		<content:encoded><![CDATA[<p>Okay, now that&#8217;s five recommendations to dump and re-import after I already tried that and it didn&#8217;t work.  Made things much, much worse, in fact.</p>
<p><a href="http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-484401" rel="nofollow">Tantek</a>, I sort of agree with you, but there are things WP does for me that hand-rolling wouldn&#8217;t provide.  Like comments, for example, which I am emphatically <em>not</em> willing to outsource to a third-party cloud service; and which simply listing inbound links does not come close to replicating.  Perhaps there are solutions now that would do all this but not rely on a database, but I don&#8217;t remember seeing any back in 2004.</p>
<p><a href="http://meyerweb.com/eric/thoughts/2009/11/19/correcting-corrupted-characters/#comment-484409" rel="nofollow">Jeff</a>, I believe that if I freshly installed WP in 2009, it would set things up using UTF-8 and there&#8217;d be no issue.  I installed it almost six years ago, though.  Things have advanced a bit since then.</p>
]]></content:encoded>
	</item>
</channel>
</rss>