lists.zerezo.com
Re: convertion to utf-8
- Date: Thu, 3 Jul 2008 21:10:56 +0100
- From: Pooly <pooly7@xxxxxxxxx>
- Subject: Re: convertion to utf-8
2008/7/1 Dan Nelson <dnelson@xxxxxxxxxxxxxxx>:
> In the last episode (Jun 30), Pooly said:
>> 2008/6/30 Dan Nelson <dnelson@xxxxxxxxxxxxxxx>:
>> > In the last episode (Jun 29), Pooly said:
>> >> Hi,
>> >>
>> >> I'm trying to convert my tables to UTF8 but I'm getting the
>> >> following error: ERROR 1062 (23000): Duplicate entry 'Zorglüb' for
>> >> key 1
>> >>
>> >> Not too sure why I'm getting this error since the current (latin1)
>> >> data are:
>> >>
>> >> mysql> select * from topics_lookup where label like 'Zor%';
>> >> +----------+----------+------+
>> >> | label    | topic_id | main |
>> >> +----------+----------+------+
>> >> | Zorglub  |       72 |    0 |
>> >> | Zorglüb  |       72 |    1 |
>> >> +----------+----------+------+
>> >> 2 rows in set (0.00 sec)
>> >>
>> >> There is a unique index on label, but the two values are different.
>> >>
>> >> Any ideas ?
>> >
>> > I can't reproduce this. Can you provide example commands
>> > demonstrating your problem?
>>
>> Yes, sorry I should have been more precise in my email.
>>
>> mysql> select version();
>> +--------------------------+
>> | version()                |
>> +--------------------------+
>> | 5.0.32-Debian_7etch5-log |
>> +--------------------------+
>> 1 row in set (0.00 sec)
>>
>> create table mytable2 ( label varchar(200) primary key ) charset latin1;
>> insert into mytable2 values ('Zorglub'), ('Zorglüb');
>> alter table mytable2 convert to character set utf8 collate utf8_general_ci;
>>
>> this gives:
>> ERROR 1062 (23000): Duplicate entry 'Zorglüb' for key 1
>>
>> I tried to search the changelog and the bug tracking system, but
>> without much luck.
>
> MySQL's default collation is latin1_swedish_ci, which sorts ü along
> with y. utf8_general_ci sorts it along with u:
>
> http://www.collation-charts.org/mysql60/mysql604.latin1_swedish_ci.html
> http://www.collation-charts.org/mysql60/mysql604.utf8_general_ci.european.html
>
> More reading:
>
> http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-sets.html
>
> ... To further illustrate, the following equalities hold in both
> utf8_general_ci and utf8_unicode_ci (for the effect this has in
> comparisons or when doing searches, see Section 9.1.5.6, "Examples of
> the Effect of Collation"):
>
> Ä = A
> Ö = O
> Ü = U
>
Thanks for the links and the detailed explanation. The collation
behaviour is all clear now, and I know what to do with my data.
Cheers,
--
MySQL General Mailing List