Re: [Koha-devel] Investigations on Perl, MySQL & UTF-8
From: Henri-Damien LAURENT
Subject: Re: [Koha-devel] Investigations on Perl, MySQL & UTF-8
Date: Fri, 10 Mar 2006 14:05:10 +0100
User-agent: Thunderbird 1.5 (X11/20051201)
Pierrick LE GALL wrote:
> Hi koha-devel,
>
> Because the story of Perl, MySQL, UTF-8 and Koha is becoming more and
> more complicated, I've decided to start my tests outside of Koha or any
> web server. I wanted to check that Perl and MySQL could communicate
> with UTF-8 data.
>
> What I did:
>
> 1. copy some UTF-8 strings from
> http://www.columbia.edu/kermit/utf8-t1.html and paste them into a UTF-8
> text file utf8.txt (open/paste in a UTF-8 console, with Vim having :set
> encoding=utf-8)
>
> 2. create a UTF-8 database with a simple table having a TEXT field
>
> $ mysql --user=root --password=xxx
> mysql> CREATE DATABASE `utf8_test` CHARACTER SET utf8;
> mysql> connect utf8_test
> mysql> create table strings (id int, value text);
> mysql> quit
>
> (no need to set connection character set to utf-8 in that case, default
> latin1 is fine)
>
> Note: my MySQL server is latin1...
>
> $ mysql --user=root --password=xxx utf8_test
> mysql> status
> Server characterset: latin1
> Db characterset: utf8
> Client characterset: latin1
> Conn. characterset: latin1
> mysql> set names 'UTF8';
> mysql> status
> Server characterset: latin1
> Db characterset: utf8
> Client characterset: utf8
> Conn. characterset: utf8
>
> 3. write and execute a Perl script which reads the UTF-8 text file,
> inserts the UTF-8 strings into the database, retrieves them from the
> database, and prints them to STDOUT. See details in attached file
> readfile_insertdb.pl. Important note: "set names 'UTF8';" is mandatory.
>
> Everything is *working fine*. My output is in UTF-8, I'm 100% sure of
> it.
>
> DBD::mysql : 2.9007
> Perl : 5.8.7
> MySQL : 4.1.12-Debian_1ubuntu3.1-log
> DBI : 1.48
>
> (find your local versions with attached script versions.pl)
>
> I suspect that Paul's data stored in MySQL is not truly UTF-8. Maybe
> I'm missing the point, but it seems Perl, MySQL and UTF-8 do not work so
> badly together.
>
> Cheers,
>
>
WOW.
Indeed, French speakers can find an explanation of SET NAMES here:
http://doc.domainepublic.net/mysql/doc.mysql/charset-connection.html
and here is the English version:
http://dev.mysql.com/doc/refman/4.1/en/charset-connection.html
It is clear, once you know what to search for :)
I will test it myself.
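
To make the role of "set names" concrete, here is a minimal sketch in Python (standard library only, no MySQL involved) of what probably happens to UTF-8 bytes sent over a connection declared as latin1 versus utf8. It assumes the server decodes incoming bytes using the connection character set and re-encodes them into the utf8 column, which is how I read the charset-connection page above. The helper names `store` and `fetch` are hypothetical, and Python's "latin-1" codec only approximates MySQL's latin1.

```python
def store(client_bytes: bytes, connection_charset: str) -> bytes:
    """Simulate the server storing client bytes in a utf8 column.

    Assumption: the server interprets the bytes using the connection
    character set, then converts the text to the column's utf8.
    """
    text = client_bytes.decode(connection_charset)
    return text.encode("utf-8")


def fetch(stored_bytes: bytes, connection_charset: str) -> bytes:
    """Simulate the server sending a utf8 column value back to the client."""
    return stored_bytes.decode("utf-8").encode(connection_charset)


# The client always sends genuine UTF-8 bytes for "é".
original = "é".encode("utf-8")

# Connection left at the latin1 default: each UTF-8 byte is treated as a
# separate latin1 character, so the stored value is encoded twice.
stored_latin1 = store(original, "latin-1")

# Connection switched with set names 'UTF8': bytes are stored unchanged.
stored_utf8 = store(original, "utf-8")

# Reading back over the same latin1 connection undoes the double encoding,
# which is why the mistake can go unnoticed from a single client.
round_trip_latin1 = fetch(stored_latin1, "latin-1")

print(original, stored_latin1, stored_utf8, round_trip_latin1)
```

If this model is right, a table filled without "set names" would look fine from the script that wrote it, yet contain data that is not really UTF-8 for any client that does declare utf8. That would fit Pierrick's suspicion about Paul's data.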
--
Henri-Damien LAURENT