[Koha-devel] Investigations on Perl, MySQL & UTF-8

From: Pierrick LE GALL
Subject: [Koha-devel] Investigations on Perl, MySQL & UTF-8
Date: Fri, 10 Mar 2006 12:49:11 +0100

Hi koha-devel,

Because the story of Perl, MySQL, UTF-8 and Koha is becoming more and
more complicated, I've decided to start my tests outside of Koha or any
web server. I wanted to check that Perl and MySQL could communicate
with UTF-8 data.

What I did:

1. copy some UTF-8 strings from
http://www.columbia.edu/kermit/utf8-t1.html and paste them into a UTF-8
text file utf8.txt (open/paste in a UTF-8 console, with Vim having :set

2. create a UTF-8 database with a simple table having a TEXT field

$ mysql --user=root --password=xxx
mysql> CREATE DATABASE `utf8_test` CHARACTER SET utf8;
mysql> connect utf8_test
mysql> create table strings (id int, value text);
mysql> quit

(no need to set the connection character set to UTF-8 in that case; the
default latin1 is fine)

Note: my MySQL server is latin1...

$ mysql --user=root --password=xxx utf8_test
mysql> status
Server characterset:    latin1
Db     characterset:    utf8
Client characterset:    latin1
Conn.  characterset:    latin1
mysql> set names 'UTF8';
mysql> status
Server characterset:    latin1
Db     characterset:    utf8
Client characterset:    utf8
Conn.  characterset:    utf8

3. write and execute a Perl script which reads the UTF-8 text file,
inserts the UTF-8 strings into the database, retrieves them from the
database and prints them to STDOUT. See details in the attached file
readfile_insertdb.pl. Important note: "set names 'UTF8';" is mandatory,
because the connection defaults to latin1 (see the status output above).
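
Step 3 could look roughly like the following. This is only a minimal
sketch, not the attached readfile_insertdb.pl: the DSN, the host and the
helper name read_lines are assumptions, the credentials simply mirror the
mysql commands above, and the file path is taken from the command line.
The strings are handled as raw bytes from end to end, so Perl never
re-encodes them.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Assumed connection settings (hypothetical); adjust to your setup.
my $dsn      = 'DBI:mysql:database=utf8_test;host=localhost';
my $user     = 'root';
my $password = 'xxx';

# Read the file as raw bytes, one string per non-empty line.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my @lines = grep { length } map { chomp; $_ } <$fh>;
    close $fh;
    return @lines;
}

sub main {
    my ($path) = @_;
    my $dbh = DBI->connect($dsn, $user, $password, { RaiseError => 1 });

    # Tell the server that this connection sends and expects UTF-8.
    $dbh->do("SET NAMES 'utf8'");

    # Insert every line of the file into the table created in step 2.
    my $insert = $dbh->prepare('INSERT INTO strings (id, value) VALUES (?, ?)');
    my $id = 0;
    $insert->execute(++$id, $_) for read_lines($path);

    # Read the strings back and print the bytes unchanged to STDOUT.
    my $select = $dbh->prepare('SELECT id, value FROM strings ORDER BY id');
    $select->execute;
    while (my ($row_id, $value) = $select->fetchrow_array) {
        print "$row_id: $value\n";
    }
    $dbh->disconnect;
}

# Only touch the database when a file name is given.
main($ARGV[0]) if @ARGV;
```

Run it as "perl sketch.pl utf8.txt" (the script name is arbitrary) in a
UTF-8 console and compare the output with the original file.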

Everything is *working fine*. My output is in UTF-8, I'm 100% sure of

DBD::mysql : 2.9007
      Perl : 5.8.7
     MySQL : 4.1.12-Debian_1ubuntu3.1-log
       DBI : 1.48

(find your local versions with the attached script versions.pl)

I suspect that Paul's data stored in MySQL is not truly UTF-8. Maybe
I'm missing the point, but it seems Perl, MySQL and UTF-8 do not work so
badly together.
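
One way to test such a suspicion is to check whether the bytes coming out
of the database are well-formed UTF-8. The sketch below uses the core
Encode module; the function name is_valid_utf8 is hypothetical and this is
not part of the attached scripts. Note that pure ASCII and double-encoded
data also pass this check, so a "valid" result alone does not prove the
data is correct.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Return 1 if the byte string is well-formed UTF-8, 0 otherwise.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() with a CHECK value may modify its argument
    return eval { decode('UTF-8', $copy, FB_CROAK); 1 } ? 1 : 0;
}

# UTF-8 bytes for "café" versus the latin1 byte for "é"
print is_valid_utf8("caf\xc3\xa9") ? "valid\n" : "invalid\n";
print is_valid_utf8("caf\xe9")     ? "valid\n" : "invalid\n";
```

The first call prints "valid" and the second prints "invalid", since a
lone 0xE9 byte followed by nothing is not a complete UTF-8 sequence.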


Pierrick LE GALL
INEO media system

Attachment: readfile_insertdb.pl
Description: Perl program

Attachment: versions.pl
Description: Perl program
