Re: [Nuxeo-localizer] Non-MessageCatalog UTF-8 strings get encoded again


From: Juan David Ibáñez Palomar
Subject: Re: [Nuxeo-localizer] Non-MessageCatalog UTF-8 strings get encoded again
Date: Tue, 15 Oct 2002 12:55:53 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020913 Debian/1.1-1

Sean Treadway wrote:

Juan,

Thank you for the workaround; it works like a charm with one addition:
I also needed to encode the output of the Localizer.changeLanguageForm()
method to UTF-8.

What should we do to fix this?  I think it is a bug that Zope assumes
custom content to be encoded in Latin-1, but that has been the default
encoding for most of the strings up until now.  I see three solutions:

Convert all my UTF-8 strings to Unicode.  This seems like the 'correct'
solution because it removes any ambiguity, and I contain the knowledge
of which strings are UTF-8 (my custom objects) and which are not (Zope's
stock objects).


Sure, your application would be "better" in the end. But
I think Zope/Localizer should provide a smooth upgrade path.


Update Zope to convert all non-Unicode strings using a user-defined
encoding instead of Latin-1.


This is the good one.


Update the Localizer to return strings in a user-defined encoding
instead of Unicode.  Given that Localizer is one of the first products
to make heavy use of Unicode, this would be a good place to add a fix for
my application.  In the management pages, there is an option to select
the character set of the PO files.  It would be a logical place to
specify the character set of returned strings.

This would work for the message catalog, but not for local
content objects. Well, it could, but then the default
encoding would have to be defined somewhere else.

Anyway, it's not the right thing to do. The transformation
from Unicode to normal strings must be done when it's needed,
that is, when the response to the browser is built. The rest
of the time Unicode strings should be used as much as possible.
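
A minimal sketch of that principle in present-day Python, not Zope or
Localizer code: text stays Unicode everywhere and is encoded exactly once,
when the response body is built. The names render_page and
build_response_body are hypothetical, chosen only for this illustration.

```python
def render_page(title, body):
    """Assemble the page from Unicode text only; no byte strings appear here."""
    return "<h1>" + title + "</h1>\n<p>" + body + "</p>"


def build_response_body(text, charset="utf-8"):
    """Encode once, at the point where the response leaves the application."""
    return text.encode(charset)


page = render_page("Ibáñez", "Ingénieur Logiciel")  # still Unicode
payload = build_response_body(page)                 # bytes for the browser
```

Because the charset is chosen in a single place, no other part of the
application has to know or guess which encoding a string is in.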


I will file a bug against Zope within a couple of days.  In the meantime
the workaround for my site is operational so I would not update
Localizer.  I think Localizer is doing the 'right thing' returning
Unicode strings and any change would be counter-productive in the long
run.  This behavior and the workaround were excellently described by you and
should be in Localizer's documentation.


Yes, a bug report in the collector would be a really good
thing.

And I will add this to the docs if it's not solved.


Thanks again for a quick and accurate response!

-Sean

-----Original Message-----
From: address@hidden [mailto:nuxeo-address@hidden On Behalf Of Juan David Ibáñez Palomar
Sent: Tuesday, October 15, 2002 10:49 AM
To: address@hidden
Subject: Re: [Nuxeo-localizer] Non-MessageCatalog UTF-8 strings get encoded again


I forgot to give you a short term solution.

Create the message catalog with the id "mc". Then create a Python
Script with the id "gettext"; it would look something like this:

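def gettext(message, lang=None):
    # Look up the translation in the message catalog "mc" (a Unicode string).
    translation = container.mc(message, lang)
    # Encode it as UTF-8 so it matches the site's other UTF-8 strings.
    return translation.encode('utf-8')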

Then use the script instead of calling the message catalog
directly. Once this problem is fixed, you can simply remove
the Python Script and rename the message catalog to "gettext".

This way you can continue building your web site and don't
have to wait for anything else.
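
To see why encoding the translation first helps, here is a small
present-day Python sketch. It assumes the failure arises as described in
this thread: UTF-8 byte strings are taken for Latin-1 when they meet a
Unicode string, and the result is encoded to UTF-8 again on output. The
sample strings are illustrative stand-ins, not Localizer data.

```python
name = "Ibáñez"                     # stand-in for custom site content
custom_utf8 = name.encode("utf-8")  # the site stores it as UTF-8 bytes

# Failure mode (assumed): the UTF-8 bytes are read as Latin-1 to promote
# them to Unicode, then the page is encoded to UTF-8 on the way out.
promoted = custom_utf8.decode("latin-1")
double_encoded = promoted.encode("utf-8")

# Workaround: encode the Unicode translation first, so only UTF-8 byte
# strings are ever joined and nothing gets promoted.
translation = "Señor"               # stand-in for a catalog translation
page = translation.encode("utf-8") + b" " + custom_utf8
```

In the failure mode the browser would show "IbÃ¡Ã±ez"; with the workaround
the page bytes decode cleanly as UTF-8.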


Regards,



--
J. David Ibáñez, http://www.j-david.net
Software Engineer / Ingénieur Logiciel / Ingeniero de Software





