gnewsense-users

Re: [gNewSense-users] Cyrillic presentation in gNS wiki


From: Karl Goetz
Subject: Re: [gNewSense-users] Cyrillic presentation in gNS wiki
Date: Mon, 23 Feb 2009 12:53:05 +1030

On Thu, 19 Feb 2009 21:38:04 +0100
Sam Geeraerts <address@hidden> wrote:

> Karl Goetz wrote:
> > On Sun, 15 Feb 2009 11:41:36 +0100
> > Sam Geeraerts <address@hidden> wrote:
> > 
> >>> There's also a PmWiki recipe to convert input on the fly [2], but
> >>> I think it's only useful if the content is already in UTF-8. It
> >>> seems intended to catch input from a browser that is forced to
> >>> another encoding (or one that can't handle UTF-8).
> >>>
> >>> [1] http://www.pmwiki.org/wiki/Cookbook/UTF-8
> >>> [2] http://www.pmwiki.org/wiki/Cookbook/UTF8Conv
> >>>
> > 
> > We seem to have two options with PmWiki when it comes to charset to
> > use. Here's a snippet from our config:
> > 
> > $WikiTitle = 'PmWiki';
> > $Charset = 'ISO-8859-1';
> > $HTTPHeaders = array(
> >   "Expires: Tue, 01 Jan 2002 00:00:00 GMT",
> >   "Cache-Control: no-store, no-cache, must-revalidate",
> >   "Content-type: text/html; charset=ISO-8859-1;");
> > $CacheActions = array('browse','diff','print');
> > 
> > I can change either or both of these, but I'm not sure what the
> > consequences would be ...
> > kk
> > 
> 
> Grmbl, Charset is not documented (yet) [1]. I would have added a 
> placeholder as suggested, but I'm not sure if I'm supposed to do that
> in [1] or in [2].

Thanks for doing this research for us.

> 
> Anyway, I grepped through the code and it looks like Charset is the 
> encoding in the meta-element (or xml declaration). So both Charset
> and HTTPHeaders should be changed after a conversion. I don't know
> much about PHP, but it seems more sensible to reuse Charset in
> HTTPHeaders. If that is valid then a bug report is in order.

It would make more sense. 
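
A minimal, untested sketch of that idea, assuming the local config file is
ordinary PHP and that $Charset is assigned before $HTTPHeaders is built. The
'ISO-8859-1' value is simply the current setting from the snippet above;
whether PmWiki's own scripts later override either variable is not verified
here.

```php
<?php
// Define the charset once ...
$Charset = 'ISO-8859-1';
// ... and interpolate it into the header instead of repeating the literal.
$HTTPHeaders = array(
  "Expires: Tue, 01 Jan 2002 00:00:00 GMT",
  "Cache-Control: no-store, no-cache, must-revalidate",
  "Content-type: text/html; charset={$Charset};");
```

With this, switching encodings would mean editing only the $Charset line.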

> 
> I also stumbled upon some interesting comments to consider before
> using UTF-8 (in scripts/xlpage-utf-8.php):
> 
>      This script configures PmWiki to use utf-8 in page content and
>      pagenames.  There are some unfortunate side effects about PHP's
>      utf-8 implementation, however.  First, since PHP doesn't have a
>      way to do pattern matching on upper/lowercase UTF-8 characters,
>      WikiWords are limited to the ASCII-7 set, and all links to page
>      names with UTF-8 characters have to be in double brackets.
>      Second, we have to assume that all non-ASCII characters are valid
>      in pagenames, since there's no way to determine which UTF-8
>      characters are "letters" and which are punctuation.

That sounds like a problem.
Guess we'll have to look into encodings other than UTF-8.
I have a test wiki here, but it's a different version from the current
live site (I'll need to set up a copy of the live site before testing all
these settings out).
kk

> 
> [1] http://pmwiki.org/wiki/PmWiki/Variables
> [2] http://pmwiki.org/wiki/PmWiki/I18nVariables


-- 
Karl Goetz, (Kamping_Kaiser / VK5FOSS)
Debian user / gNewSense contributor
http://www.kgoetz.id.au
No, I won't join your social networking group
