Re: Fix wrong character count in argp
From: Vladimir 'φ-coder/phcoder' Serbinenko
Subject: Re: Fix wrong character count in argp
Date: Sun, 12 Feb 2012 23:41:41 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20120131 Thunderbird/10.0

On 12.02.2012 22:59, Bruno Haible wrote:
> Hi Vladimir,
>
> Thank you for the proposed patch.
>
>> As already reported several years ago
> I cannot find it in my archives. Maybe that discussion already contained
> some useful thoughts or arguments? Can you please point me to it?
http://lists.gnu.org/archive/html/bug-tar/2009-09/msg00008.html
>> argp counts bytes even when
>> actually what matters is the display length. This patch improves the
>> situation by counting only leading and standalone UTF-8 bytes. It
>> doesn't handle the double-width characters like Chinese sinograms
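The counting rule described in the quoted summary can be sketched as follows. This is a minimal illustration, not the code from argp.diff: the helper name utf8_char_count is hypothetical, and the sketch assumes the input is valid UTF-8. It counts ASCII bytes and UTF-8 lead bytes and skips continuation bytes, so a double-width character still counts as one.

```c
#include <stddef.h>

/* Count the bytes that start a character in a UTF-8 string:
   standalone ASCII bytes (0xxxxxxx) and lead bytes (11xxxxxx).
   Continuation bytes (10xxxxxx) are skipped.
   Hypothetical helper for illustration; assumes valid UTF-8.  */
static size_t
utf8_char_count (const char *s)
{
  size_t n = 0;

  for (; *s != '\0'; s++)
    if (((unsigned char) *s & 0xC0) != 0x80)
      n++;
  return n;
}
```

With this rule a two-byte Cyrillic letter counts as 1, which matches its display width, but a three-byte Chinese character also counts as 1 although it normally occupies two columns.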
> A program that needs to consider display length - for example for
> line wrapping - should
> 1) work with any locale encoding. Don't assume that the locale encoding
> is UTF-8.
> 2) work with Chinese ideographs correctly, like it should also work
> with Russian (single-width) letters.
Here you go (see the attached argp.diff). Tested with Cyrillic text; I haven't tested it with Chinese.
>
> The easiest way to satisfy these two requirements is to base the code on
> either
> * the function mbswidth (gnulib module mbswidth) and possibly also mbiter
> or mbuiter, or
> * the gnulib module unilbrk/ulc-width-linebreaks, it contains a complete
> line-breaking algorithm.
>
> Can you rewrite your patch to this effect?
>
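The two requirements above (any locale encoding, correct width for double-width characters) can be illustrated with a rough sketch built on the POSIX functions mbrtowc and wcwidth. This is an assumption made for the sake of a self-contained example: the thread suggests gnulib's mbswidth, which is not used here, and the helper name display_width is hypothetical. The caller is assumed to have called setlocale (LC_ALL, "") beforehand so that the locale's encoding is in effect.

```c
#define _XOPEN_SOURCE 700       /* needed for the wcwidth declaration */
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Return the number of screen columns S occupies in the current
   locale's encoding, or -1 if S contains an invalid multibyte
   sequence or a non-printable character.
   Hypothetical helper for illustration; not gnulib's mbswidth.  */
static int
display_width (const char *s)
{
  mbstate_t state;
  size_t left = strlen (s);
  int width = 0;

  memset (&state, 0, sizeof state);
  while (left > 0)
    {
      wchar_t wc;
      size_t n = mbrtowc (&wc, s, left, &state);
      int w;

      if (n == (size_t) -1 || n == (size_t) -2)
        return -1;              /* invalid or truncated sequence */
      if (n == 0)
        break;                  /* null character; not expected here */
      w = wcwidth (wc);
      if (w < 0)
        return -1;              /* non-printable character */
      width += w;
      s += n;
      left -= n;
    }
  return width;
}
```

Because decoding goes through the locale, the same function works for UTF-8, EUC-JP, or a single-byte encoding, and wcwidth reports 2 for characters the C library treats as double-width.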
> Also, such tricky issues should be checked in the test suite. Can you
> please also provide a test program, some input data, and the expected
> output for this data? We can then turn it into a gnulib test.
I can strip down one of our programs to keep just --help when time permits.
> Bruno
>
>
--
Regards
Vladimir 'φ-coder/phcoder' Serbinenko
Attachment: argp.diff (Text Data)
Attachment: signature.asc (OpenPGP digital signature)