bug-coreutils

bug#6707: cut is not multi byte (wide char) aware


From: Eric Blake
Subject: bug#6707: cut is not multi byte (wide char) aware
Date: Thu, 22 Jul 2010 17:15:39 -0600

On 07/22/2010 12:49 PM, Mihai Moldovan wrote:
> Hi,
> 
> I have come to notice that cut is not yet multi byte/wide char aware.

Yes, and the same is true of a lot of the coreutils.  This is a well-known issue, and
mentioned in the TODO.  Several distros have add-on patches that add
wide char support, but to date, no one has yet submitted a patch
upstream that is both easy to maintain (doesn't needlessly duplicate big
blocks of code over char vs. wchar_t) and which doesn't penalize speed
on single-byte locales.  We've got some ideas on what is needed, and
gnulib is certainly getting closer to what we need (Bruno's work on
libunistring will be a key player in an acceptable patch), but it takes
time to pull it all together.
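
To make the problem concrete, here is a minimal sketch in Python (not
coreutils code) of the difference between selecting by byte and selecting
by character on UTF-8 input.  The function names cut_bytes and cut_chars
are hypothetical and exist only for this illustration; the positions are
1-based and inclusive, in the style of a cut range.

```python
def cut_bytes(line: bytes, start: int, end: int) -> bytes:
    """Select bytes start..end (1-based, inclusive), ignoring the encoding."""
    return line[start - 1:end]


def cut_chars(line: bytes, start: int, end: int, encoding: str = "utf-8") -> bytes:
    """Select characters start..end (1-based, inclusive) by decoding first."""
    text = line.decode(encoding)
    return text[start - 1:end].encode(encoding)


# Illustrative input: three 2-byte characters followed by ASCII "w" (7 bytes total).
line = "\u017c\u00f3\u0142w".encode("utf-8")

by_bytes = cut_bytes(line, 1, 3)   # stops in the middle of the second character
by_chars = cut_chars(line, 1, 3)   # three whole characters, six bytes

try:
    by_bytes.decode("utf-8")
    bytes_result_is_valid = True
except UnicodeDecodeError:
    # The byte-oriented result ends with a truncated multibyte sequence.
    bytes_result_is_valid = False
```

The byte-oriented selection returns a fragment that is no longer valid
UTF-8, while the character-oriented one returns whole characters at the
cost of decoding every line.  That decoding cost is the kind of overhead
a patch would need to avoid in single-byte locales.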

> (Can this even be considered a bug, or is it just a "feature" in that
> only one-byte delimiters are allowed by default?)

Yes, it can be considered a bug, and any extra help would be welcome.
Unfortunately, to date there has been no one willing to step forward to
scratch this itch as their highest priority.

-- 
Eric Blake   address@hidden    +1-801-349-2682
Libvirt virtualization library http://libvirt.org
