bug-coreutils

bug#28038: expand(1) lacks MBC support


From: Assaf Gordon
Subject: bug#28038: expand(1) lacks MBC support
Date: Fri, 11 Aug 2017 17:58:48 -0600

Hello Tilman,

On 10/08/17 10:10 AM, Tilman Schmidt wrote:
> it seems the expand(1) command does not properly support multi-byte
> characters.

That is correct.

> address@hidden:~$ cat test.txt
> Text  ohne    Umlaute
> Täxt  müt     Umläuten
> address@hidden:~$ expand test.txt
> Text    ohne    Umlaute
> Täxt   müt    Umläuten
> 
> Using Ubuntu 14.04.5 LTS with coreutils 8.21-1ubuntu.

Multibyte support is not available yet, neither in version 8.21 (which
is four years old) nor in the current version, 8.27.

However, there is an ongoing effort to add multibyte support
to all coreutils programs, including 'expand'.

You can read more technical details about it here:
  http://crashcourse.housegordon.org/coreutils-multibyte-support.html

In the current (work-in-progress) internationalization patch,
the 'expand' program does support multibyte locales, and expands
your input correctly:

With a multibyte locale:

   $ ./src/expand bug28038.txt
   Text    ohne    Umlaute
   Täxt    müt     Umläuten

versus forcing a single-byte locale:

   $ LC_ALL=C ./src/expand bug28038.txt
   Text    ohne    Umlaute
   Täxt   müt    Umläuten
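
The difference between the two outputs comes down to what is counted
as one column when computing the distance to the next tab stop. The
following is a minimal Python sketch of that idea, not the code from
the patch. The function names 'expand_bytes' and 'expand_chars' are
made up for illustration, the input is assumed to be UTF-8 with
precomposed characters, and every character is assumed to occupy
exactly one column (a real implementation would also have to handle
wide and combining characters).

```python
def expand_bytes(data: bytes, tabstop: int = 8) -> bytes:
    """Expand tabs, counting every byte as one column (single-byte view)."""
    out = bytearray()
    col = 0
    for b in data:
        if b == 0x09:                      # tab: pad to the next tab stop
            pad = tabstop - col % tabstop
            out += b" " * pad
            col += pad
        elif b == 0x0A:                    # newline: column counter resets
            out.append(b)
            col = 0
        else:
            out.append(b)
            col += 1                       # a 2-byte character advances col by 2
    return bytes(out)


def expand_chars(text: str, tabstop: int = 8) -> str:
    """Expand tabs, counting every decoded character as one column."""
    out = []
    col = 0
    for ch in text:
        if ch == "\t":
            pad = tabstop - col % tabstop
            out.append(" " * pad)
            col += pad
        elif ch == "\n":
            out.append(ch)
            col = 0
        else:
            out.append(ch)
            col += 1                       # assumption: one column per character
    return "".join(out)


# Escapes are used so the umlauts are guaranteed to be precomposed.
line = "T\u00e4xt\tm\u00fct\tUml\u00e4uten\n"

# Byte counting: 3 and 4 spaces of padding, as in the LC_ALL=C output.
print(expand_bytes(line.encode("utf-8")).decode("utf-8"), end="")
# Character counting: 4 and 5 spaces of padding, as in the multibyte output.
print(expand_chars(line), end="")
```

In UTF-8 each umlaut takes two bytes, so the byte-counting version
believes "Täxt" ends in column 5 rather than 4 and inserts one space
too few before the next tab stop.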


The latest version of the patch is available for download and
experimentation here:
  http://lists.gnu.org/archive/html/coreutils/2017-04/msg00009.html
However, it should not be considered stable.

regards,
 - assaf






