bug#28038: expand(1) lacks MBC support
From: Assaf Gordon
Subject: bug#28038: expand(1) lacks MBC support
Date: Fri, 11 Aug 2017 17:58:48 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1
Hello Tilman,
On 10/08/17 10:10 AM, Tilman Schmidt wrote:
> it seems the expand(1) command does not properly support multi-byte
> characters.
That is correct.
> address@hidden:~$ cat test.txt
> Text ohne Umlaute
> Täxt müt Umläuten
> address@hidden:~$ expand test.txt
> Text ohne Umlaute
> Täxt müt Umläuten
>
> Using Ubuntu 14.04.5 LTS with coreutils 8.21-1ubuntu.
Multibyte support is not available yet, neither in version 8.21 (which is
4 years old) nor in the current version, 8.27.
However, there is an ongoing effort to add multibyte support
to all coreutils programs, including 'expand'.
You can read more technical details about it here:
http://crashcourse.housegordon.org/coreutils-multibyte-support.html
In the current (work-in-progress) internationalization patch,
the 'expand' program does support multibyte locales, and expands
your input correctly:
multibyte locale:
$ ./src/expand bug28038.txt
Text ohne Umlaute
Täxt müt Umläuten
versus forcing single-byte locale:
$ LC_ALL=C ./src/expand bug28038.txt
Text ohne Umlaute
Täxt müt Umläuten
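To illustrate why the two locales can give different results, here is a
minimal sketch in Python. It is not the coreutils implementation; the
function names expand_chars and expand_bytes are hypothetical, and the
input is assumed to be UTF-8 with a tab after the first word. The only
difference between the two functions is whether the column counter
advances per character or per byte:

```python
TAB = 8  # default tab stop width, as in expand(1)


def expand_chars(line: str, tab: int = TAB) -> str:
    """Expand tabs, advancing one column per character."""
    out = []
    col = 0
    for ch in line:
        if ch == "\t":
            pad = tab - col % tab  # spaces needed to reach the next tab stop
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += 1
    return "".join(out)


def expand_bytes(line: str, tab: int = TAB, encoding: str = "utf-8") -> str:
    """Expand tabs, advancing one column per byte (single-byte behaviour)."""
    out = bytearray()
    col = 0
    for b in line.encode(encoding):
        if b == 0x09:  # tab byte
            pad = tab - col % tab
            out += b" " * pad
            col += pad
        else:
            out.append(b)
            col += 1  # each byte of a multibyte character counts as a column
    return out.decode(encoding)


if __name__ == "__main__":
    for text in ("Text\tohne", "Täxt\tmüt"):
        print(repr(expand_chars(text)), repr(expand_bytes(text)))
```

In UTF-8, "ä" occupies two bytes, so the byte-counting version believes
"Täxt" is five columns wide and inserts one space too few. Note that one
column per character is itself a simplification: wide and combining
characters would need a width lookup, which this sketch leaves out.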
The latest version of the patch is available for download and
experimentation here:
http://lists.gnu.org/archive/html/coreutils/2017-04/msg00009.html
However, it should not be considered stable.
regards,
- assaf