--- Begin Message ---
Subject: [PATCH] tests: fix encoding with `tr' to support multibyte in test
Date: Sat, 08 Nov 2014 17:07:40 +0900
It seems that `tr' in GNU coreutils does not recognize multibyte
characters, but other implementations, e.g. HP-UX and Solaris, do.

As a result, on those systems [ echo AB | LC_ALL=ja_JP.eucJP tr AB '\244\263' ]
behaves like [ echo AB | LC_ALL=ja_JP.eucJP tr A '\244\263' ], because
'\244\263' is recognized as a single multibyte character. We do not
expect that.
0001-grep-fix-encoding-with-tr-to-support-multibyte-in-te.patch
Description: Text document
--- End Message ---
--- Begin Message ---
Subject: Re: bug#18991: [PATCH] tests: fix encoding with `tr' to support multibyte in test
Date: Sat, 8 Nov 2014 19:00:55 -0800
On Sat, Nov 8, 2014 at 12:07 AM, Norihiro Tanaka <address@hidden> wrote:
> It seems that `tr' in GNU coreutils does not recognize multibyte
> characters, but other implementations, e.g. HP-UX and Solaris, do.
>
> As a result, on those systems [ echo AB | LC_ALL=ja_JP.eucJP tr AB '\244\263' ]
> behaves like [ echo AB | LC_ALL=ja_JP.eucJP tr A '\244\263' ], because
> '\244\263' is recognized as a single multibyte character. We do not
> expect that.
Thank you for the report and patch.
However, it is not maintainable to modify every use of "tr" in
the tests. Instead, I've addressed this by making all of the
tests use tr through a wrapper that always sets LC_ALL=C:
0001-tests-avoid-a-multibyte-tr-portability-problem.patch
Description: Binary data
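
The attached patch is not reproduced in this archive. A minimal sketch of what
a wrapper of this kind could look like, assuming a POSIX shell function placed
in the shared test setup (the function body and its placement are assumptions,
not the contents of the actual patch):

```shell
# Hypothetical wrapper, not taken from the attached patch.
# Shadow tr with a function that forces the C locale, so that octal
# escapes such as '\244\263' are always treated as two separate bytes.
tr() {
  # "command" bypasses this function and runs the real tr utility;
  # the LC_ALL=C assignment applies only to that one invocation.
  LC_ALL=C command tr "$@"
}

# Usage: map A and B to the bytes 0244 and 0263, then dump them in octal.
printf 'AB\n' | tr AB '\244\263' | od -An -to1
```

Because the function has the same name as the utility, existing test scripts
that call tr would pick it up without being modified one by one.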
--- End Message ---