Re: [Grep-devel] handling of non-BMP characters
From: Jim Meyering
Subject: Re: [Grep-devel] handling of non-BMP characters
Date: Wed, 19 Dec 2018 09:21:17 -0800

On Tue, Dec 18, 2018 at 11:51 PM Bruno Haible <address@hidden> wrote:
> Corinna Vinschen wrote in
> <https://lists.gnu.org/archive/html/grep-devel/2018-12/msg00039.html>:
> > it would be
> > pretty nice if that code could get reverted back in to support
> > non-BMP charsets even on Cygwin.
>
> I agree that support for beyond-BMP characters should be added back to 'grep'.
>
> Your earlier fix from 2013-08-16 (and the fact that the test failure is
> occurring exactly on Windows and AIX platforms) shows that the problem is
> with wchar_t being only 16-bit wide on these platforms.
>
> The type 'char32_t' has been introduced in C11 to overcome this limitation.[1]
>
> I propose to
>
> 1) introduce in gnulib support for <uchar.h>, char32_t, and mbrtoc32, so
> that we can use these instead of <wchar.h>, wchar_t, and mbrtowc
> portably,
>
> 2) change those gnulib modules that don't behave well with beyond-BMP
> characters on Windows and AIX to use char32_t instead of wchar_t.
>
> Then the 'grep' code can be changed in a similar way, and this will
> fix the bug on Cygwin and AIX (though not on native Windows [2]).
Sounds perfect. Thank you!
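
To illustrate the limitation described in the quoted message: a 16-bit unit cannot hold a code point above U+FFFF, so on a platform whose 16-bit wchar_t holds UTF-16 code units, a beyond-BMP character must be split into a surrogate pair, whereas a 32-bit type such as char32_t holds it directly. The following is only a sketch of that arithmetic; the function name encode_utf16 is hypothetical and is not part of grep or gnulib, and fixed-width integer types stand in for wchar_t and char32_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
   Encode one Unicode code point CP as UTF-16 code units in OUT.
   Return the number of units written (1 or 2), or 0 if CP is not a
   valid Unicode scalar value (out of range or a surrogate).  */
size_t
encode_utf16 (uint32_t cp, uint16_t out[2])
{
  if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
    return 0;

  if (cp < 0x10000)
    {
      /* BMP character: fits in a single 16-bit unit.  */
      out[0] = (uint16_t) cp;
      return 1;
    }

  /* Beyond-BMP character: split the 20 bits that remain after
     subtracting 0x10000 into a high and a low surrogate.  */
  cp -= 0x10000;
  out[0] = (uint16_t) (0xD800 + (cp >> 10));
  out[1] = (uint16_t) (0xDC00 + (cp & 0x3FF));
  return 2;
}
```

For example, U+00E9 needs one unit, while U+1F600 needs two (0xD83D, 0xDE00). Code that matches one 16-bit unit at a time therefore never sees such a character as a whole.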