bug-gnulib

new module 'mbrtoc32'


From: Bruno Haible
Subject: new module 'mbrtoc32'
Date: Sat, 04 Jan 2020 02:45:18 +0100
User-agent: KMail/5.1.3 (Linux/4.4.0-170-generic; KDE/5.18.0; x86_64; ; )

mbrtoc32 is like mbrtowc, except that it produces a 32-bit wide character
(char32_t). Using it therefore fixes the inability, on Windows and 32-bit
AIX platforms, where wchar_t is only 16 bits wide, to handle Unicode
characters outside the BMP.
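
For illustration, here is a minimal sketch of how a caller might use
mbrtoc32 to walk a multibyte string. The helper name count_chars32 is
hypothetical and not part of the patches; the sketch assumes a C11
<uchar.h> (or the gnulib replacement) and the encoding of the current
LC_CTYPE locale.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical helper: count the characters of the multibyte string S,
   interpreted in the encoding of the current LC_CTYPE locale.
   Returns (size_t) -1 if S contains an invalid or incomplete sequence.  */
size_t
count_chars32 (const char *s)
{
  mbstate_t state;
  size_t remaining;
  size_t count = 0;

  assert (s != NULL);
  remaining = strlen (s);
  memset (&state, 0, sizeof state);   /* start in the initial state */

  while (remaining > 0)
    {
      char32_t c32;
      size_t ret = mbrtoc32 (&c32, s, remaining, &state);

      if (ret == (size_t) -1 || ret == (size_t) -2)
        return (size_t) -1;           /* invalid or incomplete sequence */
      if (ret == (size_t) -3)
        {
          /* Another char32_t was produced from bytes consumed earlier.  */
          count++;
          continue;
        }
      if (ret == 0)
        break;                        /* decoded the null character */
      s += ret;
      remaining -= ret;
      count++;
    }
  return count;
}
```

With a 32-bit result type, a character outside the BMP counts as one
character here, whereas a 16-bit wchar_t cannot represent it as a single
unit.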

The implementation is a bit tricky: For encodings other than UTF-8 and
GB18030, we know that only characters in the BMP occur, therefore the
(possibly overridden) mbrtowc function does what mbrtoc32 needs. In
this case, we must NOT make assumptions about the wide character encoding.
For UTF-8, on the other hand, we can assume that the wide character
encoding is the Unicode code point, thus we can add ad-hoc code to
handle this case.
For GB18030, we would have a problem if we did not know the wide character
encoding. Fortunately, this case does not occur.
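
To make the UTF-8 case concrete: since the result is simply the Unicode
code point, the ad-hoc code boils down to a UTF-8 decoder. The following
is only a sketch under that assumption, not the code from
lib/mbrtowc-impl-utf8.h; the function name utf8_decode_one is
hypothetical, and it ignores the mbstate_t handling of incomplete
sequences that the real function needs.

```c
#include <assert.h>
#include <stddef.h>
#include <uchar.h>

/* Hypothetical helper: decode one UTF-8 sequence from S (at most N bytes)
   into *OUT.  Returns the number of bytes consumed, or 0 if the input is
   empty, truncated, or not valid UTF-8.  */
size_t
utf8_decode_one (const unsigned char *s, size_t n, char32_t *out)
{
  size_t len;
  size_t i;
  char32_t cp;
  char32_t min;   /* smallest code point allowed for this length */

  assert (s != NULL && out != NULL);
  if (n == 0)
    return 0;

  if (s[0] < 0x80)
    {
      *out = s[0];
      return 1;
    }
  else if ((s[0] & 0xE0) == 0xC0)
    { len = 2; cp = s[0] & 0x1F; min = 0x80; }
  else if ((s[0] & 0xF0) == 0xE0)
    { len = 3; cp = s[0] & 0x0F; min = 0x800; }
  else if ((s[0] & 0xF8) == 0xF0)
    { len = 4; cp = s[0] & 0x07; min = 0x10000; }
  else
    return 0;     /* stray continuation byte or invalid lead byte */

  if (n < len)
    return 0;     /* truncated sequence */

  for (i = 1; i < len; i++)
    {
      if ((s[i] & 0xC0) != 0x80)
        return 0; /* not a continuation byte */
      cp = (cp << 6) | (s[i] & 0x3F);
    }

  /* Reject overlong forms, surrogates, and values beyond U+10FFFF.  */
  if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
    return 0;

  *out = cp;
  return len;
}
```

A four-byte sequence yields a code point above U+FFFF, which is exactly
the kind of value that fits in char32_t but not in a 16-bit wchar_t.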

Tested on glibc, musl libc, macOS, FreeBSD, NetBSD, OpenBSD, AIX, HP-UX,
IRIX, Solaris, Cygwin, mingw, MSVC, Haiku, Minix.


2020-01-03  Bruno Haible  <address@hidden>

        mbrtoc32: Add tests.
        * tests/test-mbrtoc32.c: New file, based on tests/test-mbrtowc.c.
        * tests/test-mbrtoc32-1.sh: New file, based on tests/test-mbrtowc1.sh.
        * tests/test-mbrtoc32-2.sh: New file, based on tests/test-mbrtowc2.sh.
        * tests/test-mbrtoc32-3.sh: New file, based on tests/test-mbrtowc3.sh.
        * tests/test-mbrtoc32-4.sh: New file, based on tests/test-mbrtowc4.sh.
        * tests/test-mbrtoc32-5.sh: New file, based on tests/test-mbrtowc5.sh.
        * tests/test-mbrtoc32-w32.c: New file, based on
        tests/test-mbrtowc-w32.c.
        * tests/test-mbrtoc32-w32-1.sh: New file, based on
        tests/test-mbrtowc-w32-1.sh.
        * tests/test-mbrtoc32-w32-2.sh: New file, based on
        tests/test-mbrtowc-w32-2.sh.
        * tests/test-mbrtoc32-w32-3.sh: New file, based on
        tests/test-mbrtowc-w32-3.sh.
        * tests/test-mbrtoc32-w32-4.sh: New file, based on
        tests/test-mbrtowc-w32-4.sh.
        * tests/test-mbrtoc32-w32-5.sh: New file, based on
        tests/test-mbrtowc-w32-5.sh.
        * tests/test-mbrtoc32-w32-6.sh: New file, based on
        tests/test-mbrtowc-w32-6.sh.
        * tests/test-mbrtoc32-w32-7.sh: New file, based on
        tests/test-mbrtowc-w32-7.sh.
        * modules/mbrtoc32-tests: New file, based on modules/mbrtowc-tests.

        mbrtoc32: New module.
        * lib/uchar.in.h (mbrtoc32): New declaration.
        * lib/mbrtoc32.c: New file, based on lib/mbrtowc.c.
        * m4/mbrtoc32.m4: New file, based on m4/mbrtowc.m4.
        * m4/uchar.m4 (gl_UCHAR_H): Test whether mbrtoc32 is declared.
        (gl_UCHAR_H_DEFAULTS): Initialize GNULIB_MBRTOC32, HAVE_MBRTOC32,
        REPLACE_MBRTOC32.
        * modules/uchar (Makefile.am): Substitute GNULIB_MBRTOC32,
        HAVE_MBRTOC32, REPLACE_MBRTOC32.
        * modules/mbrtoc32: New file, based on modules/mbrtowc.
        * tests/test-uchar-c++.cc (mbrtoc32): Verify the signature.
        * modules/uchar-c++-tests (Makefile.am): Link test-uchar-c++ with
        $(LIB_MBRTOWC).
        * doc/posix-functions/mbrtoc32.texi: Document the new module.
        * doc/posix-functions/mbrtowc.texi: Mention the new module.

2020-01-03  Bruno Haible  <address@hidden>

        mbrtowc: Refactor to share code with mbrtoc32.
        * lib/mbrtowc-impl.h: New file, extracted from lib/mbrtowc.c.
        * lib/mbrtowc-impl-utf8.h: Likewise.
        * lib/mbrtowc.c (mbrtowc): Define macro FITS_IN_CHAR_TYPE. Include
        mbrtowc-impl.h.
        * modules/mbrtowc (Files): Add the new files.

2020-01-03  Bruno Haible  <address@hidden>

        mbrtowc: Refactor locale charset dispatching.
        * lib/lc-charset-dispatch.h: New file, extracted from lib/mbrtowc.c.
        * lib/lc-charset-dispatch.c: New file, extracted from lib/mbrtowc.c.
        * lib/mbrtowc.c: Include lc-charset-dispatch.h. Don't include
        localcharset.h, streq.h.
        (enc_t): Remove type.
        (locale_enc): Remove function.
        (cached_locale_enc): Remove variable.
        (locale_enc_cached): Remove function.
        (mbrtowc): Invoke locale_encoding_classification.
        * m4/mbrtowc.m4 (gl_PREREQ_MBRTOWC): Update comment.
        * modules/mbrtowc (Files): Add lc-charset-dispatch.h,
        lc-charset-dispatch.c.
        (configure.ac): Arrange to compile lc-charset-dispatch.c.

Attachment: 0001-mbrtowc-Refactor-locale-charset-dispatching.patch
Description: Text Data

Attachment: 0003-mbrtowc-Refactor-to-share-code-with-mbrtoc32.patch
Description: Text Data

Attachment: 0004-mbrtoc32-New-module.patch
Description: Text Data

Attachment: 0005-mbrtoc32-Add-tests.patch
Description: Text Data

