From: Bruno Haible
Subject: new module 'mbrtoc32'
Date: Sat, 04 Jan 2020 02:45:18 +0100
User-agent: KMail/5.1.3 (Linux/4.4.0-170-generic; KDE/5.18.0; x86_64; ; )

mbrtoc32 is like mbrtowc, except that it produces a 32-bit wide character.
So, its use will fix the inability, on Windows and 32-bit AIX platforms, to
handle Unicode characters outside the BMP.

The implementation is a bit tricky:

For encodings other than UTF-8 and GB18030, we know that only characters in
the BMP occur, therefore the (possibly overridden) mbrtowc function does what
mbrtoc32 needs. In this case, we must NOT make assumptions about the wide
character encoding.

For UTF-8, on the other hand, we can assume that the wide character encoding
is the Unicode code point, thus we can add ad-hoc code to handle this case.

For GB18030, we would have a problem if we don't know the wide character
encoding. Fortunately this case does not occur.

Tested on glibc, musl libc, macOS, FreeBSD, NetBSD, OpenBSD, AIX, HP-UX, IRIX,
Solaris, Cygwin, mingw, MSVC, Haiku, Minix.


2020-01-03  Bruno Haible  <address@hidden>

	mbrtoc32: Add tests.
	* tests/test-mbrtoc32.c: New file, based on tests/test-mbrtowc.c.
	* tests/test-mbrtoc32-1.sh: New file, based on tests/test-mbrtowc1.sh.
	* tests/test-mbrtoc32-2.sh: New file, based on tests/test-mbrtowc2.sh.
	* tests/test-mbrtoc32-3.sh: New file, based on tests/test-mbrtowc3.sh.
	* tests/test-mbrtoc32-4.sh: New file, based on tests/test-mbrtowc4.sh.
	* tests/test-mbrtoc32-5.sh: New file, based on tests/test-mbrtowc5.sh.
	* tests/test-mbrtoc32-w32.c: New file, based on
	tests/test-mbrtowc-w32.c.
	* tests/test-mbrtoc32-w32-1.sh: New file, based on
	tests/test-mbrtowc-w32-1.sh.
	* tests/test-mbrtoc32-w32-2.sh: New file, based on
	tests/test-mbrtowc-w32-2.sh.
	* tests/test-mbrtoc32-w32-3.sh: New file, based on
	tests/test-mbrtowc-w32-3.sh.
	* tests/test-mbrtoc32-w32-4.sh: New file, based on
	tests/test-mbrtowc-w32-4.sh.
	* tests/test-mbrtoc32-w32-5.sh: New file, based on
	tests/test-mbrtowc-w32-5.sh.
	* tests/test-mbrtoc32-w32-6.sh: New file, based on
	tests/test-mbrtowc-w32-6.sh.
	* tests/test-mbrtoc32-w32-7.sh: New file, based on
	tests/test-mbrtowc-w32-7.sh.
	* modules/mbrtoc32-tests: New file, based on modules/mbrtowc-tests.

	mbrtoc32: New module.
	* lib/uchar.in.h (mbrtoc32): New declaration.
	* lib/mbrtoc32.c: New file, based on lib/mbrtowc.c.
	* m4/mbrtoc32.m4: New file, based on m4/mbrtowc.m4.
	* m4/uchar.m4 (gl_UCHAR_H): Test whether mbrtoc32 is declared.
	(gl_UCHAR_H_DEFAULTS): Initialize GNULIB_MBRTOC32, HAVE_MBRTOC32,
	REPLACE_MBRTOC32.
	* modules/uchar (Makefile.am): Substitute GNULIB_MBRTOC32,
	HAVE_MBRTOC32, REPLACE_MBRTOC32.
	* modules/mbrtoc32: New file, based on modules/mbrtowc.
	* tests/test-uchar-c++.cc (mbrtoc32): Verify the signature.
	* modules/uchar-c++-tests (Makefile.am): Link test-uchar-c++ with
	$(LIB_MBRTOWC).
	* doc/posix-functions/mbrtoc32.texi: Document the new module.
	* doc/posix-functions/mbrtowc.texi: Mention the new module.

2020-01-03  Bruno Haible  <address@hidden>

	mbrtowc: Refactor to share code with mbrtoc32.
	* lib/mbrtowc-impl.h: New file, extracted from lib/mbrtowc.c.
	* lib/mbrtowc-impl-utf8.h: Likewise.
	* lib/mbrtowc.c (mbrtowc): Define macro FITS_IN_CHAR_TYPE. Include
	mbrtowc-impl.h.
	* modules/mbrtowc (Files): Add the new files.

2020-01-03  Bruno Haible  <address@hidden>

	mbrtowc: Refactor locale charset dispatching.
	* lib/lc-charset-dispatch.h: New file, extracted from lib/mbrtowc.c.
	* lib/lc-charset-dispatch.c: New file, extracted from lib/mbrtowc.c.
	* lib/mbrtowc.c: Include lc-charset-dispatch.h. Don't include
	localcharset.h, streq.h.
	(enc_t): Remove type.
	(locale_enc): Remove function.
	(cached_locale_enc): Remove variable.
	(locale_enc_cached): Remove function.
	(mbrtowc): Invoke locale_encoding_classification.
	* m4/mbrtowc.m4 (gl_PREREQ_MBRTOWC): Update comment.
	* modules/mbrtowc (Files): Add lc-charset-dispatch.h,
	lc-charset-dispatch.c.
	(configure.ac): Arrange to compile lc-charset-dispatch.c.

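
To illustrate the UTF-8 case from the description: when the multibyte encoding is UTF-8, the 32-bit result is simply the Unicode code point, so it can be computed directly from the bytes without going through wchar_t. The following is only a minimal sketch under that assumption, not the code in lib/mbrtowc-impl-utf8.h. The names sketch_utf8_to_c32 and sketch_fits_in_16_bits are hypothetical, the function handles only complete sequences (no mbstate_t, no incremental input), and uint_least32_t stands in for char32_t so that <uchar.h> is not required.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of gnulib: tells whether a code point
   would fit in a single 16-bit wide character unit, i.e. lies in the BMP.  */
int
sketch_fits_in_16_bits (uint_least32_t c)
{
  return c <= 0xFFFF;
}

/* Hypothetical helper, not part of gnulib: decode one complete UTF-8
   sequence from s[0..n-1] and store the code point in *pc32.
   Returns the number of bytes consumed (1 to 4), or 0 if the input is
   empty, truncated, or not valid UTF-8.  */
size_t
sketch_utf8_to_c32 (uint_least32_t *pc32, const unsigned char *s, size_t n)
{
  assert (pc32 != NULL && s != NULL);
  if (n == 0)
    return 0;

  unsigned char c = s[0];

  if (c < 0x80)
    {
      /* One byte: ASCII.  */
      *pc32 = c;
      return 1;
    }
  if (c >= 0xC2 && c <= 0xDF)
    {
      /* Two bytes; lead bytes 0xC0 and 0xC1 would be overlong.  */
      if (n < 2 || (s[1] & 0xC0) != 0x80)
        return 0;
      *pc32 = ((uint_least32_t) (c & 0x1F) << 6) | (s[1] & 0x3F);
      return 2;
    }
  if ((c & 0xF0) == 0xE0)
    {
      /* Three bytes; reject overlong forms and surrogates.  */
      if (n < 3 || (s[1] & 0xC0) != 0x80 || (s[2] & 0xC0) != 0x80)
        return 0;
      uint_least32_t u = ((uint_least32_t) (c & 0x0F) << 12)
                         | ((uint_least32_t) (s[1] & 0x3F) << 6)
                         | (s[2] & 0x3F);
      if (u < 0x800 || (u >= 0xD800 && u <= 0xDFFF))
        return 0;
      *pc32 = u;
      return 3;
    }
  if (c >= 0xF0 && c <= 0xF4)
    {
      /* Four bytes: characters outside the BMP, up to U+10FFFF.  */
      if (n < 4 || (s[1] & 0xC0) != 0x80 || (s[2] & 0xC0) != 0x80
          || (s[3] & 0xC0) != 0x80)
        return 0;
      uint_least32_t u = ((uint_least32_t) (c & 0x07) << 18)
                         | ((uint_least32_t) (s[1] & 0x3F) << 12)
                         | ((uint_least32_t) (s[2] & 0x3F) << 6)
                         | (s[3] & 0x3F);
      if (u < 0x10000 || u > 0x10FFFF)
        return 0;
      *pc32 = u;
      return 4;
    }
  /* Stray continuation byte or invalid lead byte.  */
  return 0;
}
```

For example, the four bytes F0 9F 98 80 decode to U+1F600, for which sketch_fits_in_16_bits returns 0. That is the kind of character that a 16-bit wchar_t cannot hold in one unit and that a 32-bit result type can.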
Attachment: 0001-mbrtowc-Refactor-locale-charset-dispatching.patch (Description: Text Data)
Attachment: 0003-mbrtowc-Refactor-to-share-code-with-mbrtoc32.patch (Description: Text Data)
Attachment: 0004-mbrtoc32-New-module.patch (Description: Text Data)
Attachment: 0005-mbrtoc32-Add-tests.patch (Description: Text Data)