mbrtowc tests: don't make assumptions about the charset the C locale
From: Bruno Haible
Subject: mbrtowc tests: don't make assumptions about the charset the C locale
Date: Sat, 24 Feb 2018 12:02:02 +0100
User-agent: KMail/5.1.3 (Linux/4.4.0-112-generic; KDE/5.18.0; x86_64; ; )
On Alpine Linux 3.7.0, which uses musl libc, this test fails:
FAIL: test-mbrtowc5.sh
======================
../../gltests/test-mbrtowc.c:106: assertion 'wc == c' failed
Aborted
FAIL test-mbrtowc5.sh (exit status: 134)
The issue is that in the C locale, musl uses the encoding that maps
0x00..0x7F -> U+0000..U+007F
0x80..0xFF -> U+DF80..U+DFFF
On older platforms, by contrast, it was natural to use the ISO-8859-1 encoding:
0x00..0x7F -> U+0000..U+007F
0x80..0xFF -> U+0080..U+00FF
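To see which of the two mappings a given libc uses, one can probe a single high byte with btowc. The following is only a sketch: the enum and the function name are made up for illustration, and it assumes that wchar_t values are Unicode code points (as on glibc and musl) and that the program is still in the default C locale.

```c
#include <wchar.h>

/* Hypothetical labels for the mappings described above. */
enum high_byte_mapping
{
  MAPPING_LATIN1_LIKE,  /* 0x80 -> U+0080 */
  MAPPING_MUSL_LIKE,    /* 0x80 -> U+DF80 */
  MAPPING_OTHER,        /* some other wide character */
  MAPPING_NONE          /* btowc returned WEOF */
};

/* Probe how the current locale maps the single byte 0x80.
   Assumes wchar_t values are Unicode code points. */
static enum high_byte_mapping
probe_high_byte_mapping (void)
{
  wint_t w = btowc (0x80);

  if (w == WEOF)
    return MAPPING_NONE;
  if (w == 0x0080)
    return MAPPING_LATIN1_LIKE;
  if (w == 0xDF80)
    return MAPPING_MUSL_LIKE;
  return MAPPING_OTHER;
}
```

The MAPPING_NONE and MAPPING_OTHER cases are there because a libc is free to do something else again for bytes outside ASCII; the sketch does not claim that only the two mappings above exist.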
This patch fixes the test.
2018-02-24 Bruno Haible <address@hidden>
mbrtowc tests: Don't make assumptions about the charset the C locale.
* tests/test-mbrtowc.c (main): For bytes >= 0x80, don't assume a
particular mapping in the C locale.
diff --git a/tests/test-mbrtowc.c b/tests/test-mbrtowc.c
index a0b5231..54d52f8 100644
--- a/tests/test-mbrtowc.c
+++ b/tests/test-mbrtowc.c
@@ -103,7 +103,15 @@ main (int argc, char *argv[])
wc = (wchar_t) 0xBADFACE;
ret = mbrtowc (&wc, buf, 1, &state);
ASSERT (ret == 1);
- ASSERT (wc == c);
+ if (c < 0x80)
+ /* c is an ASCII character. */
+ ASSERT (wc == c);
+ else
+ /* argv[1] starts with '5', that is, we are testing the C or POSIX
+ locale.
+ On most platforms, the bytes 0x80..0xFF map to U+0080..U+00FF.
+ But on musl libc, the bytes 0x80..0xFF map to U+DF80..U+DFFF. */
+ ASSERT (wc == btowc (c));
ASSERT (mbsinit (&state));
ret = mbrtowc (NULL, buf, 1, &state);
ASSERT (ret == 1);
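The idea behind the new assertion can also be tried outside the test suite. The sketch below is not part of the patch; the function name is hypothetical. It walks over all non-NUL byte values, decodes each one with mbrtowc from a fresh state, and counts the bytes whose result disagrees with btowc. Bytes that mbrtowc does not accept as a complete single-byte character are skipped rather than counted, which is an assumption made here to keep the sketch usable on libcs that reject high bytes.

```c
#include <string.h>
#include <wchar.h>

/* Count the non-NUL bytes that mbrtowc decodes as one character but to
   a value different from what btowc reports.  Bytes that mbrtowc does
   not decode as a complete single-byte character are skipped. */
static int
count_mbrtowc_btowc_mismatches (void)
{
  int mismatches = 0;
  int c;

  for (c = 1; c < 0x100; c++)
    {
      mbstate_t state;
      wchar_t wc = (wchar_t) 0xBADFACE;
      char buf[1];
      size_t ret;

      memset (&state, 0, sizeof state);  /* fresh initial state */
      buf[0] = (char) c;
      ret = mbrtowc (&wc, buf, 1, &state);
      if (ret == 1 && (wint_t) wc != btowc (c))
        mismatches++;
    }
  return mismatches;
}
```

In the C locale the expectation is a count of 0 regardless of which mapping the libc picked for 0x80..0xFF, since the comparison no longer hard-codes the wide character value.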