Re: [PATCH] IBM z/OS + EBCDIC support
Tue, 22 Sep 2015 09:23:34 -0600
On 09/21/2015 08:28 PM, Daniel Richard G. wrote:
> Hello list,
> The attached patch, against Git master, addresses numerous
> incompatibilities in Gnulib with IBM z/OS (a mainframe operating system)
> and the EBCDIC encoding.
> With my changes, Gnulib builds successfully, and most of the tests
> succeed. The remaining failures are as follows.
Thanks for the work. Can you please split the patch into a series of
multiple pieces, one patch per issue, so that we can apply the
obviously-correct ones while still discussing the other pieces, rather
than holding the entire large patch hostage to review?
Also, while I see you have copyright assignment on file for Gawk, I
don't see it for gnulib. You'll want to repeat the assignment process
for gnulib before we can take more than the most trivial patches.
Some quick comments, without having reviewed any code:
> * In EBCDIC, normal chars like 'A' occur in the upper half of the 8-bit
> range. This interferes with the idiom of using "switch (c)" and then
> "case 'A':" et al. because c can have two distinct values (-63 and
> 193) that should match to 'A'.
> My fix, then, is a macro which converts the input codepoint to the
> range that will match literal chars, when necessary. (Obviously, in
> ASCII, it's a no-op.) Any takers on a better name for this macro than
coreutils uses to_uchar() to force the conversion of a byte to an
unsigned character, useful for cases where sign extension of a byte is
not desired. Sounds like it does the same thing as what you are doing here.
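To make the comparison concrete, here is a minimal sketch of the idea. to_uchar() has the same shape as the coreutils helper as I remember it; normalize_byte() and is_ebcdic_capital_a() are hypothetical names made up for illustration, and 0xC1 (193) is the code point quoted above for 'A'. This is not the macro from the patch.

```c
/* Same shape as the coreutils helper: the conversion to unsigned char
   happens on return, so no sign extension survives.  */
static inline unsigned char
to_uchar (char ch)
{
  return ch;
}

/* Hypothetical helper: fold an int that may hold either the
   sign-extended value (-63) or the unsigned value (193) of a byte
   into the single range 0..255.  Conversion to an unsigned type is
   well defined (modulo 256).  */
static inline unsigned char
normalize_byte (int c)
{
  return (unsigned char) c;
}

/* Hypothetical classifier: after normalization, one case label is
   enough to catch both representations of the byte 0xC1.  */
static int
is_ebcdic_capital_a (int c)
{
  switch (normalize_byte (c))
    {
    case 0xC1:
      return 1;
    default:
      return 0;
    }
}
```

With that in place, is_ebcdic_capital_a (-63) and is_ebcdic_capital_a (193) take the same branch, which is the property the switch idiom needs.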
> +++ lib/math.in.h
> * The system defines these functions as macros, and the compiler did not
> like seeing them redefined.
No underlying functions with linkage? POSIX generally requires that, so
you may want to submit a bug, but it's certainly not the first time
we've worked around that.
> +++ lib/regex.h
> * Ensure that "__string" does not expand to "1" when it is used as a
> formal parameter name.
Sounds like we shouldn't be naming our formal parameter __string, since
that's a name reserved to the internal implementation namespace.
> +++ m4/strstr.m4
> * The IBM runtime sucks; signal delivery is delayed until strstr()
> exits, so this test results in a hang that can only be SIGKILL'ed.
Not a hang, just a reallllllly long execution time; and all because the
libc implementation is O(n^2) instead of O(n). But they really block
signals during the call? Ouch.
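For reference, the kind of probe under discussion looks roughly like the sketch below. It is written from memory in the spirit of the m4 test, not copied from it; probe_strstr() and on_alarm() are hypothetical names, and the sizes are parameters so it can be tried with small inputs first. On a libc that really does defer signal delivery until strstr() returns, the alarm cannot cut the call short, which would match the behavior reported above.

```c
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Give up if the search takes longer than the alarm allows.  */
static void
on_alarm (int sig)
{
  (void) sig;
  _exit (1);
}

/* Search for m 'A's followed by 'B' inside 2*m 'A's followed by 'B'.
   A naive strstr does on the order of m*m comparisons here; a
   linear-time one does on the order of m.  Returns the offset of the
   match, or (size_t) -1 on allocation failure or no match.  */
static size_t
probe_strstr (size_t m, unsigned int timeout_s)
{
  char *haystack = malloc (2 * m + 2);
  char *needle = malloc (m + 2);
  size_t offset = (size_t) -1;

  if (haystack && needle)
    {
      memset (haystack, 'A', 2 * m);
      haystack[2 * m] = 'B';
      haystack[2 * m + 1] = '\0';
      memset (needle, 'A', m);
      needle[m] = 'B';
      needle[m + 1] = '\0';

      signal (SIGALRM, on_alarm);
      alarm (timeout_s);
      const char *hit = strstr (haystack, needle);
      alarm (0);                /* cancel the pending alarm */
      if (hit)
        offset = (size_t) (hit - haystack);
    }
  free (haystack);
  free (needle);
  return offset;
}
```

The only possible match starts at offset m, so a caller can check both correctness and (via the alarm) running time with a single call.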
> +++ tests/nan.h
> * z/OS, in addition to supporting IEEE floating-point, also supports an
> older "hexadecimal" format that does not support NaN. Bomb out if this
> is in use.
C, and POSIX, allow for platforms without NaN (in part because of cases
like the z/OS non-IEEE mode). I'm not surprised if we have baked in
assumptions that don't hold when IEEE is not around.
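One portable way to guard against that, sketched under the assumption of a C99 <math.h>: the NAN macro is defined only when the implementation supports quiet NaNs for float, so code can key off it instead of assuming IEEE. The function names here are hypothetical and this is not what tests/nan.h does.

```c
#include <math.h>

/* 1 if the implementation advertises quiet NaN support, else 0.  */
static int
have_quiet_nan (void)
{
#ifdef NAN
  return 1;
#else
  return 0;
#endif
}

/* 1 if a NaN compares unequal to itself, 0 if it does not,
   -1 if the platform has no NaN to test with.  */
static int
nan_compares_unequal (void)
{
#ifdef NAN
  volatile double x = NAN;      /* volatile: keep the compare at run time */
  return x != x;
#else
  return -1;
#endif
}
```

A test that depends on NaN could then skip itself when have_quiet_nan () returns 0 rather than failing to compile.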
> +++ tests/test-c-strcasecmp.c
> * In EBCDIC-1047, the tests
> ASSERT (c_strcasecmp ("turkish", "TURK\304\260SH") < 0);
> ASSERT (c_strcasecmp ("TURK\304\260SH", "turkish") > 0);
> are actually
> ASSERT (c_strcasecmp ("turkish", "TURKD¬SH") < 0);
> ASSERT (c_strcasecmp ("TURKD¬SH", "turkish") > 0);
> which, of course, fail.
Basically, EBCDIC lacks the Turkish i, and since it is not a UTF-8
locale, we should probably be skipping the test in that environment.
> +++ tests/test-canonicalize-lgpl.c
> * Addressed a strange z/OS corner case. This system has
> DOUBLE_SLASH_IS_DISTINCT_ROOT, yet the dev/ino numbers are the same.
What? Does that mean 'ls -a /' and 'ls -a //' see different contents?
If they do, then sharing dev/ino is a bug; if they are identical, then
DOUBLE_SLASH_IS_DISTINCT_ROOT is defined incorrectly.
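If it helps with checking, the dev/ino comparison can be reproduced outside the test suite with something like the sketch below. same_dev_ino() is a hypothetical helper, not a gnulib function; it only uses stat().

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Returns 1 if both paths resolve to the same device and inode,
   0 if they differ, and -1 if either stat() call fails.  */
static int
same_dev_ino (const char *a, const char *b)
{
  struct stat sa, sb;

  if (stat (a, &sa) != 0 || stat (b, &sb) != 0)
    return -1;
  return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Calling same_dev_ino ("/", "//") on the z/OS box, next to the two ls outputs, should show which of the two explanations applies.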
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org