bug-gnu-utils

Re: gawk mbstate_t problem on hppa2.0w-hp-hpux11.11


From: Michael Elizabeth Chastain
Subject: Re: gawk mbstate_t problem on hppa2.0w-hp-hpux11.11
Date: Tue, 9 Dec 2003 23:31:14 -0500 (EST)

[Let's try the right mailing list this time.  Oops.]

Hi Stepan,

In my last email I outlined two proposals:

(A) Require both mbrtowc and mbstate_t

  (A1) Add some documentation to README_d/README.hpux which says that the
       user can turn on _XOPEN_SOURCE=500 if they want.

  (A2) If _XOPEN_SOURCE=500 is on, then both mbrtowc and mbstate_t are
       available.  Enable multi-byte support.

  (A3) If _XOPEN_SOURCE=500 is off, then mbrtowc is available, but
       mbstate_t is not.  Disable multi-byte support.

(B) Require mbrtowc, but mbstate_t is optional

  (B1) Add some documentation to README_d/README.hpux which says that the
       user can turn on _XOPEN_SOURCE=500 if they want.

  (B2) If _XOPEN_SOURCE=500 is on, then both mbrtowc and mbstate_t are
       available.  Enable multi-byte support.

  (B3) If _XOPEN_SOURCE=500 is off, then mbrtowc is available, but
       mbstate_t is not.  Enable multi-byte support without using
       mbstate_t.  Then, whenever an mbstate_t pointer is needed,
       provide "0" as the pointer value.

GNU Readline actually does (B).  I liked (B) at first, but after spending
some time playing with it I decided that I don't like it any more.

The problem is that gawk really wants to use mbstate_t.  If there is no
mbstate_t on the platform, and I make a fake one, then I can get gawk to
compile.  But a bunch of logic will not run as designed.

Look at strncasecmpmbs for example:

  int
  strncasecmpmbs(const char *s1, mbstate_t mbs1,
                 const char *s2, mbstate_t mbs2, size_t n)
  {
    ...
    for (...) {
      ...
      mbclen1 = mbrtowc(&wc1, s1 + i1, n - i1, &mbs1);
      ...
      mbclen2 = mbrtowc(&wc2, s2 + i2, n - i2, &mbs2);
      ...
    }
    ...
  }

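This code will not work properly if mbs1 and mbs2 are nulled out,
because both calls to mbrtowc would then use the same builtin mbstate_t:
the two strings are decoded in alternation, so the shift state left by
one string would be applied to the other.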
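
To make the point concrete, here is a minimal sketch of interleaved
decoding with one conversion state per string.  It is not gawk's
strncasecmpmbs; the name casecmp_interleaved and its error handling
(treat a bad byte as a single character and reset that string's state)
are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper (not gawk code): case-insensitive comparison of
   two multibyte strings, decoded in lock step. */
static int
casecmp_interleaved(const char *s1, const char *s2)
{
    mbstate_t st1, st2;          /* one conversion state per string */
    size_t n1, n2, i1 = 0, i2 = 0;

    assert(s1 != NULL && s2 != NULL);
    n1 = strlen(s1);
    n2 = strlen(s2);
    memset(&st1, 0, sizeof st1); /* initial conversion state */
    memset(&st2, 0, sizeof st2);

    while (i1 < n1 && i2 < n2) {
        wchar_t wc1, wc2;
        size_t len1 = mbrtowc(&wc1, s1 + i1, n1 - i1, &st1);
        size_t len2 = mbrtowc(&wc2, s2 + i2, n2 - i2, &st2);

        /* Assumed policy: on an invalid or incomplete sequence, treat
           the byte as one character and reset that string's state. */
        if (len1 == (size_t)-1 || len1 == (size_t)-2 || len1 == 0) {
            wc1 = (unsigned char)s1[i1];
            len1 = 1;
            memset(&st1, 0, sizeof st1);
        }
        if (len2 == (size_t)-1 || len2 == (size_t)-2 || len2 == 0) {
            wc2 = (unsigned char)s2[i2];
            len2 = 1;
            memset(&st2, 0, sizeof st2);
        }

        wc1 = (wchar_t)towlower((wint_t)wc1);
        wc2 = (wchar_t)towlower((wint_t)wc2);
        if (wc1 != wc2)
            return wc1 < wc2 ? -1 : 1;

        i1 += len1;
        i2 += len2;
    }
    if (i1 < n1)
        return 1;                /* s1 is longer */
    if (i2 < n2)
        return -1;               /* s2 is longer */
    return 0;
}
```

If &st1 and &st2 were both replaced by a null pointer, the two mbrtowc
calls in the loop would alternate on a single internal state, which is
the failure described above for any encoding that carries shift state.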

I still believe it's legal under Single Unix Spec v3 for a platform to
define mbrtowc and not define mbstate_t.  And it's a fact that some
platforms actually do that, whether it's legal or not.  And I think that
if an application processes only one string at a time and does not need
mbstate_t, then it can run on a platform like that.  Such an application
can test HAVE_MBRTOWC and explicitly call mbrtowc(..., ..., ..., 0).
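
For contrast, a sketch of the kind of application I mean: it walks one
string at a time and never names mbstate_t, so the null state pointer is
enough.  The function count_chars and its choice to return (size_t)-1 on
a bad sequence are hypothetical, not taken from any existing program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count the characters in one multibyte string.
   Passing a null pointer as the last argument makes mbrtowc use its
   own internal conversion state, so no mbstate_t object is declared. */
static size_t
count_chars(const char *s)
{
    size_t n, i = 0, count = 0;

    assert(s != NULL);
    n = strlen(s);

    while (i < n) {
        wchar_t wc;
        size_t len = mbrtowc(&wc, s + i, n - i, NULL);

        /* Assumed policy: give up on an invalid or incomplete sequence. */
        if (len == (size_t)-1 || len == (size_t)-2 || len == 0)
            return (size_t)-1;

        i += len;
        count++;
    }
    return count;
}
```

This only stays correct as long as nothing else in the program decodes a
second string through the same internal state in between calls.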

But gawk is not such an application.

So my plan is to change this code:

  #if defined(HAVE_MBRLEN) && defined(HAVE_MBRTOWC) \
      && defined(HAVE_WCHAR_H) && defined(HAVE_CTYPE_H)
  /* We can handle multibyte strings.  */
  #define MBS_SUPPORT
  #include <wchar.h>
  #include <wctype.h>
  #endif

To this:

  #if defined(HAVE_MBRLEN) && defined(HAVE_MBRTOWC) \
      && defined(HAVE_MBSTATE_T) \
      && defined(HAVE_WCHAR_H) && defined(HAVE_CTYPE_H)
  ...
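
As a rough sketch of how the stricter guard would play out, the
fragment below compiles the multibyte path only when all five macros
are defined and otherwise falls back to plain bytes.  It assumes the
HAVE_* macros come from a configure-generated header; text_length is a
hypothetical helper, not a gawk function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the HAVE_* macros would normally come from config.h. */
#if defined(HAVE_MBRLEN) && defined(HAVE_MBRTOWC) \
    && defined(HAVE_MBSTATE_T) \
    && defined(HAVE_WCHAR_H) && defined(HAVE_CTYPE_H)
/* We can handle multibyte strings.  */
#define MBS_SUPPORT
#include <wchar.h>
#include <wctype.h>
#endif

/* Hypothetical helper: length in characters when MBS_SUPPORT is on,
   length in bytes otherwise. */
static size_t
text_length(const char *s)
{
    assert(s != NULL);
#ifdef MBS_SUPPORT
    {
        mbstate_t st;
        size_t n = strlen(s), i = 0, count = 0;

        memset(&st, 0, sizeof st);
        while (i < n) {
            size_t len = mbrlen(s + i, n - i, &st);

            /* Assumed policy: count a bad byte as one character. */
            if (len == (size_t)-1 || len == (size_t)-2 || len == 0) {
                len = 1;
                memset(&st, 0, sizeof st);
            }
            i += len;
            count++;
        }
        return count;
    }
#else
    return strlen(s);            /* single-byte fallback */
#endif
}
```

On hpux 11 without -D_XOPEN_SOURCE=500, HAVE_MBSTATE_T would be
undefined, so only the fallback branch would be compiled.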

Then I have to test on hpux 11 with a variety of compilers and
with and without -D_XOPEN_SOURCE=500.  And then write some
documentation.

How does that sound?

Michael C



