bug-grep

Re: dfa - gawk matching problem on windows and suggested fix


From: Jim Meyering
Subject: Re: dfa - gawk matching problem on windows and suggested fix
Date: Mon, 03 Oct 2011 13:27:12 +0200

Eli Zaretskii wrote:
>> From: Jim Meyering <address@hidden>
>> Cc: address@hidden,  address@hidden
>> Date: Mon, 03 Oct 2011 10:17:44 +0200
>>
>> Eli Zaretskii wrote:
>>
>> >> From: Jim Meyering <address@hidden>
>> >> Cc: address@hidden,  address@hidden
>> >> Date: Sun, 02 Oct 2011 21:45:01 +0200
>> >>
>> >> Eli, can you confirm that this also solves the problem
>> >
>> > No, it doesn't.  The branch of the code that calls wctob was where the
>> > trouble was happening to begin with.  The patch below, which still
>> > goes through a temporary `unsigned char' variable, does work.
>>
>> Can you explain or demonstrate how wctob's "int" return
>> value was inappropriately sign-extended?
>
> I get a negative value for 0x95 from `lex'.  An explicit `fprintf'
> after this line:
>
>             (c) = wctob(wc);
>
> shows that the value of `c' is -107.  The value returned by wctob, if
> printed using %d is -107, and if printed with %x, shows as 0xffffff95.

That shows the problem is with the Windows wctob implementation.
What if you include something like this just above?
(this is part of gnulib's wctob replacement, lib/wctob.c)

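#include <stdio.h>   /* EOF */
#include <stdlib.h>  /* abort, MB_CUR_MAX, wctomb */
#include <wchar.h>   /* wint_t, wchar_t */

#define wctob rpl_wctob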

int
wctob (wint_t wc)
{
  char buf[64];

  if (!(MB_CUR_MAX <= sizeof (buf)))
    abort ();
  /* Handle the case where WEOF is a value that does not fit in a wchar_t.  */
  if (wc == (wchar_t)wc)
    if (wctomb (buf, (wchar_t)wc) == 1)
      return (unsigned char) buf[0];
  return EOF;
}
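
For illustration only, here is a minimal sketch of why the cast to
unsigned char in that replacement matters.  It assumes 8-bit bytes,
two's complement and a 32-bit int.  The function names are hypothetical,
and the idea that the Windows wctob widens the byte through a signed
char is a guess based on the -107 / 0xffffff95 values reported above,
not something confirmed in this thread.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the misbehaving path: the byte is read back
   through a signed char before being widened to int.  */
static int
widen_via_signed_char (unsigned char byte)
{
  signed char sc;
  memcpy (&sc, &byte, 1);   /* reinterpret the bits; 0x95 reads as -107 */
  return sc;                /* sign-extended: 0xffffff95 with 32-bit int */
}

/* The approach used by the replacement above: cast to unsigned char
   first, so the int result is always in the range 0..UCHAR_MAX.  */
static int
widen_via_unsigned_char (char byte)
{
  return (unsigned char) byte;
}

/* Small self-check of both paths for the byte 0x95.  */
static void
check_widening (void)
{
  assert (widen_via_signed_char (0x95) == -107);
  assert (widen_via_unsigned_char ('\x95') == 0x95);
}
```

With the unsigned char cast, a valid single-byte result can never be
negative, so it can never be confused with EOF.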


> I would be happy to provide more details, but please tell me what
> you want to know.
>
>> > -            (c) = wctob(wc);                      \
>> > +            uc = (unsigned) wctob(wc);            \
>> > +            (c) = uc;                             \
>>
>> If that works for you, then you must not be
>> testing with anything that would set C to \xff.
>>
>> Using that code would truncate wctob's "int" result to "char" width,
>> and thus make it impossible to distinguish between a result of 0xff and EOF.
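
A small sketch of that collision, assuming EOF is -1 and bytes are
8 bits wide (both typical, neither guaranteed by the C standard).
narrow_like_patch is a hypothetical name that only mirrors the two
assignments in the quoted patch; it is not code from dfa.c.

```c
#include <assert.h>
#include <stdio.h>   /* EOF */

/* Mirror of the patch: squeeze wctob's int result into an unsigned
   char, then hand it back as an int.  */
static int
narrow_like_patch (int wctob_result)
{
  unsigned char uc = (unsigned) wctob_result;
  return uc;
}

/* The sign-extended 0x95 is repaired, but EOF and the byte 0xff
   both come out as 0xff.  */
static void
check_narrowing (void)
{
  assert (narrow_like_patch (-107) == 0x95);
  assert (narrow_like_patch (0xff) == 0xff);
  assert (narrow_like_patch (EOF) == 0xff);
}
```

After narrowing, the caller sees 0xff in both cases and has no way to
tell "no single-byte representation" from a real \xff byte.
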
>
> It should be easy to test whether the return value of wctob is -1, and
> only coerce the other values to unsigned char.  Would that DTRT?
>
> As I said: I'm not an expert on these issues, so perhaps I'm missing
> something.  If you could guide me what to try, I'm quite sure we will
> find a good solution to this problem.
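
The "test for -1 first" idea quoted above could look roughly like the
following sketch.  byte_or_eof is a hypothetical helper, not part of
dfa.c or gnulib, and the check assumes 8-bit bytes.

```c
#include <assert.h>
#include <stdio.h>   /* EOF */

/* Keep EOF as is; coerce every other result to the range
   0..UCHAR_MAX.  */
static int
byte_or_eof (int wctob_result)
{
  if (wctob_result == EOF)
    return EOF;
  return (unsigned char) wctob_result;
}

/* EOF survives, a sign-extended 0x95 is repaired, and an ordinary
   ASCII byte is untouched.  */
static void
check_byte_or_eof (void)
{
  assert (byte_or_eof (EOF) == EOF);
  assert (byte_or_eof (-107) == 0x95);
  assert (byte_or_eof (0x41) == 0x41);
}
```

One limit of this approach: if the underlying wctob sign-extends its
result, then where EOF is -1 the byte 0xff already arrives as -1, so it
is indistinguishable from EOF before the helper ever sees it.  Only a
wctob that returns the byte as an unsigned char value avoids that.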


