Re: new module 'c-strcase'
From: Bruno Haible
Subject: Re: new module 'c-strcase'
Date: Tue, 11 Oct 2005 14:44:45 +0200
User-agent: KMail/1.5
Paul Eggert wrote:
> > More precisely, one of the string arguments must be an ASCII string;
> > the other one can also contain non-ASCII characters (but then the
> > comparison result will be nonzero).
>
> Why is this restriction needed?
It is needed to guarantee that the result is equivalent to the comparison
result in the C locale. On a system where the C locale has UTF-8 encoding,
    c_strcasecmp ("François", "FRANÇOIS") != 0
although
    setlocale (LC_ALL, "C");
    strcasecmp ("François", "FRANÇOIS") == 0.
> Doesn't the code simply
> compare bytes after converting 'A'-'Z' to 'a'-'z'? In that case,
> it is not really required that one argument must be an ASCII string;
> both strings can be non-ASCII but the result is still well-defined.
The result is then well-defined but not related to the behaviour of
the C locale on such systems, and the name of the module would be a
misnomer :-)
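To make the byte-wise behaviour concrete, here is a minimal sketch of a comparison that folds only 'A'-'Z' to 'a'-'z' and compares all other bytes unchanged. The names ascii_tolower and ascii_casecmp are hypothetical and this is not the module's actual code; it also assumes an ASCII-compatible execution character set.

```c
/* Hypothetical helper: fold only the ASCII letters 'A'..'Z'.
   Assumes an ASCII-compatible execution character set.  */
static int
ascii_tolower (int c)
{
  return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

/* Hypothetical sketch of an ASCII-only case-insensitive comparison.
   Bytes outside 'A'..'Z' (including the bytes of multibyte UTF-8
   characters) are compared as they are.  */
int
ascii_casecmp (const char *s1, const char *s2)
{
  const unsigned char *p1 = (const unsigned char *) s1;
  const unsigned char *p2 = (const unsigned char *) s2;
  unsigned char c1, c2;

  if (p1 == p2)
    return 0;

  do
    {
      c1 = ascii_tolower (*p1++);
      c2 = ascii_tolower (*p2++);
      if (c1 == '\0')
        break;
    }
  while (c1 == c2);

  /* Return only the sign, which avoids any subtraction overflow.  */
  return (c1 > c2) - (c1 < c2);
}
```

With UTF-8 input, "ç" is the byte pair C3 A7 and "Ç" is C3 87, so such a function reports the two spellings of "François" as different, while "abc" and "ABC" compare equal.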
> > return c1 - c2;
>
> A nit: in theory this could result in integer overflow.
> The following would be portable to machines where char == int.
>
> return UCHAR_MAX <= INT_MAX ? c1 - c2 : c1 < c2 ? -1 : c1 > c2;
>
> Such machines do exist. They are unlikely targets for big GNU
> apps but are potential targets for this module.
OK, fixed. But just for info, what are these machines? The 10-year-old
CRAY?
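In isolation, the final step of the comparison can be sketched as below. The function name uchar_compare is hypothetical and only serves the illustration; the logic follows the fix in the patch.

```c
#include <limits.h>

/* Hypothetical stand-alone version of the final comparison step.
   Returns a negative, zero, or positive value depending on whether
   c1 is less than, equal to, or greater than c2.  */
int
uchar_compare (unsigned char c1, unsigned char c2)
{
  if (UCHAR_MAX <= INT_MAX)
    /* Both operands promote to 'int', so the difference always fits.  */
    return c1 - c2;
  else
    /* 'char' and 'int' have the same size: the difference may not be
       representable in an 'int', so return only its sign.  */
    return (c1 > c2 ? 1 : c1 < c2 ? -1 : 0);
}
```

On usual platforms the condition is a compile-time constant, so a compiler can be expected to drop the unused branch.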
Bruno
2005-10-11 Bruno Haible <address@hidden>
* strcasecmp.c: Include limits.h.
(strcasecmp): Avoid integer overflow on exotic platforms.
* strncasecmp.c: Include limits.h.
(strncasecmp): Avoid integer overflow on exotic platforms.
Reported by Paul Eggert.
diff -c -3 -r1.10 strcasecmp.c
*** strcasecmp.c 17 Aug 2005 14:01:07 -0000 1.10
--- strcasecmp.c 11 Oct 2005 12:47:19 -0000
***************
*** 25,30 ****
--- 25,31 ----
#include "strcase.h"
#include <ctype.h>
+ #include <limits.h>
#if HAVE_MBRTOWC
# include "mbuiter.h"
***************
*** 93,98 ****
}
while (c1 == c2);
! return c1 - c2;
}
}
--- 94,105 ----
}
while (c1 == c2);
! if (UCHAR_MAX <= INT_MAX)
! return c1 - c2;
! else
! /* On machines where 'char' and 'int' are types of the same size, the
! difference of two 'unsigned char' values - including the sign bit -
! doesn't fit in an 'int'. */
! return (c1 > c2 ? 1 : c1 < c2 ? -1 : 0);
}
}
diff -c -3 -r1.6 strncasecmp.c
*** strncasecmp.c 19 Sep 2005 17:28:15 -0000 1.6
--- strncasecmp.c 11 Oct 2005 12:47:19 -0000
***************
*** 1,5 ****
/* strncasecmp.c -- case insensitive string comparator
! Copyright (C) 1998, 1999 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
--- 1,5 ----
/* strncasecmp.c -- case insensitive string comparator
! Copyright (C) 1998, 1999, 2005 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
***************
*** 23,28 ****
--- 23,29 ----
#include "strcase.h"
#include <ctype.h>
+ #include <limits.h>
#define TOLOWER(Ch) (isupper (Ch) ? tolower (Ch) : (Ch))
***************
*** 54,58 ****
}
while (c1 == c2);
! return c1 - c2;
}
--- 55,65 ----
}
while (c1 == c2);
! if (UCHAR_MAX <= INT_MAX)
! return c1 - c2;
! else
! /* On machines where 'char' and 'int' are types of the same size, the
! difference of two 'unsigned char' values - including the sign bit -
! doesn't fit in an 'int'. */
! return (c1 > c2 ? 1 : c1 < c2 ? -1 : 0);
}