Re: strstr, strcase, strcasestr, and i18n
From: Bruno Haible
Subject: Re: strstr, strcase, strcasestr, and i18n
Date: Sun, 4 Feb 2007 15:28:40 +0100
User-agent: KMail/1.5.4
Paul Eggert wrote:
> > - strstr: This function's behaviour is not clearly defined. POSIX says
> > that it compares a "string" with a "sequence of bytes". Which a priori
> > is nonsense, since the elements of strings are characters.
>
> No, elements of "character strings" are characters. Elements of "strings"
> are bytes. See:
>
> http://www.opengroup.org/susv3/basedefs/xbd_chap03.html#tag_03_92
> http://www.opengroup.org/susv3/basedefs/xbd_chap03.html#tag_03_367
It's hard to know POSIX as well as you do :-)
> So strstr's behavior is clearly defined: it operates on strings (i.e.,
> byte strings), not character strings.
Indeed. And strstr cannot be specified to consider "character strings",
without breaking backward compatibility :-(
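To illustrate why the byte-oriented definition matters, here is a minimal sketch, assuming a Shift_JIS encoded haystack. In Shift_JIS the katakana character "ソ" is encoded as the two bytes 0x83 0x5C, and 0x5C is also the ASCII backslash. The helper name byte_match_offset and the constant sjis_so are made up for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration: returns the byte offset at which
   strstr finds NEEDLE in HAYSTACK, or -1 if there is no match.  */
static long byte_match_offset(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);
    const char *hit = strstr(haystack, needle);
    return hit ? (long)(hit - haystack) : -1;
}

/* "ソ" in Shift_JIS: a single character made of two bytes, 0x83 0x5C.
   The second byte has the same value as ASCII '\\'.  */
static const char sjis_so[] = "\x83\x5C";
```

With these definitions, byte_match_offset(sjis_so, "\\") yields 1: strstr reports a match in the middle of the character, although the string contains no backslash character at all. That is correct for a byte string, and wrong for a character string.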
> > It was tempting to make a clear API nomenclature: c-str* for the C locale
> > emulation, str* for the internationalized functions. But if you're right
> > with strstr, then we should find new names for the internationalized
> > versions of these functions.
>
> I think we have to find new names, yes.
Yup. It appears that Microsoft did their homework regarding str* functions
and multibyte strings, while the ISO C and POSIX communities didn't. I'll be
adding the following functions to gnulib, attempting to fix the hole that
ISO C and POSIX left.
  mbschr      like strchr
  mbsrchr     like strrchr
  mbsstr      like strstr
  mbscasecmp  like strcasecmp
  mbscasestr  like strcasestr
  mbscspn     like strcspn
  mbspbrk     like strpbrk
  mbsspn      like strspn
  mbstok_r    like strtok_r
The prefix "mbs" follows the precedent of "mbswidth" in gnulib and of
"mbspbrk" and "mbsrchr" on HP-UX.
It does not conflict with the Microsoft names, since Microsoft uses "_mbs".
The functions nevertheless have the same calling convention as Microsoft's
functions, except that MS uses 'unsigned char *' as the multibyte string type.
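As a rough idea of what "like strchr, but for multibyte strings" can mean, here is a minimal sketch built only on the standard mbrlen function. It is not the gnulib implementation; the name sketch_mbschr is hypothetical, and the sketch assumes that the character searched for is a non-NUL single-byte (ASCII) character and that the current locale's encoding is the one the string is in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical sketch, not gnulib code: find the first occurrence of the
   single-byte character C in the multibyte string S, stepping over whole
   characters so that trailing bytes of a multibyte character never match.
   Returns NULL if C does not occur.  */
static const char *sketch_mbschr(const char *s, char c)
{
    assert(s != NULL);
    mbstate_t state;
    memset(&state, 0, sizeof state);
    size_t remaining = strlen(s);

    while (remaining > 0) {
        /* Length in bytes of the character starting at S.  */
        size_t n = mbrlen(s, remaining, &state);
        if (n == (size_t)-1 || n == (size_t)-2 || n == 0) {
            /* Invalid or incomplete sequence: treat this byte as one
               character and restart from the initial conversion state.  */
            n = 1;
            memset(&state, 0, sizeof state);
        }
        if (n == 1 && *s == c)
            return s;
        s += n;
        remaining -= n;
    }
    return NULL;
}
```

In a locale whose encoding is Shift_JIS (assuming the platform provides one and setlocale has been called), a search for '\\' with such a function would step over the two bytes 0x83 0x5C as one character instead of matching the second byte. In the "C" locale with ASCII input it behaves like strchr for non-NUL characters.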
Bruno