Re: grep internationalization
From: Paul Eggert
Subject: Re: grep internationalization
Date: Tue, 23 Jan 2001 19:01:53 -0800 (PST)
> From: Isamu Hasegawa <address@hidden>
> Date: Wed, 24 Jan 2001 11:28:44 +0900 (JST)
>
> However our patch can not handle Equivalence class correctly,
> because POSIX don't supply the function for equivalence class.
Sure it does: it's called 'regcomp' and 'regexec'. :-) Actually, I'm
only half kidding here, as it should be possible to use the system
regcomp and regexec to come up with a compatible matcher. It would be
some work, though.
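
As a rough illustration of the regcomp/regexec idea above, here is a minimal sketch that asks the system matcher whether a string falls in an equivalence class. The helper name matches_equiv_class is hypothetical and is not part of grep or of the patch under discussion. Some C libraries reject the [[=x=]] syntax outright, so the helper reports that case separately instead of assuming support.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.
   Returns 1 if SUBJECT as a whole matches the equivalence class of
   CLASS_CHAR, 0 if it does not, and -1 if the system regcomp rejects
   the pattern (some libraries do not implement equivalence classes). */
int matches_equiv_class(char class_char, const char *subject)
{
    assert(subject != NULL);

    /* Build the anchored BRE "^[[=x=]]$"; index 4 is the placeholder. */
    char pattern[] = "^[[=?=]]$";
    pattern[4] = class_char;

    regex_t re;
    if (regcomp(&re, pattern, REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, subject, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

A caller would normally run setlocale(LC_ALL, "") first so that the matcher uses the collation data of the user's locale; without that call the program runs in the "C" locale, where each character is equivalent only to itself.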
Does your patch handle multicharacter collating elements, e.g. "aa" in
Danish?
> However our regex.c is derived from glibc, so these test
> cases also fail when I use the regex.c from glibc-2.2.
That is a bug in glibc 2.2 regex.c. One of these days we need to fix
glibc 2.2. In the meantime we need to use a patched regex.c. The
latest test version of grep checks to see whether the library regex.c
is buggy, and if so, uses grep's regex.c.
> echo '{' | grep -e '\{'
> '\{' is invalid basic regular expression, isn't it?
No, it's not invalid. It has undefined behavior, and this means that
a portable POSIX application cannot rely on '\{' producing an error.
GNU grep defines the BRE '\{' to have the same meaning as '{'; this is
compatible with traditional BSD behavior. More importantly, GNU egrep
defines the ERE '{' to have the same meaning as '\{'.
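
Because the result of '\{' is undefined, a small probe against the system regcomp can show what a given library actually does. The sketch below is only an illustration: probe_regex is a hypothetical name, and the outcome for '\{' should be expected to differ between libraries.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical probe, for illustration only.
   Compiles PATTERN with the given regcomp flags and tries it on SUBJECT.
   Returns 1 on a match, 0 on no match, -1 if regcomp rejects PATTERN. */
int probe_regex(const char *pattern, int cflags, const char *subject)
{
    assert(pattern != NULL && subject != NULL);

    regex_t re;
    if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, subject, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Calling probe_regex("\\{", 0, "{") tests the BRE in question; depending on the library it may return 1 (treated as a literal brace) or -1 (rejected), and a portable program cannot count on either. A bracket expression avoids the problem: probe_regex("[{]", 0, "{") and probe_regex("[{]", REG_EXTENDED, "{") both return 1, since '{' has no special meaning inside brackets.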
> Should we fix these inconvenience?
Yes, that sounds like a good idea.
Two more suggestions:
* Adapt your patch to the latest published test version of GNU grep,
at <ftp://alpha.gnu.org/gnu/grep/grep-2.5a.tar.gz>.
* Make sure your patch is compatible with the latest draft (Draft 5)
of the revised POSIX standard, which you can get from
<http://www.opengroup.org/austin/>. In particular, the semantics of
regular expressions have been loosened a bit in Draft 5, which will help
you somewhat. However, I should warn you that these changes are
still a bit controversial and they are not final.