bug#18454: Improve performance when -P (PCRE) is used in UTF-8 locales
From: Norihiro Tanaka
Subject: bug#18454: Improve performance when -P (PCRE) is used in UTF-8 locales
Date: Sat, 20 Dec 2014 11:57:39 +0900
On Fri, 19 Dec 2014 18:31:05 -0800
Paul Eggert <address@hidden> wrote:
> If mbrlen does the right thing, grep and sed should do the right thing.
mbrlen() already does the right thing. So perhaps they depend on the
behavior of regex. Even if so, I think that they should also be fixed
in the C library.
cat <<EOF |
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>
#include <locale.h>
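int
main (void)
{
  setlocale (LC_ALL, "");
  mbstate_t mbs = { 0 };
  /* UTF-8-style encoding of the surrogate U+D83F, which is invalid.  */
  char s[] = { '\xED', '\xA0', '\xBF' };
  size_t len = mbrlen (s, 3, &mbs);
  /* Cast so that (size_t) -1, the error return, prints as -1.  */
  printf ("mbrlen = %d\n", (int) len);
  exit (EXIT_SUCCESS);
}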
EOF
gcc -xc - && ./a.out