Re: [PATCH 1/5] maint: ensure that MB_CUR_MAX is defined even when !MBS_SUPPORT
Fri, 16 Sep 2011 15:12:37 +0200
On 09/16/2011 03:03 PM, address@hidden wrote:
Please remember that dfa.[ch] are shared code with gawk and I think
also gettext (although I don't know how up to date gettext's version is).
I'd really prefer not to have too many GREP_xxx kinds of things in those
files. (It's ok in the rest of grep, of course.:-)
We could separate the variables for dfa and the rest of grep. Grep just
needs "#define DFA_MB_CUR_MAX GREP_MB_CUR_MAX" then (and you can
similarly "#define DFA_MB_CUR_MAX gawk_mb_cur_max" in gawk).
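A minimal sketch of that split, to make the idea concrete. The file layout and the helper dfa_is_multibyte() are hypothetical, and defaulting MBS_SUPPORT to 1 is only an assumption so that the fragment compiles on its own (in a real build configure would set it):

```c
#include <assert.h>
#include <stdlib.h>   /* MB_CUR_MAX */

/* Assumption: normally defined by the build system, not here. */
#ifndef MBS_SUPPORT
# define MBS_SUPPORT 1
#endif

/* grep side: grep's own name for the value. */
#if MBS_SUPPORT
# define GREP_MB_CUR_MAX MB_CUR_MAX
#else
# define GREP_MB_CUR_MAX 1   /* single-byte build: a compile-time constant */
#endif

/* Glue, also on the grep side.  gawk would instead map the DFA_ name
   onto its own variable (gawk_mb_cur_max in the text above). */
#define DFA_MB_CUR_MAX GREP_MB_CUR_MAX

/* Shared dfa code: only the DFA_ name appears, never a GREP_ one.
   dfa_is_multibyte is a hypothetical helper for illustration. */
static int dfa_is_multibyte(void)
{
    assert(DFA_MB_CUR_MAX >= 1);
    return DFA_MB_CUR_MAX > 1;
}
```

With this arrangement dfa.c would stay free of GREP_xxx names, and each client decides what DFA_MB_CUR_MAX expands to.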
For what it's worth, MB_CUR_MAX is a function call in GLIBC. There were
some cases in gawk where I was losing a noticeable amount of time calling
it a lot. So I set up a global variable gawk_mb_cur_max and initialize
it in main(), since the result should never change during a single run of
the program. It made a difference.
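The caching approach described above can be sketched roughly as follows. This is not gawk's actual code: the names cached_mb_cur_max, init_mb_cur_max() and in_multibyte_locale() are made up for illustration, and the sketch assumes the locale is set exactly once at startup.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>   /* MB_CUR_MAX */

/* Hypothetical cache, modelled on the gawk_mb_cur_max idea. */
static size_t cached_mb_cur_max = 1;

/* Call once, early in main(): select the locale first, then read
   MB_CUR_MAX a single time.  Assumes the locale never changes later. */
static void init_mb_cur_max(void)
{
    setlocale(LC_ALL, "");
    cached_mb_cur_max = MB_CUR_MAX;
    assert(cached_mb_cur_max >= 1);
}

/* Hot-path code reads the plain variable instead of expanding
   MB_CUR_MAX (which may be a function call) on every use. */
static int in_multibyte_locale(void)
{
    return cached_mb_cur_max > 1;
}
```

The cache is only valid as long as nothing calls setlocale() again after init_mb_cur_max(); a program that switches locales at run time would have to refresh it.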
Interesting. We do have a field for mb_cur_max in dfaexec, but it is
there because some UTF-8 regexes can be run as if the locale were
single-byte. I suspect however that awk programs (especially badly written
ones!) do more regex compilation than grep, up to 1 compilation per
match. For grep it shouldn't really matter.
Having variables grep_mb_cur_max and dfa_mb_cur_max (separate for the
reasons Arnold explained) would work, but it would make it impossible
for the compiler to throw away the multibyte code when MBS_SUPPORT is zero.
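To illustrate the concern, here is a small sketch of the macro form. count_chars() is a hypothetical function, not something from grep or dfa.c, and MBS_SUPPORT is defaulted to 0 here only to show the single-byte case:

```c
#include <assert.h>
#include <stdlib.h>   /* MB_CUR_MAX, mblen */
#include <string.h>   /* strlen */

/* Assumption: normally set by configure; 0 shows the single-byte build. */
#ifndef MBS_SUPPORT
# define MBS_SUPPORT 0
#endif

#if MBS_SUPPORT
# define GREP_MB_CUR_MAX MB_CUR_MAX
#else
# define GREP_MB_CUR_MAX 1
#endif

/* Hypothetical helper: count characters rather than bytes. */
static size_t count_chars(const char *s)
{
    assert(s != NULL);
    if (GREP_MB_CUR_MAX > 1) {
        /* Multibyte path.  When MBS_SUPPORT is 0 the test above is the
           constant expression 1 > 1, so an optimizing compiler can
           discard this whole block.  With a run-time variable such as
           grep_mb_cur_max the block would have to stay. */
        size_t n = 0;
        mblen(NULL, 0);                    /* reset the shift state */
        while (*s) {
            int len = mblen(s, MB_CUR_MAX);
            s += len > 0 ? len : 1;        /* skip one byte on error */
            n++;
        }
        return n;
    }
    return strlen(s);                      /* single-byte path */
}
```

So the macro keeps the dead-code elimination that a variable would lose, at the cost of evaluating MB_CUR_MAX at each use in builds where MBS_SUPPORT is nonzero.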