bug#16232: [PATCH] grep: make --ignore-case (-i) faster (sometimes 10x)

bug-grep



From:	Pádraig Brady
Subject:	bug#16232: [PATCH] grep: make --ignore-case (-i) faster (sometimes 10x) in multibyte locales
Date:	Sat, 11 Jan 2014 01:49:30 +0000
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

Cool, so it does this transformation:

  sed 's/./[\L&\U&]/g'
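
For illustration, here is a minimal C sketch of the same per-character
expansion, restricted to single-byte (ASCII) input.  The helper name
expand_icase is hypothetical and is not taken from the patch; it also
makes no attempt to escape characters such as ']' or '^' that are special
inside a bracket expression.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not from the grep patch: rewrite each byte X of
   PATTERN as the two-member bracket expression "[xX]", mirroring
   sed 's/./[\L&\U&]/g' for single-byte input.
   Returns 0 on success, -1 if OUT (of size OUTSIZE) is too small.  */
static int
expand_icase (const char *pattern, char *out, size_t outsize)
{
  size_t n = 0;

  if (outsize == 0)
    return -1;

  for (const char *p = pattern; *p; p++)
    {
      unsigned char c = (unsigned char) *p;

      /* Each input byte needs 4 output bytes, plus room for the NUL.  */
      if (outsize - n < 5)
        return -1;

      out[n++] = '[';
      out[n++] = (char) tolower (c);
      out[n++] = (char) toupper (c);
      out[n++] = ']';
    }

  out[n] = '\0';
  return 0;
}
```

With this sketch, the input "Ab1" would become "[aA][bB][11]", which is
what the sed command produces for the same input.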

Multibyte case handling has all sorts of edge cases, though (pardon the pun),
and it may not always be valid to treat each character independently.
For example see some of the tests in:
http://git.sv.gnu.org/gitweb/?p=gnulib.git;a=blob;f=tests/unicase/test-ulc-casecmp.c;hb=HEAD

I wonder whether this faster path might be restricted to a safer but very
common input subset:

(MB_CUR_MAX == 1 || (in_utf8 && *c < 0x80))
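
A rough sketch of how that guard might look as a standalone check, applied
to every byte of the pattern.  The function name fast_path_ok and its
parameters are assumptions for illustration only; a real caller would
presumably pass MB_CUR_MAX from <stdlib.h> and derive in_utf8 from the
current locale.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical predicate, not from the grep patch: return true if PATTERN
   is eligible for the fast path, i.e. the locale is single-byte, or the
   locale is UTF-8 and every byte of PATTERN is ASCII (below 0x80).
   MB_CUR_MAX_VALUE and IN_UTF8 are passed in so the check does not depend
   on the process locale.  */
static bool
fast_path_ok (const char *pattern, size_t mb_cur_max_value, bool in_utf8)
{
  if (mb_cur_max_value == 1)
    return true;

  if (!in_utf8)
    return false;

  for (const unsigned char *c = (const unsigned char *) pattern; *c; c++)
    if (*c >= 0x80)
      return false;

  return true;
}
```

Any pattern that fails the check would simply take the existing, slower
code path.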

Also, are the following printfs in the test redundant?

> +data=$(      printf "I:$I $i:i")
> +search_str=$(printf "$i:i I:$I")

Nice improvement!
Pádraig.


