Re: grep is horriby slow in UTF-8 locales

bug-gnu-utils

From:	Glenn Maynard
Subject:	Re: grep is horriby slow in UTF-8 locales
Date:	Sat, 8 Nov 2003 15:58:15 -0500
User-agent:	Mutt/1.5.4i

On Fri, Nov 07, 2003 at 04:49:58PM +0100, Danilo Segan wrote:
> This doesn't happen with:
> 
> $ grep --version
> grep (GNU grep) 2.4.2

That version probably predates grep's full multibyte support; the
slowdown here only occurs in multibyte encodings.  (My grep is slow
in en_US.UTF-8 and fast in en_US.ISO-8859-1.) Try:

# echo tést | grep 't.st'
tést
# echo tést | grep 't[aé]st'
tést
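
For contrast, here is a rough sketch of the same test run byte-wise.  It
assumes a grep that treats each byte as one character in the C locale
(GNU grep does); the helper names sample, c_locale_count and
utf8_locale_count are made up for illustration, and the en_US.UTF-8
locale may not be installed on your system.

```sh
# "tést" spelled out as UTF-8 bytes: t, 0xC3 0xA9 (é), s, t.
sample() { printf 't\303\251st\n'; }

# Count matching lines in the C locale, where "." consumes a single byte.
# "|| true" hides grep's non-zero exit status when nothing matches.
c_locale_count() { sample | LC_ALL=C grep -c "$1" || true; }

# Same count in a UTF-8 locale (assumed to be installed), where a
# multibyte-aware grep lets "." consume the whole two-byte é.
utf8_locale_count() { sample | LC_ALL=en_US.UTF-8 grep -c "$1" || true; }

c_locale_count 't.st'     # 0: one "." cannot span the two bytes of é
c_locale_count 't..st'    # 1: two "." cover the two bytes
utf8_locale_count 't.st'  # 1 if the locale exists and grep is multibyte-aware
```

If the last command prints 0, either the locale is missing or that grep
has no multibyte support, which would also explain why it is not slow.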

> $ LC_ALL=POSIX time grep XYZ test.txt 
> Command exited with non-zero status 1
> 0.04user 0.06system 0:00.10elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (118major+25minor)pagefaults 0swaps
> 
> Last example shows that CPU usage is not really any kind of rule to
> base conclusions on (sr_CS.UTF-8 is my everyday locale, and I would
> really notice if grep had any problems with it).

The field you should be reading is "user".  "CPU" is roughly
(user+system)/elapsed, and isn't very relevant here.
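
As a quick check of that relation against the numbers quoted above
(0.04 user, 0.06 system, 0.10 elapsed), here is a small sketch.  The
cpu_percent helper is hypothetical, not part of time(1), and the formula
is only the approximation given above.

```sh
# Hypothetical helper: approximate the "CPU" field of time(1) from
# user, system and elapsed seconds as (user + system) / elapsed.
cpu_percent() {
    awk -v u="$1" -v s="$2" -v e="$3" \
        'BEGIN { printf "%.0f%%\n", (u + s) / e * 100 }'
}

cpu_percent 0.04 0.06 0.10   # prints 100%
```

A run that spends most of its elapsed time in "user" is the CPU-bound
case this thread is about; the percentage alone does not show that.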

-- 
Glenn Maynard
