bug-gnu-utils

Re: grep is horriby slow in UTF-8 locales


From: Glenn Maynard
Subject: Re: grep is horriby slow in UTF-8 locales
Date: Sat, 8 Nov 2003 15:58:15 -0500
User-agent: Mutt/1.5.4i

On Fri, Nov 07, 2003 at 04:49:58PM +0100, Danilo Segan wrote:
> This doesn't happen with:
> 
> $ grep --version
> grep (GNU grep) 2.4.2

That version probably predates full multibyte support in grep; the
slowdown only appears in multibyte encodings.  (My grep is slow in
en_US.UTF-8, and fast in en_US.ISO-8859-1.)  To check whether your grep
is multibyte-aware, try these in a UTF-8 locale; both should match:

# echo tést | grep 't.st'
tést
# echo tést | grep 't[aé]st'
tést
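
The same check can be scripted so it does not depend on the terminal's
encoding.  This is only a rough sketch: probe_multibyte is a made-up
helper name, and it assumes a locale called en_US.UTF-8 is installed
(check with `locale -a`).  If that locale is missing, grep falls back to
the C locale and the second call reports "bytewise" regardless.

```shell
# probe_multibyte LOCALE  (hypothetical helper, not part of grep)
# Feeds "tést" as raw UTF-8 bytes (é is 0xC3 0xA9) to grep and asks
# whether a single '.' can cover the two-byte character.
probe_multibyte() {
    if printf 't\303\251st\n' | LC_ALL="$1" grep -q '^t.st$'; then
        echo multibyte
    else
        echo bytewise
    fi
}

probe_multibyte C            # single-byte locale: '.' covers one byte only
probe_multibyte en_US.UTF-8  # result depends on the grep build and locale
```

A grep without multibyte support should report "bytewise" for both
calls, which would be consistent with it not showing the slowdown.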

> $ LC_ALL=POSIX time grep XYZ test.txt 
> Command exited with non-zero status 1
> 0.04user 0.06system 0:00.10elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (118major+25minor)pagefaults 0swaps
> 
> Last example shows that CPU usage is not really any kind of rule to
> base conclusions on (sr_CS.UTF-8 is my everyday locale, and I would
> really notice if grep had any problems with it).

The field you should be reading is "user".  "CPU" is roughly
(user+system)/elapsed, and isn't very relevant here.
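
As a worked example of that ratio, here are the figures from the quoted
`time` output plugged into the formula.  The variable names are only
illustrative, and awk is used because plain sh has no floating-point
arithmetic.

```shell
# Figures copied from the quoted `time` output above.
user=0.04
system=0.06
elapsed=0.10

# CPU% is roughly (user + system) / elapsed, expressed as a percentage.
cpu=$(awk -v u="$user" -v s="$system" -v e="$elapsed" \
    'BEGIN { printf "%.0f", (u + s) / e * 100 }')

echo "CPU: ${cpu}%"
```

The percentage only says how busy the process was while it ran, so a
short run and a long run can both show 100%.  The absolute "user" time
is the figure to compare between locales.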

-- 
Glenn Maynard



