bug-gnu-utils

Re: grep is horriby slow in UTF-8 locales


From: Danilo Segan
Subject: Re: grep is horriby slow in UTF-8 locales
Date: Fri, 07 Nov 2003 16:49:58 +0100
User-agent: Gnus/5.1002 (Gnus v5.10.2) Emacs/21.3.50 (gnu/linux)

Markus Kuhn <address@hidden> writes:

> $ grep --version
> grep (GNU grep) 2.5.1

This doesn't happen with:

$ grep --version
grep (GNU grep) 2.4.2
$ LC_ALL=POSIX time grep XYZ test.txt 
Command exited with non-zero status 1
0.03user 0.07system 0:00.36elapsed 27%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (118major+25minor)pagefaults 0swaps
$ LC_ALL=sr_CS.UTF-8 time grep XYZ test.txt 
Command exited with non-zero status 1
0.06user 0.05system 0:00.10elapsed 105%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (143major+50minor)pagefaults 0swaps
$ LC_ALL=en_GB.UTF-8 time grep XYZ test.txt 
Command exited with non-zero status 1
0.06user 0.04system 0:00.15elapsed 64%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (128major+48minor)pagefaults 0swaps
$ LC_ALL=POSIX time grep XYZ test.txt 
Command exited with non-zero status 1
0.04user 0.06system 0:00.10elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (118major+25minor)pagefaults 0swaps

The last example, a repeat of the first POSIX run, shows that the CPU
percentage is not a reliable basis for conclusions: the same command
went from 27% to 100% CPU between runs. (sr_CS.UTF-8 is my everyday
locale, and I would really notice if grep had any problems with it.)

test.txt was produced with:
 for i in 1 2 3 4 5 6 7 8 9 0; do cat UnicodeData.txt >>test.txt; done
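
For anyone who wants to repeat the comparison, here is a minimal sketch
of the same steps as shell functions. The names build_test_file and
time_grep are hypothetical helpers, not part of grep or of the commands
above. Unlike the one-line loop, build_test_file empties the destination
first, so re-running it does not keep growing the file, and time_grep
assumes an external time(1) binary is installed.

```shell
#!/bin/sh

# Hypothetical helper: write COUNT copies of SRC into DEST.
# Assumption: DEST may be overwritten (the original loop only appended).
build_test_file() {
    src=$1
    dest=$2
    count=$3
    : > "$dest"                     # start from an empty file
    i=0
    while [ "$i" -lt "$count" ]; do
        cat "$src" >> "$dest"
        i=$((i + 1))
    done
}

# Hypothetical helper: time the same search RUNS times under one locale,
# so that a single noisy measurement is easier to recognise.
# Assumption: "time" resolves to an external binary such as /usr/bin/time.
time_grep() {
    loc=$1
    pattern=$2
    file=$3
    runs=$4
    n=0
    while [ "$n" -lt "$runs" ]; do
        # grep exits with status 1 when nothing matches; that is expected here.
        LC_ALL=$loc time grep "$pattern" "$file" || true
        n=$((n + 1))
    done
}
```

Possible usage, with the file and pattern from the runs above:
 build_test_file UnicodeData.txt test.txt 10
 time_grep POSIX XYZ test.txt 3
 time_grep en_GB.UTF-8 XYZ test.txt 3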

I can get a newer grep today, if you think I may experience different
results with it.

Cheers,
Danilo



