Re: Slow operations on buffers of tens of megabytes
From: Reiner Steib
Subject: Re: Slow operations on buffers of tens of megabytes
Date: Mon, 06 Nov 2006 10:21:39 +0100
User-agent: Gnus/5.110006 (No Gnus v0.6) Emacs/22.0.90 (gnu/linux)
On Mon, Nov 06 2006, Katsumi Yamaoka wrote:
>>>>>> In <address@hidden> Richard Stallman wrote:
>
>> Scoring of the messages closer to the beginning of the buffer is fast,
>> but as we move to higher-numbered messages, that are closer to the end
>> of such big files/buffers, gnus will only score 2-3 messages per
>> minute, and that's what kills performance.
[...]
> (setq gnus-article-button-face nil
>       gnus-signature-face nil
>       gnus-summary-selected-face nil
>       gnus-treat-highlight-citation nil
>       gnus-treat-emphasize nil)
>
> If it makes Gnus fast, improving the performance will be worth
> trying. However, I didn't feel any difference, though it might
> be because I don't have huge mail folders.
I don't think this matches the problem description. When scanning big
mbox files, article display isn't involved. Or am I missing
something?
My guess is that it's a problem with case-fold-search when searching for
"X-Gnus-Article-Number" in mbox files in Emacs 22 as analyzed by Elias
Oltmanns back in June:
,----[ http://thread.gmane.org/gmane.emacs.devel/53901/focus=54013 ]
| From: Elias Oltmanns <oltmanns <at> uni-bonn.de>
| Subject: Re: New buffer-case-table makes search_buffer painfully slow
| Newsgroups: gmane.emacs.devel
| Date: 2006-05-06 19:10:08 GMT
|
| Elias Oltmanns <oltmanns <at> uni-bonn.de> wrote:
| > Hi all,
| >
| > switching from emacs 21 to emacs 22 has a very significant performance
| > impact on packages that make heavy use of search_buffer. An example
| > that actually made me aware of this problem is gnus processing large
| > mbox files. Further analysis of this problem revealed that in emacs 22
| > an "i" in the search string makes search_buffer use simple_search()
| > instead of boyer_moore().
|
| Emacs 22's EQUIVALENCES table relates i, and thus I as well, to two
| more characters with character codes 331857 and 331856. On
| www.unicode.org the character look up engine couldn't find a match for
| U+51051 or U+51050 saying that most likely those codes weren't
| assigned to any characters yet.
|
| So, here is a plain question: Is there a bug in the case-table in
| emacs 22 or does the search engine on www.unicode.org for some reason
| miss certain character ranges? Slightly biassed, I'm disregarding the
| possibility of me being unable to use www.unicode.org properly, which,
| in fact, might well be the reason for my confusion.
|
| Second question: If the case-table was right, what would be the right
| way to tackle the problem described in my original post? For me the
| following snippet in .emacs solves the problem:
| --- ~/.emacs ---
| (unless (< emacs-major-version 22)
|   (set-case-syntax 331856 "w" (standard-case-table))
|   (set-case-syntax 331857 "w" (standard-case-table)))
| --- ~/.emacs ---
|
| This, of course, is a dirty hack and I'm wondering whether emacs
| should provide a feature to "clean up" the EQUIVALENCES table in the
| ascii range in order to avoid falling back to a slow search
| algorithm when we are searching for pure ascii strings. Or do you
| think that packages like gnus which make heavy use of
| re-search-forward should handle these performance issues
| themselves---or indeed the users.
`----
Alexandre, could you please check whether the hack suggested by Elias
makes your problem go away?
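One rough way to check is to time the header search in the large mbox buffer with and without case folding. The following is only a sketch: the function name `my-time-header-search' is made up for illustration, and a plain `search-forward' loop only approximates what Gnus actually does when scanning the file.

```elisp
(require 'benchmark)

;; Hypothetical helper, not part of Gnus or Emacs.
(defun my-time-header-search (fold)
  "Return the seconds spent searching the current buffer for the header.
FOLD is the value bound to `case-fold-search' during the search."
  (let ((case-fold-search fold))
    (car (benchmark-run 1
           (save-excursion
             (goto-char (point-min))
             ;; Visit every occurrence of the header name.
             (while (search-forward "X-Gnus-Article-Number" nil t)))))))

;; Evaluate in the big mbox buffer, before and after applying the hack:
;; (list (my-time-header-search t) (my-time-header-search nil))
```

If the case table is the culprit, the first number should be much larger than the second before the hack is applied, and the two should be close afterwards.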
Richard proposed a fix for this, but AFAICS, this has not been
implemented:
,----[ http://thread.gmane.org/gmane.emacs.devel/53901/focus=54025 ]
| From: Richard Stallman <rms <at> gnu.org>
| Subject: Re: New buffer-case-table makes search_buffer painfully slow
| Newsgroups: gmane.emacs.devel
| Date: 2006-05-07 05:01:27 GMT
|
| I think this has to do with the special characters for Turkish,
| lower-case i without dot and upper-case I with dot. In Turkish,
| upcasing and downcasing preserve the dot, or the absence of the dot.
|
| I think these lines in characters.el are the cause of the problem.
|
| (set-downcase-syntax ?İ ?i tbl)
| (set-upcase-syntax ?I ?ı tbl)
|
| They set up only half of what Turkish needs.
| They make dotless-i upcase into I, and they make
| I-with-dot downcase into i. They can't do vice versa
| because that would break things for other languages.
| So they are not really useful. We could simply delete them.
|
| We could also add a minor mode to set up the case table all the way
| for Turkish.
|
| Would someone like to do that?
`----
Looking at the ChangeLog, it seems that the relevant code in
`characters.el' ...
,----[ international/characters.el ]
| ;; In some languages, U+0049 LATIN CAPITAL LETTER I and U+0131 LATIN
| ;; SMALL LETTER DOTLESS I make a case pair, and so do U+0130 LATIN
| ;; CAPITAL LETTER I WITH DOT ABOVE and U+0069 LATIN SMALL LETTER I.
| ;; Thus we have to check language-environment to handle casing
| ;; correctly. Currently only I<->i is available.
| [...]
| (set-downcase-syntax ?İ ?i tbl)
| (set-upcase-syntax ?I ?ı tbl)
`----
... has been changed back and forth several times:
,----[ ChangeLog ]
| 2005-04-01 Kenichi Handa <address@hidden>
|
| * international/characters.el: Enable the correct case setting for
| dotless-i and dotted-I.
|
| 2005-02-02 Kenichi Handa <address@hidden>
|
| * international/characters.el: Cancel previous change for
| I-WITH-DOT-ABOVE and DOTLESS-i.
|
| 2005-02-02 Kenichi Handa <address@hidden>
|
| * international/latin-5.el (tbl): Setup cases of I-WITH-DOT-ABOVE,
| DOTLESS-i.
|
| * international/characters.el: Setup cases of GREEK-FINAL-SIGMA,
| Y-WITH-DIAERESIS, I-WITH-DOT-ABOVE, DOTLESS-i.
`----
Bye, Reiner.
--
,,,
(o o)
---ooO-(_)-Ooo--- | PGP key available | http://rsteib.home.pages.de/