Re: Most used words in current buffer

help-gnu-emacs

From:	Eric Abrahamsen
Subject:	Re: Most used words in current buffer
Date:	Sat, 21 Jul 2018 21:00:36 -0700
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

Eric Abrahamsen <eric@ericabrahamsen.net> writes:

> Udyant Wig <udyantw@gmail.com> writes:
>
>> On 07/21/2018 09:45 PM, Eric Abrahamsen wrote:
>>> Interesting... In general I think Emacs is highly optimized to use the
>>> buffer as its textual data structure, more so than a string.
>>> Particularly when the code is compiled (many of the text-movement
>>> commands have opcodes). I made the following two commands to collect
>>> words from a novel in an Org file, and the one that uses
>>> `forward-word' and `buffer-substring' runs around twice as fast as the
>>> one that uses `split-string'.
>>>
>>> Of course, they don't collect the same list of words! But even if you
>>> add more code for trimming, etc., it will still likely be faster than
>>> operating on a string.
>>> [snip code]
>>
>> I have acted upon the advice (yours and Stefan Monnier's) to operate on
>> the buffer directly using BUFFER-SUBSTRING.  Please see my follow up to
>> Stefan's message.
>>
>> BUFFER-SUBSTRING did gain me (somewhat) better performance.
>
> As Stefan said, going character by character is going to be slow... But
> my example with `forward-word' collects a lot of cruft. So I would
> suggest doing what `forward-word' does internally and move by syntax.

Actually, I think alternating `forward-word' with `forward-to-word' might
do exactly the same thing as alternating (skip-syntax-forward "w") with
(skip-syntax-forward "^w"), and might get you some extra... stuff. Maybe
worth benchmarking!
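
For reference, the move-by-syntax idea above could be sketched roughly as
follows. This is only an illustration, not the code snipped earlier in the
thread: the names `my/collect-words-by-syntax' and `my/word-counts' are
made up for this example, and it assumes that "word" means a run of
characters with word syntax in the buffer's current syntax table, folded
to lower case for counting.

```elisp
;;; -*- lexical-binding: t; -*-

(defun my/collect-words-by-syntax ()
  "Return the words in the current buffer as a list, in buffer order.
Hypothetical helper: alternates `skip-syntax-forward' over non-word
characters and over word-constituent characters."
  (save-excursion
    (goto-char (point-min))
    (let (words)
      ;; Skip any non-word characters; stop once we hit end of buffer.
      (while (progn (skip-syntax-forward "^w")
                    (not (eobp)))
        (let ((start (point)))
          ;; Point is on a word character here, so this always advances.
          (skip-syntax-forward "w")
          (push (buffer-substring-no-properties start (point)) words)))
      (nreverse words))))

(defun my/word-counts ()
  "Return an alist of (WORD . COUNT), most frequent first.
Hypothetical helper; words are downcased before counting."
  (let ((table (make-hash-table :test #'equal))
        result)
    (dolist (word (my/collect-words-by-syntax))
      (let ((key (downcase word)))
        (puthash key (1+ (gethash key table 0)) table)))
    (maphash (lambda (word count) (push (cons word count) result)) table)
    (sort result (lambda (a b) (> (cdr a) (cdr b))))))
```

To compare it against a `forward-word' variant, something like
(benchmark-run 10 (my/collect-words-by-syntax)) in the buffer of interest
returns the elapsed time along with GC statistics. Byte-compiling both
versions first is probably the fairer comparison, given the point about
opcodes earlier in the thread.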
