bug#2887: Suggestions for simple.el

From: Stefan Monnier
Subject: bug#2887: Suggestions for simple.el
Date: Tue, 07 Apr 2009 10:02:07 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.92 (gnu/linux)

> The default behavior of practically all editors is to delete words
> without altering the clipboard. The principle of least surprise says
> Emacs should at least make it easy for users to bind C-backspace to
> `backward-delete-word' and C-delete to `delete-word', if that's the
> behavior they prefer.

But do those editors offer the equivalent of M-y?

> When I'm editing a 212MB text file, many things become sluggish, but
> pos-at-*-of-line is not one of them:
>   M-x benchmark (pos-at-end-of-line 1000000)    ; 0.157s

This timing means that if a command calls this function for lines near
point, that command becomes sluggish when you get to the end of the
buffer.  If it calls it 100 times, the command becomes
completely unusable.

> This speed is similar to line-*-position:
>   M-x benchmark (line-end-position 1000000)    ; 0.172s
>   10 M-x benchmark (line-end-position 1000000) ; 0.734s

Of course.  But line-end-position is usually called with small args
because it's a relative position.
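
To illustrate what "relative" means here, the following is a minimal sketch showing that the argument to line-end-position is counted from the line containing point (nil or 1 is the current line, 2 the next one), so the same small argument yields different positions as point moves.  The function name and the sample text are made up for illustration.

```elisp
;; Illustrative sketch only; the sample text is an assumption.
(defun my-line-end-demo ()
  "Return line-end positions computed from two different starting lines.
Each inner list holds the end of the current line and of the next line."
  (with-temp-buffer
    (insert "one\ntwo\nthree\n")
    (goto-char (point-min))
    ;; Point is on line 1: arguments nil and 2 refer to lines 1 and 2.
    (let ((from-first (list (line-end-position) (line-end-position 2))))
      (forward-line 1)
      ;; Point is now on line 2: the same arguments refer to lines 2 and 3.
      (list from-first
            (list (line-end-position) (line-end-position 2))))))
```

A caller that only cares about lines near point therefore never needs a large argument, however far into the buffer point is.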

> Here is a candidate upgrade of `delete-trailing-whitespace':

Please send it as a patch so we can see what's changed rather than only
see the end result.

> This performs 4x faster than the first candidate upgrade, because the
> regexp matches only 3 characters, whereas "\\s-+$" matches 147 different
> characters in text-mode.

Actually, that isn't quite the reason, and it's much worse: \s- is
a regexp that can potentially match *any* character (depending on the
syntax-table text-property), so unlike [ \t\r], which can use
a "fastmap" to quickly skip over irrelevant chars, \s- has to spend
a lot more time on each char.
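
One way to check the difference on a given machine is to time both regexps on the same text.  The following is a rough sketch using `benchmark-run'; the helper names and the sample text are hypothetical, and no particular timings are implied.

```elisp
(require 'benchmark)

(defun my-time-trailing-regexp (regexp text)
  "Delete matches of REGEXP in a temporary buffer holding TEXT.
Return (SECONDS . CLEANED-TEXT)."
  (with-temp-buffer
    (insert text)
    (let ((timing (benchmark-run 1
                    (goto-char (point-min))
                    (while (re-search-forward regexp nil t)
                      (replace-match "")))))
      ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
      (cons (car timing) (buffer-string)))))

(defun my-compare-trailing-regexps (lines)
  "Time the character-class and syntax-class regexps on LINES lines.
The text is made up; it has no blank lines and no final newline, so
both regexps delete the same characters."
  (let ((text (mapconcat #'identity
                         (make-list lines "some text here \t ")
                         "\n")))
    (list (my-time-trailing-regexp "[ \t\r]+$" text)
          (my-time-trailing-regexp "\\s-+$" text))))
```

Evaluating something like (my-compare-trailing-regexps 100000) gives the two elapsed times as the car of each result.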

> The syntax table definitions of whitespace can be confusing, e.g. the
> ^M glyph is considered whitespace in text-mode but not in
> emacs-lisp-mode...


> After this explanatory introduction, my real proposal is to define a
> variable to determine the behavior of `delete-trailing-whitespace':

> (defvar trailing-whitespace-regexp "\\s-+$"
>   "Regular expression describing what `delete-trailing-whitespace'
> should delete. Note that regardless of the value of
> `trailing-whitespace-regexp', `delete-trailing-whitespace' will never
> delete formfeed and newline characters.

> The default value \"\\\\s-+$\" uses the current syntax table definitions
> of whitespace, but an expression like \"[ \\r\\t]+$\" is faster and
> consistent across modes.")

I think [ \t\r] is a good default, and if we introduce a config var
(which I'm not sure is worth the trouble), there's no reason to keep the
special treatment of formfeed.
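
As a rough sketch of what that could look like (the `my-' names are hypothetical, not an actual patch to simple.el): with [ \t\r] as the default, formfeed is simply not in the character class, so no special-casing is needed, and a user who wants formfeeds removed can put them in the regexp.

```elisp
(defvar my-trailing-whitespace-regexp "[ \t\r]+$"
  "Hypothetical variable: regexp matching the trailing whitespace to delete.")

(defun my-delete-trailing-whitespace ()
  "Delete matches of `my-trailing-whitespace-regexp' in the current buffer.
Return the number of lines that were cleaned."
  (let ((count 0))
    (save-excursion
      (goto-char (point-min))
      ;; The regexp requires at least one character, so every match
      ;; makes progress and the loop terminates.
      (while (re-search-forward my-trailing-whitespace-regexp nil t)
        (replace-match "")
        (setq count (1+ count))))
    count))
```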

>       (let ((count 0)(table (syntax-table)))
>         (modify-syntax-entry ?\f "." table) ; never delete formfeeds
>         (modify-syntax-entry ?\n "." table) ; never delete newline
>         (with-syntax-table table
>           (while (re-search-forward whitespace-regexp nil t)
>             (replace-match "")(setq count (1+ count)))
>           (message "Cleaned %d lines" count))))))

This modifies the current syntax-table (because `syntax-table' returns
the table itself, not a copy).
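
A minimal sketch of one way to avoid that is to work on a copy made with `copy-syntax-table'.  The function name is hypothetical and the surrounding code is reduced to the essentials of the quoted snippet.

```elisp
(defun my-clean-trailing-syntax-whitespace (regexp)
  "Delete matches of REGEXP with formfeed and newline treated as punctuation.
The changes are made to a private copy of the syntax table, so the
buffer's own table is left untouched.  Return the number of matches."
  (let ((count 0)
        ;; `copy-syntax-table' returns a fresh table; `syntax-table'
        ;; alone would hand back the live one.
        (table (copy-syntax-table (syntax-table))))
    (modify-syntax-entry ?\f "." table) ; never delete formfeeds
    (modify-syntax-entry ?\n "." table) ; never delete newlines
    (save-excursion
      (goto-char (point-min))
      (with-syntax-table table
        (while (re-search-forward regexp nil t)
          (replace-match "")
          (setq count (1+ count)))))
    count))
```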

> Good point. Here are some undefined keystrokes in Emacs 22.3.1 that seem
> to get through:

>   C-x j    backward-delete-word
>   C-x C-j  delete-word
>   C-x x    kill-line-or-region
>   M-n      pull-line-down
>   M-p      pull-line-up
>   C-M-z    zap-back-to-char
>   C-M-m    zap-up-to-char
>   C-x C-a  delete-all-blank-lines
>   M-&      delete-indentation-nospace
>   C-x w    goto-longest-line
>   C-x y    downcase-word-or-region
>   C-x C-y  upcase-word-or-region

> Thank you for your patience and thoughtful responses,

I think I will prefer to leave those unbound for now, waiting for more
generally useful commands, or more general agreement that they are
generally useful.  Note that for some of them (e.g. kill-line-or-region
or downcase-word-or-region), you might want to try and argue that their
functionality should simply be folded into kill-line and downcase-word,
respectively (which might let us free up C-x C-u and C-x C-l, and
maybe even C-w).
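
For illustration, folding the region case into the word command could look roughly like the sketch below.  The name `my-downcase-word-or-region' is hypothetical, and this is only one possible way to do it, not a proposed change to `downcase-word' itself.

```elisp
(defun my-downcase-word-or-region (arg)
  "Downcase the active region if there is one, else ARG words after point."
  (interactive "p")
  (if (use-region-p)
      ;; Region is active and non-empty: behave like `downcase-region'.
      (downcase-region (region-beginning) (region-end))
    ;; Otherwise keep the usual `downcase-word' behavior.
    (downcase-word arg)))
```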

