emacs-devel

From: Alan Mackenzie
Subject: Re: Clean-up of forward-paragraph [Re: Beginingless paragraphs: second stab at a patch.]
Date: Fri, 21 Oct 2005 20:09:52 +0000 (GMT)

On Fri, 21 Oct 2005, Richard M. Stallman wrote:

>    The current implementation doesn't test for paragraph-s\(tart\|eparate\)
>    on the same line as the fill-prefix.  Should it?

>I think it is important to see what past versions of Emacs did--for
>instance, Emacs 19, before the support for a left margin was added.

>If it was always done this way, then I think we should document it
>clearly and not change it.  There are no users asking for changes in
>this, and changing it would be risky.  However, if the past behavior
>was confused or conflicting, we need to figure out which past behavior
>to be compatible with.

I think we are agreed, here:  (i) The current implementation of
forward-paragraph doesn't test for p-start/separate on the same lines as
fill-prefixes; (ii) No users are clamouring for this facility; (iii) Any
major modes which need this sort of thing can do so by setting
p-start/separate appropriately (as CC Mode does).  Let's leave it the way
it already is!  ;-)

>Regarding your proposed definition of paragraphs, I am concerned about
>possible incompatibilities.

I'd like to stress I'm NOT trying to change the definition of paragraphs,
merely to formulate the existing definition, which is to some extent embodied
in forward-paragraph rather than being totally explicit.

>In the "new" cases, those of use-hard-newlines and nonempty left margin,
>we are not particularly bound by compatibility.  However, in the other
>cases we are.

>    (iv) A @dfn{divider (line)} is a line which is either a separator or
>      has the fill-prefix (after any left margin) and is otherwise only
>      whitespace.  [This definition only applies when the fill-prefix is
>      non-null.]

>I think that together with (vii) are very hard to understand.

By "divider line" I was trying to say "a separator line when there's a
fill-prefix".  I didn't make a good job of it.  Sorry.  I've revised this
formulation extensively, removing this confusing term.  See below.

>    (vi) When `use-hard-newlines' is non-nil, all paragraph boundaries
>      are at hard BOLs.  A paragraph starts at a non-separator line, and
>      ends at the next hard BOL.  Here, fill-prefix and paragraph-start
>      are ignored.

>Does this make some unstated assumption about how separator lines and
>hard newlines relate to each other?  Perhaps it is just that the text
>is confusing.

The existing code tests `use-hard-newlines', which it considers
equivalent to Longlines Mode being enabled.  I don't think there are any
hidden assumptions there, merely that, conceptually, only hard newlines
are "real" newlines, since soft newlines are as fickle as SCO lawyers,
shifting around hither and thither as the text changes.  Thus, the only
meaningful place to look for a separator is just after a hard newline.

Is there any meaning for `use-hard-newlines' other than "Longlines Mode
is enabled"?

>    (ix) If there happens to be a blank line before a paragraph start,
>      this line is NOT regarded as being part of the paragraph.  [This is
>      the problem which was at the heart of this thread.]

>I am not quite sure what that means in concrete terms.  As I said
>before, that blank line MUST be part of the following paragraph.
>That is essential for compatibility.

The problem which started me off on all this was that of a blank line
belonging to two paragraphs, as in the following file:

------------------------------------------------------
1st Line        [starter]
asdf

1st Line        [starter]
asdf
-
Local Variables:
paragraph-separate: "-"
paragraph-start: "1st Line\\|-"
End:
-----------------------------------------------------
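
To make the example concrete, here is a rough Python model (not Emacs
code) of how each text line in that file would be classified.  The two
regexps are the ones from the Local Variables block, translated from
Emacs syntax to Python syntax; the function name `classify' and the label
"ordinary" are assumptions made for this sketch.  It only classifies
lines and does not imitate forward-paragraph itself.

```python
import re

# Regexps from the file's Local Variables block, translated from Emacs
# regexp syntax ("\\|" is alternation) to Python syntax ("|").
PARAGRAPH_SEPARATE = re.compile(r"-")
PARAGRAPH_START = re.compile(r"1st Line|-")


def classify(line):
    """Return 'separator', 'starter' or 'ordinary' for one line of text."""
    # re.match anchors at the beginning of the string, much as a
    # looking-at test at BOL would.
    if PARAGRAPH_SEPARATE.match(line):
        return "separator"
    if PARAGRAPH_START.match(line):
        return "starter"
    return "ordinary"


# The six text lines of the example file (the Local Variables block is
# left out of this sketch).
EXAMPLE = [
    "1st Line        [starter]",
    "asdf",
    "",
    "1st Line        [starter]",
    "asdf",
    "-",
]

KINDS = [classify(line) for line in EXAMPLE]

for kind, line in zip(KINDS, EXAMPLE):
    print(f"{kind:10} | {line}")
```

With these settings the blank line is neither a starter nor a separator,
so it counts as an ordinary line of the first paragraph.  That is what
allows it to be claimed by the second paragraph as well.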

I think I understand now what's going on.  In the Emacs manual (page
"Paragraphs") is:

    When you wish to operate on a paragraph, you can use the command
    `M-h' (`mark-paragraph') to set the region around it. .... If there
    are blank lines preceding the first line of the paragraph, one of
    these blank lines is included in the region.

The idea here is that you can do M-h C-w to kill a paragraph, move
somewhere else with M-{ and M-}, then insert it again with C-y.  All this
without having the hassle of manually deleting/inserting blank lines.

This has been implemented as (forward-paragraph -1) moving to the blank
line.  This is a misfeature, IMAO, no matter how long it may have been
so.  Surely `mark-paragraph' should be doing the job of including this
blank line, not forward-paragraph.  This blank line is NOT itself part of
the paragraph.

[Suggestions for Emacs 23:  "blank line" in the above should be
generalised here to "separator line".  We should make the definition of
paragraph-separate explicitly state that it matches AT MOST a single
line, so that separators can be found reliably whilst searching
backwards.  forward/backward-paragraph should be supplemented by (or even
superseded by) beginning/end-of-paragraph, which would work like
b/e-of-defun.  The "blank line preceding the paragraph" should be moved
into `mark-paragraph'.]
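
A minimal Python sketch of that last suggestion, assuming the simplest
possible setup in which blank lines are the only separators.  The names
`paragraph_bounds' and `mark_paragraph' are hypothetical and only model
the idea; they are not existing Emacs functions.  The point is that the
boundary-finding function reports the paragraph proper, and only the
marking function reaches back for one preceding blank line.

```python
def is_blank(line):
    """A line counts as blank if it contains nothing but whitespace."""
    return line.strip() == ""


def paragraph_bounds(lines, i):
    """Return (start, end), end exclusive, of the paragraph holding line i.

    Line i is assumed to be non-blank.  Blank lines are never included.
    """
    start = i
    while start > 0 and not is_blank(lines[start - 1]):
        start -= 1
    end = i + 1
    while end < len(lines) and not is_blank(lines[end]):
        end += 1
    return start, end


def mark_paragraph(lines, i):
    """Like paragraph_bounds, but also take one preceding blank line."""
    start, end = paragraph_bounds(lines, i)
    if start > 0 and is_blank(lines[start - 1]):
        start -= 1
    return start, end


LINES = ["one", "two", "", "three", "four", "", "five"]

# Model of M-h C-w on the middle paragraph: the killed region carries its
# leading blank line, and the text left behind stays properly separated.
start, end = mark_paragraph(LINES, 3)
KILLED = LINES[start:end]
REMAINING = LINES[:start] + LINES[end:]
```

In this model the blank line travels with the killed paragraph, yet
`paragraph_bounds' never reports it as part of any paragraph.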

I discovered this whilst writing @dfn{Paragraphs} in Elisp's
searching.texi.  I wanted to write "Paragraphs don't overlap.", and felt
constrained to qualify it with "@footnote{In certain obscure
circumstances it is possible for a blank line to be both the last line of
one paragraph and the first line of the next.}".  I now think I should
just ignore this obscure case in the documentation, and fix the code
somehow and sometime for Emacs 23.

>I think that at present we should probably stick to fixing anything
>which is most obviously a bug.  For instance, all paragraph beginnings
>and ends should be at BOL; when it fails to do that, that is worth
>fixing.  Bigger changes should wait for after the release.

OK.  There are several bugs in forward-paragraph.  I will fix them.  The
easiest way to fix them is with a thoroughgoing refactoring of the
function (which I have already done).  I suspect, though, you will
prefer the basic structure of the code to be left unchanged.  Please
confirm or deny this!

Here is my formulation of paragraph boundaries, thoroughly revised and
incorporating the comments you've made:

Note:  Items enclosed in braces are purely for clarification.
#########################################################################
DEFINITIONS:
(i) A @dfn{hard BOL} is the beginning of a line following a hard newline.
(ii) A @dfn{separator (line)} is a line which separates paragraphs without
  being part of one.
(iii) A @dfn{starter (line)} is a line which, when present, always begins a
  paragraph.  {Note that not every paragraph need begin with a starter.}
(iv) {A line cannot be both a starter and a separator.}

SPECIAL HANDLING OF A PRECEDING BLANK LINE:
(v) In all of the following, if there should happen to be a blank line
  immediately preceding the beginning of a paragraph, this beginning will be
  modified to include the blank line.  A "blank line" here is one which
  contains only whitespace, and no more than a left margin's worth of it.

SPECIAL STUFF ABOUT BOB/EOB:
(vi) The beginning and end of (the accessible portion of) the buffer always
  count as paragraph boundaries.  [From this point on, "ALL paragraph
  boundaries" disregards BOB and EOB.]
(vii) All other paragraph boundaries are at BOL, even when there is a left
  margin.  {This is so that M-h C-w will always grab complete paragraphs.}

CORE OF PARAGRAPH DEFINITION:
(viii) A paragraph starts either at a starter, or at a line which isn't a
  separator, yet follows one.
(ix) A paragraph ends at a starter or a separator line.

WHEN use-hard-newlines IS NON-NIL {"Longlines Mode"}:
(x) {All paragraph boundaries are at hard BOLs.  A single "long line" is
  regarded as a paragraph.}
(xi) A separator is a line at a hard BOL, and which matches
  paragraph-separate at its left-margin.
(xii) A starter is any line at a hard BOL which isn't a separator.
(xiii) {fill-prefix and paragraph-start play no role here.}

OTHERWISE, WHEN fill-prefix IS NON-NULL:
(xiv) A separator is a line which either:
  (a) matches paragraph-separate at its left margin; or
  (b) contains a valid fill-prefix and is otherwise blank (WS is allowed).
(xv) A starter is a line which isn't a separator and lacks a valid
  fill-prefix.
(xvi) {paragraph-start plays no role, here.}

OTHERWISE (use-hard-newlines AND fill-prefix ARE BOTH NULL):
(xvii) A separator is a line which matches paragraph-separate at its left
  margin.
(xviii) A starter is a line which isn't a separator and matches
  paragraph-start at its left margin.
#########################################################################
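
The rules above can be sketched as a small Python model.  This is an
illustration under stated assumptions, not the code in paragraphs.el:
the function names, the "ordinary" label, and the `at_hard_bol' flag are
invented for the sketch.  The left margin is modelled crudely by
skipping `left_margin' characters, and a "valid fill-prefix" is taken to
mean that the text after the margin begins with the prefix.  The two
default regexps are assumed to mirror Emacs's default values of
paragraph-separate and paragraph-start.  The preceding-blank-line rule
(v) is deliberately left out.

```python
import re

# Assumed to mirror Emacs's defaults, rewritten in Python regexp syntax.
DEFAULT_SEPARATE = r"[ \t\f]*$"
DEFAULT_START = r"\f|[ \t]*$"


def classify(line, paragraph_separate=DEFAULT_SEPARATE,
             paragraph_start=DEFAULT_START, fill_prefix=None,
             use_hard_newlines=False, at_hard_bol=True, left_margin=0):
    """Return 'separator', 'starter' or 'ordinary' per rules (x)-(xviii)."""
    text = line[left_margin:]          # crude model of "at its left margin"
    matches_separate = re.match(paragraph_separate, text) is not None

    if use_hard_newlines:              # rules (x)-(xiii)
        if not at_hard_bol:
            return "ordinary"          # continuation of a "long line"
        return "separator" if matches_separate else "starter"

    if fill_prefix:                    # rules (xiv)-(xvi)
        has_prefix = text.startswith(fill_prefix)
        if matches_separate:
            return "separator"
        if has_prefix and text[len(fill_prefix):].strip() == "":
            return "separator"
        return "ordinary" if has_prefix else "starter"

    if matches_separate:               # rules (xvii)-(xviii)
        return "separator"
    if re.match(paragraph_start, text):
        return "starter"
    return "ordinary"


def paragraph_starts(kinds):
    """Indices of lines beginning a paragraph, per rules (vi) and (viii)."""
    starts = []
    for i, kind in enumerate(kinds):
        if kind == "separator":
            continue
        # BOB counts as a boundary, so line 0 behaves as if it followed
        # a separator.
        after_separator = i == 0 or kinds[i - 1] == "separator"
        if kind == "starter" or after_separator:
            starts.append(i)
    return starts


QUOTED = ["> first para", "> more", "> ", "> second para", "unquoted"]
QUOTED_KINDS = [classify(line, fill_prefix="> ") for line in QUOTED]
QUOTED_STARTS = paragraph_starts(QUOTED_KINDS)
```

In the fill-prefix example, the line holding only the prefix acts as a
separator, and the line lacking the prefix is a starter, so paragraphs
begin at the first, fourth and fifth lines.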

-- 
Alan Mackenzie (Munich, Germany)





