emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Line wrap reconsidered


From: Eli Zaretskii
Subject: Re: Line wrap reconsidered
Date: Thu, 28 May 2020 21:05:05 +0300

> From: Yuan Fu <casouri@gmail.com>
> Date: Thu, 28 May 2020 13:31:37 -0400
> Cc: Lars Ingebrigtsen <larsi@gnus.org>,
>  emacs-devel@gnu.org
> 
> Before answering your questions, this is my understanding of the word 
> wrapping in redisplay:
> On every iteration we check if current character allow wrapping after it, if 
> so, we set may_wrap to true.

Yes.

> That basically means the _previous char_ allows wrapping after it.

No, we never wrap at the character where we set may_wrap = true, we
wrap _after_ it.  In the current code, may_wrap is set when we see a
SPC character, and when we wrap, that SPC is left on the line before
the wrap.

> While we are at the same iteration, we may want to wrapping point (wrap_it) 
> if 1) the previous char allows wrapping after it (from may_wrap’s value set 
> by _previous iteration_) and 2) the current character allows wrapping before 
> it.

Yes.  But I don't understand what you mean by "at the same iteration"
here: if may_wrap was set, then we will not try to save the wrap point
until we process the next character.  So I cannot call this "the same
iteration", it's rather "the next iteration".

> When we found ourselves at the end of a line

You mean, when we reach the edge of the window, right?

> we have two choices: continue to the next line (which I assume is what 
> “continue” means in the comment), or instead go to a previously saved wrap 
> point and break the line there.

Which "continue" are you alluding to here?  Do you mean "lines are
continued"?  because if lines are not being wrapped, we have only one
choice: go back to the last wrap point we found, end the screen line
(a.k.a. "break the line") there, and continue with the text on the
next screen line.

> If there is no previous wrap point, we continue, if there is a previous wrap 
> point but we actually can wrap at this point (previous char can wrap after & 
> this char can wrap before), we just continue, since this position is a valid 
> wrap position. Otherwise we go back to previous wrap point and wrap there 
> (goto back_to_wrap;).

You lost me here.  The logic is actually: if there is a wrap point, go
back to it and break the line there; if there's no wrap point, break
the line where we are now (i.e. at the last character that still fits
inside the window). 

> Although the original code only has one checker (IS_WHITESPACE), it serves a 
> dual purpose: it is used to determine if we can wrap after—whitespace and tab 
> allow wrapping after; it is also used to determine if we can wrap before—they 
> don’t allow wrapping before (otherwise you see whitespace and tabs on the 
> beginning of the next line).

That is true.  I suggested to have a single function so that you could
in that single function perform the same test, just using two
different categories.  then you could basically keep the rest of the
logic intact.  If that is somehow not possible, can you explain why?

> Needless to say, I’m a newbie in Emacs C internals and redisplay, so my 
> understanding from reading the original code and comments could be wrong. But 
> the code seems to work right so I think the truth isn’t far away.

The truth isn't far away, but we need to go all the way so that the
result doesn't break some use cases (of which there are a gazillion in
the display code).

> > Do we really need two separate functions?  And note that each one
> > calls IT_DISPLAYING_WHITESPACE, so in some situations you will be
> > testing that twice for the same character -- because you have 2
> > separate functions.
> 
> IT_DISPLAYING_WHITESPACE would run twice, too: one time to check for warp 
> after and one time for wrap before.

My point was to bring the two tests together so that it could be just
one test.  Is that possible?  If not, why not?

> >> +        if (may_wrap && char_can_wrap_before(it))
> >>            {
> >>              /* We have reached a glyph that follows one or more
> >> -               whitespace characters.  If the position is
> >> -               already found, we are done.  */
> >> +               whitespace characters (or a character that allows
> >> +               wrapping after it).  If the position is already
> >> +               found, we are done.  */
> > 
> > The code calls char_can_wrap_before, but the comment says we can wrap
> > after it.  Which one is right?
> 
> The comment says this char _follows_ a char that allows wrapping after, we 
> still need to check if _this_ char allows wrapping before.

But the comment is after the char_can_wrap_before test, so we have
already tested that, no?

> >> -                        /* If we are at a whitespace character
> >> -                           that barely fits on this screen line,
> >> -                           but the next character is also
> >> -                           whitespace, we cannot wrap here.  */
> >> +                        /* If the previous character says we can
> >> +                                 wrap after it, but the current
> >> +                                 character says we can't wrap before
> >> +                                 it, then we can't wrap here.  */
> > 
> > It sounds like your Emacs is set up to use only spaces for indentation
> > in C source files, whereas our convention is to use tabs and spaces.
> 
> I’m not sure what you mean, do you mean the indent style for the new code, or 
> are you talking about the word wrapping?

I mean the indentation: the original code used TABs and spaces, but
your code uses only spaces.

> > And finally, one more general comment/question: isn't your code assume
> > implicitly that buffer text is always processed in the logical order,
> > i.e. in the increasing order of buffer positions?  I mean, the fact
> > that you have the "before" and the "after" function seems to imply
> > that you do assume that, and the logic of processing the categories is
> > relying on that, expecting that when you see a wrap_after character,
> > you can wrap on the next character.  Is this so?  because if it is,
> > then this will break when processing RTL text, since we may process it
> > in the reverse order of buffer positions.  Please look into these
> > situations and see that the code does TRT in them.
> 
> I tested with some Alaric text made by google translate, and the wrapping 
> seems not take effect. IIUC bidi.c reorders the line to RTL

Yes.

> and the redisplay iterator will still go through them LTR

Not sure what you mean by "go through them LTR".  The iterator can
move forward or backward, or even jump to a far-away place.  You
cannot assume that the next character examined will be the next
character in the buffer.

> I wonder how does the original wrapping works in that case. Does the old code 
> handle bidi text?

Of course it does, you can easily see that if you run the unmodified
Emacs.  It would be a terrible misfeature if we didn't handle wrappng
correctly in bidirectional scripts.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]