Re: Bug #25608 and the comment-cache branch

emacs-devel
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Bug #25608 and the comment-cache branch

From:	Alan Mackenzie
Subject:	Re: Bug #25608 and the comment-cache branch
Date:	Tue, 14 Feb 2017 21:14:23 +0000
User-agent:	Mutt/1.7.2 (2016-11-26)
Hello, Dmitry.

On Tue, Feb 14, 2017 at 17:28:58 +0200, Dmitry Gutov wrote:
> Hi Alan,

> On 07.02.2017 21:21, Alan Mackenzie wrote:

> >> How come the "alternative patch" works well, then?

> > Well, aside from the fact that it doesn't (IMAO), it is only consulted
> > relatively rarely, in certain cases of back_coment where the backward
> > scanning hits something it doesn't want to handle.

> What is "it"?

Respectively, the "alternative patch", the syntax-ppss cache mechanism,
and the backward scanning, in the three uses in that sentence.  Sorry it
wasn't clear.

> I would imagine that to be sure that point is not e.g.  inside a
> string, the patch would have to consult the cache (or call syntax-ppss)
> at least once per forward-comment call.

Indeed.

>  From there, I don't really see a real need for backward comment 
> scanning. So if you rewrote some code to use forward scanning, that 
> approach should be applicable on top of the AP as well.

The back_comment function needs to use backward scanning unless it has a
robust enough and fast enough cache giving it the literality at any
point.

[ .... ]

> More importantly, I want to keep as much logic in Lisp as feasible, 
> which is the currently recommended approach anyway.

I sometimes think you might be trying to keep more in Lisp than is
feasible.

> Problems like this could be solved in different ways without avoiding 
> that goal. We can provide new faster primitives if manipulating some 
> data structure in Lisp is not enough (but we need benchmarks first, and 
> so far, speed is not a problem). We can also add new hooks if 
> before-change-functions is not up to snuff.

In other words, implementing new logic in C.  One thing which cannot be
done in lisp (without new facilities in C) is invalidating the cache when
syntax-table text properties are applied and removed (which is always
done when the change hooks are inhibited).  You can do it directly in C,
you can write new facilities in C to allow it to be done in lisp, or you
can pretend it doesn't need doing.  comment-cache takes the first
approach, syntax-ppss takes the last at the moment.

> >> Tracking the used syntax table is also a problem which we need to solve
> >> for syntax-ppss. A good design could handle it and narrowing together.

> > You should now be able to see why I dislike syntax-ppss so much.  As
> > well as being incompatible with narrowing (which should be sort of
> > fixable), there is an essential lack of cache invalidating (which would
> > only be fixable by a radically different design).

> Why wouldn't it be fixable with a moderate change in design? The problem 
> you are referring to (which is almost entirely theoretical at this 
> point, in the absence of bug reports) ....

Here I disagree with you and Stephan profoundly - A flaw is a flaw
whether or not it has yet provoked a bug report.  And just because
something hasn't yet had a bug report on it doesn't mean it's OK.  If we
see a way non-rigorous coding _can_ lead to a bug, then that is a bug and
needs fixing, particularly if it's in a primitive.

> .... is caused by syntax-ppss usage inside with-silent-modifications.

Yes.

> > It's differently complicated.  master's back_comment, which attempts to
> > scan comments backwards is more complicated than comment-cache's
> > back_comment (including its cacheing logic).

> Ideally we'd have the best of both worlds, of course. Like mentioned 
> above, I see no hard need for backward scanning anymore.

But for reasonable execution speeds you either need backward scanning of
comments, or comment-cache (or something like it).

> >> Yes, it is worse. You have more code to debug. And comment-cache adds
> >> quite a bit of code.

> > How have you quantified "quite a bit"?

> 771 insertions(+), 402 deletions? Admittedly, this is not a lot by C 
> standards.

I don't think it is, either.  A good deal of that is the wholesale
replacement of back_comment with the simpler new version.

> > There is nothing to indicate you've even looked at comment-cache.  All

> I've looked at it now. Since it's implemented in C, I have little 
> ability to judge the quality of the code, or the low-level nuances.

> And yet, I've managed to provide coherent comments, haven't I?

You have, yes.  Thanks.

[ .... ]

> >> So the "speed up forward-comment" patch would still come out to 20 lines.

> > Well, if you get a decent bug fix involving, say a 700 line patch which
> > includes those 20 lines, I suppose you could still call it a 20 line
> > patch, somehow.

> Even if that takes 700 lines, in the end it will be 700 + 20 lines 
> versus 700 + 370 lines that comment-cache takes.

I think it is the result that counts, not the number of changed lines.

> >>> It would also likely be much slower.

> >> I wouldn't be so sure. A syntax table comparison, for instance, would be
> >> pretty cheap compared to what syntax-ppss does already.

> > Full syntax-table comparisons are slow, even when written in C.

> Really? How do you quantify that? In c++-mode,

> (benchmark 1000 '(equal (syntax-table) (syntax-table)))

> outputs "Elapsed time: 0.004712s". Which is an order of magnitude less 
> than (benchmark 1000 '(syntax-ppss)) outputs, in an empty buffer with a 
> warmed-up cache.

> > I tried
> > it back in December.  CC Mode regularly switches syntax-tables.  My
> > usual time-scroll function on xdisp.c ran at about half the speed when a
> > comparison was done at every set-syntax-table.  The results had to be
> > cached, after which it ran at normal speed again.

> That doesn't tell me a lot, unfortunately. Maybe it was a design 
> problem, e.g. invalidating cache eagerly and too often, instead of doing 
> it lazily like syntax-ppss does.

It was a case of seeing if two distinct syntax tables were "the same"
from the point of view of literals.  In other words, they could parse
parentheses, whitespace and so on however they liked, but comments and
strings had to be parsed identically by both tables for them to count
"the same".

This is an instance where syntax-ppss's ambitions count against it - on
any set-syntax-table syntax-ppss's caches should really be cleared,
strictly speaking.  But they're less effective as caches if this is done.
Perhaps the major mode should give its input.

> Although CC Mode would have to change syntax tables a lot, for it to 
> even show up on the radar.

It does.  For example, there's a `c-make-no-parens-syntax-table' in which
parens/braces/brackets are not paren characters, used for parsing
template/generic delimiters in C++ and Java.

> It's possible that your "compare syntax tables" routine does a lot, of 
> course. But if we really need that kind of fuzzy comparison, we can 
> implement that function in C and export for using in Lisp.

I think that's what I did.

-- 
Alan Mackenzie (Nuremberg, Germany).
[Prev in Thread]
Current Thread
[Next in Thread]
Re: Bug #25608 and the comment-cache branch, (continued)
Prev by Date: [PATCH 1/2] Refactor digest authentication in url-auth
Next by Date: Re: Using CEDET modules from Emacs core
Previous by thread: Re: Bug #25608 and the comment-cache branch
Next by thread: Re: Bug #25608 and the comment-cache branch
Index(es):
- Date
- Thread