bug#45117: 28.0.50; process-send-string mysteriously exiting non-locally

bug-gnu-emacs


From:	João Távora
Subject:	bug#45117: 28.0.50; process-send-string mysteriously exiting non-locally when called from timer
Date:	Thu, 10 Dec 2020 20:12:11 +0000
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>> Right.  And `while-no-input` should only wrap the execution of A, so if
>>> A doesn't complete, then presumably none of C nor B will want to be
>>> executed, which seems OK.
>>
>> We are miscommunicating.  In these programs, B needs to be atomic with
>> A.  When you send things into an external process, only the most naive
>> of external communication protocol replies immediately and synchronously
>> to the thing you just sent.  For those super simple things, like "cat"
>> and "grep", your model works.
>
> No, I was not presuming such a simple model, actually.  I was really
> thinking about "send data to the LSP server then get some answer
> a second or more later".

Right, so in LSP it's perfectly possible to send three requests in a
row, say reqX, reqY and reqZ and get three replies in a completely
different order repZ, repX, repY.  How do you match each reply to each
request?
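
To make that concrete, here is a minimal sketch of the usual answer: tag
each request with an id and keep a table of callbacks keyed by that id.
Every `sketch-` name and the one-line "ID PAYLOAD" wire format are
made up for illustration (real LSP framing is different); the
`inhibit-quit` binding is one possible way to keep the registration and
the send together.

```elisp
(require 'cl-lib)

(defvar sketch-pending (make-hash-table :test #'eql)
  "Hypothetical table mapping request ids to reply callbacks.")

(defvar sketch-next-id 0
  "Last request id handed out (hypothetical counter).")

(defun sketch-send-request (proc payload callback)
  "Send PAYLOAD to PROC, remembering CALLBACK under a fresh id.
Return the id.  The wire format is an assumption, not LSP's."
  (let ((inhibit-quit t)                ; no quit between the two steps
        (id (cl-incf sketch-next-id))
        (sent nil))
    ;; Register first: a reply must always find its callback.
    (puthash id callback sketch-pending)
    (unwind-protect
        (progn
          (process-send-string proc (format "%d %s\n" id payload))
          (setq sent t))
      ;; If sending signalled an error, forget the registration again.
      (unless sent (remhash id sketch-pending)))
    id))

(defun sketch-handle-reply (id result)
  "Run and forget the callback registered for ID, if there is one."
  (let ((callback (gethash id sketch-pending)))
    (when callback
      (remhash id sketch-pending)
      (funcall callback result))))
```

With this shape, replies can arrive in any order: whatever parses the
process output only has to extract the id and call
`sketch-handle-reply`.  If the send is interrupted after the callback
was registered (or the other way round), the table and the server
disagree about which requests exist.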

>>> closes the pipe in case we're exiting before having sent all the data
>>> (that's a good idea to do also in case a bug signals an error).
>> Again, this killing of the subprocess assumes the trivial case of a unix
>> utility.
>
> That's just for lack of a vocabulary to say abstractly what I meant.
> I understand that in many cases you may want to keep the subprocess (and
> pipe) open, in which case you'll have to do something else, but that
> "something" will depend on lots of details of the circumstance.

process-send-string may send things in "bunches", I read in the
docstring, but it will not (and should not) be interrupted.  At least I
see no reason to.  When it returns, sending should be done.  Either that
or it should exit loudly with an error that one can catch, in which case
one should retry the whole thing.
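
A minimal sketch of that contract, assuming a hypothetical wrapper named
`sketch-send-or-report` (not an existing function): either the call
returns normally and the whole string has been handed over, or an
ordinary error is caught and passed to the caller, who can then decide
whether to retry.

```elisp
(defun sketch-send-or-report (proc string on-error)
  "Send STRING to PROC without being interrupted by a quit.
Return t if `process-send-string' returned normally.  If it signals
an error, call ON-ERROR with the error object and return nil."
  (condition-case err
      (let ((inhibit-quit t))           ; assumption: quits are held off here
        (process-send-string proc string)
        t)
    (error
     (funcall on-error err)
     nil)))
```

The point of the sketch is that the only way out, apart from success,
is a signal that `condition-case` can see.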

>>> The exact same problem affects all normal Elisp code when the user hits
>>> C-g, so I think the better path forward is to make sure it's "easy and
>>> natural" to write code which reacts correctly when it's aborted at some
>>> arbitrary time.  We usually get that via `unwind-protect`, but if it's
>>> not enough we should develop better solutions rather than shy away from
>>> `quit`.
>> I get what you're saying, but there's presumably a reason we bind
>> inhibit-quit to t in timers (Eli?), and it's that that code isn't
>> triggered by a direct action of the user.
>
> Indeed, we bind inhibit-quit there because when the users hit C-g they
> presumably have no idea whether a timer or process filter happens to be
> running right now, so they don't actually mean "stop this timer" but
> something entirely different (such as run the command `keyboard-quit`).

I see, and you think it is different for "input something", because
that, in ElDoc, would in principle invalidate the context of the
documentation request.  But that is not always so.  And I think it's too
eager of ElDoc to try to do that so early and so brutally.  It's better
to leave it to the callback handlers, which we have now.  That's a much
safer spot to know if the answer we just got still makes sense.  Or if
we're in a hurry, we let the backend know asap.

> Note that in return we expect timers and process filters to run only for
> a very short amount of time, so that we can still react to C-g
> promptly.

Fine, and so they should.  Much like Flymake stuff.  That's in the
contract :-)

> The contract is different for timer functions than it is for eldoc
> functions, yes.  This is because the expectation is that eldoc functions
> may run for a non-negligible amount of time.

Why do you have that expectation?  Any particular example in the wild?

> Maybe we should change that so it's up to the individual eldoc function
> to use `while-no-input` if it needs it, but I'm not sure we've reached
> that conclusion yet ;-)

It was, after all, the status quo after you changed it for 27.1.
Perhaps you had a rationale?

>> Yes, there is that too.  While-no-input has all those Heisenbergian
>> effects to add to it.  But this was no heisenberg, I think.  I was
>> pressing C-n the whole time, so that's "input".
>
> OK, so `while-no-input` did its job correctly in your case.  Good.
> Now the next question is: given that the user has hit `C-n` how should
> we make sure Emacs responds to it as soon as possible even though it's
> currently in the middle of sending a command to an LSP subprocess?
>
> Is this "sending" expected to never take a long time (in which case
> maybe using `inhibit-quit` could be the better answer)?

That's what I did, yes.  Yes, it's expected to be quick or fail fast.

> What's the alternative: what could the Elisp code do to abort the
> communication as quickly as possible (without leaving the subprocess in
> an inconsistent state and without forcing a costly restart of that
> subprocess)?  If the protocol doesn't offer any way to abort a command,
> maybe it could stash the rest of the data to be sent on some list of
> pending data so they'll be sent later asynchronously (and remember that
> the answer to that command is probably to be ignored because the user
> has moved on)?

The protocol could offer an optional abort() switch, yes.  ElDoc would
raise a flag and say: "hey backends, what you were doing is now
useless".  We'd see about the implementation, there is likely more than
one approach, but a dynamic variable accessed by an (eldoc-aborted-p)
seems easiest.  I personally don't know of many places where I would use
it, or where it would bring an advantage in terms of speed.  For
example, in responsive completion, I've been doing fine with discarding
loads and loads of carefully prepared, now invalid, completions.  Fine
in terms of speed/responsiveness.  But maybe one wishes to save power,
which is quite legitimate.
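
Roughly, and only as a sketch: none of the names below exist in ElDoc,
they are hypothetical stand-ins for the flag and the predicate.  The
backend's reply handler checks the flag at the point where it knows
whether the answer still makes sense.

```elisp
(defvar sketch-eldoc--aborted nil
  "Non-nil once the current documentation request is stale.
Hypothetical variable; ElDoc would bind or set it.")

(defun sketch-eldoc-aborted-p ()
  "Return non-nil if the current request is no longer wanted."
  sketch-eldoc--aborted)

(defun sketch-backend-callback (reply display-fn)
  "Pass REPLY to DISPLAY-FN unless the request was aborted meanwhile.
Return nil without calling DISPLAY-FN when the request is stale."
  (unless (sketch-eldoc-aborted-p)
    (funcall display-fn reply)))
```

A backend that cannot make use of the flag simply never looks at it,
which is the difference from being killed at an arbitrary point.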

Bottom line is, in my opinion, this ElDoc-to-backend abort signal should
be controlled, it shouldn't be an unhandleable kill signal.  That's
asking for trouble.  I'd be very surprised if the SLIME people don't
start getting this too after they upgrade to 27.1.  And maybe the CIDER
and Elpy people too?  Don't know about Eglot, actually, but I think it's
possible yes. All depends on `eldoc-idle-delay`.  If it's a low value,
it's much much more likely.  Since we start with 0.5, we should be OK.

João
