emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: wait_reading_process_ouput hangs in certain cases (w/ patches)


From: Eli Zaretskii
Subject: Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
Date: Mon, 10 Sep 2018 11:21:20 +0300

> Cc: address@hidden, address@hidden, address@hidden,
>  address@hidden, address@hidden
> From: Matthias Dahl <address@hidden>
> Date: Tue, 7 Aug 2018 10:38:02 +0200

Sorry for such a long delay in responding to your previous message.

> >> --- a/src/gnutls.c
> >> +++ b/src/gnutls.c
> >> @@ -708,16 +708,18 @@ emacs_gnutls_read (struct Lisp_Process *proc, char 
> >> *buf, ptrdiff_t nbyte)
> >>    rtnval = gnutls_record_recv (state, buf, nbyte);
> >>    if (rtnval >= 0)
> >>      return rtnval;
> >> -  else if (rtnval == GNUTLS_E_UNEXPECTED_PACKET_LENGTH)
> >> -    /* The peer closed the connection. */
> >> -    return 0;
> > 
> > Why is this hunk being deleted?  I see that you intend to handle it in
> > emacs_gnutls_handle_error, but I'm not sure I understand the net
> > result.  The current code simply ignores this situation; what will
> > happen with the new code, except for the problem being logged?
> 
> The rationale behind this is:
> On GnuTLS < 3, GNUTLS_E_UNEXPECTED_PACKET_LENGTH was returned for
> premature connection termination. On >= 3, it is a real error on its
> own and should be treated that way.
> 
> I moved the error handling over to emacs_gnutls_handle_error, so it's
> at a central point and not scattered all across. Also makes the code
> clearer to read, imho.

OK.

> >>    else if (emacs_gnutls_handle_error (state, rtnval))
> >> -    /* non-fatal error */
> >> -    return -1;
> >> -  else {
> >> -    /* a fatal error occurred */
> >> -    return 0;
> >> -  }
> >> +    {
> >> +      /* non-fatal error */
> >> +      errno = EAGAIN;
> >> +      return -1;
> >> +    }
> >> +  else
> >> +    {
> >> +      /* a fatal error occurred */
> >> +      errno = EPROTO;
> >> +      return 0;
> >> +    }
> > 
> > EPROTO is not universally available, AFAIK.  We have Gnulib's errno.h
> > that defines a value for it, but that just gets us through
> > compilation.  What do you expect this value to cause when it is
> > encountered by some code which needs to interpret it?  We need to make
> > sure the results will be sensible.
> 
> Sorry, I missed that. Right now, no code checks specifically for EPROTO,
> so any other fatal value would do as well. But EPROTO is fitting rather
> well in what is going on.
> 
> Any code that will get an errno = EPROTO will treat it as a fatal error
> and that is exactly what it should do. So there is no problem with that.

The problem that got me worried is whether some human-readable text
should be emitted when this error code is returned.  If that does
happen, we need to be sure the corresponding text is defined somewhere
in Emacs, because otherwise we will get "Unknown error" or some such,
which is a bug.

> I don't know if Emacs uses other non-universally available errno values
> as well. If it does, then, imho, it really does not matter here. If it
> does not, I will change it to something else, if requested to avoid the
> situation altogether. What do you think?

What alternatives could we use here, specifically?  I don't think I
understand the details of that, can you elaborate?

> GnuTLS 2.12 changed its default from lowat mode to non-lowat mode (and
> removed lowat mode with GnuTLS 3 completely). What this means is, there
> is no data left intentionally in the kernel buffers for a select to work
> properly. So we usually do not get any of the fds that belong to GnuTLS
> set as available when we do a select() call.
> 
> The old Emacs code only checked _all_ channels if there was no wait_proc
> and if _no_ fds (nfds == 0) were set as available by our previous
> select() call. That was wrong and missed a few important cases and could
> lead to unnecessary waits or stalls.
> 
> Previously, the old code treated a wait_proc as if just_wait_proc was
> set as well and only checked wait_proc and ignored the others. That was
> a bug as well.
> 
> The new code always (if nfds >= 0) checks all (if !just_wait_proc)
> channels and sets the available fds in addition to the ones reported by
> select().
> 
> I hope that clears it up. The new code is basically a lot simpler and it
> tries to do the right thing (tm). :-)

OK.  I get the idea, but we will have to see if this change causes any
harm in some subtle situation.

> >> +      // pselect() clears the file descriptor sets if no fd is ready (but
> >> +      // not if an error occurred), so should we to be compatible. 
> >> (Bug#21337)
> >> +      if (rfds) FD_ZERO (rfds);
> >> +      if (wfds) FD_ZERO (wfds);
> >> +      if (efds) FD_ZERO (efds);
> >> +    }
> > 
> > This is OK, but please don't use C++ style comments, it's not our
> > style.  Also, please be sure to test this when waiting on pselect is
> > interrupted by C-g, we had some problems in that area which had roots
> > in xg_select.
> 
> I will change that with the next version of the patches after I get your
> comments back.

Thanks.

> Regarding the problems you mentioned: Can you point me to a bug
> report and more information about that?

See bug#28630 and commit ea39d47.  Bug#25172 might also be relevant.

> If anything breaks with the new behavior, then that broken code is
> clearly doing something wrong and needs fixing. Personally, I have
> been running Emacs with the patches since I posted them and I have
> not run into a single issue.

Did you try the new code on a TTY frame?  C-g produces a SIGINT there.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]