Re: [Qemu-devel] [RFC PATCH] main-loop: Unconditionally unlock iothread


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [RFC PATCH] main-loop: Unconditionally unlock iothread
Date: Tue, 02 Apr 2013 13:11:13 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130311 Thunderbird/17.0.4

On 02/04/2013 11:04, Peter Crosthwaite wrote:
> Public bug: 1154328
> Broken Commit: a29753f8aa79a34a324afebe340182a51a5aef11
> 
> ATM, the timeout from g_pollfds_fill is inhibiting unlocking of the
> iothread. This is capable of causing a total deadlock when hw/serial
> is used as a device. The bug manifests when you go -nographic -serial
> mon:stdio and then paste 40 or more chars into the terminal.
> 
> My knowledge of this g_foo is vague at best, but my best working
> theory is this:
> 
> - First 8 chars are received by the serial device with no complaints.
> - The next 32 chars, serial returns false for can_receive() so they
>       are buffered by the MuxDriver object - mux_chr_read()
> - Buffer is full, so the 41st char causes a false return from the mux's own
>       can_read()
> - This propagates all the way up to glib_pollfds_fill and manifests
>       as a timeout

I suppose you mean "manifests as timeout==0".  The question is *which*
GSource has a timeout of zero?  Not the mux's: if mux_chr_can_read()
returns zero, the prepare function returns FALSE without touching the
timeout at all...

static gboolean io_watch_poll_prepare(GSource *source, gint *timeout_)
{
    IOWatchPoll *iwp = io_watch_poll_from_source(source);

    iwp->max_size = iwp->fd_can_read(iwp->opaque);
    if (iwp->max_size == 0) {
        return FALSE;
    }

    return g_io_watch_funcs.prepare(source, timeout_);
}
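
To illustrate why the zero timeout must come from some other source, here is a rough model of how per-source prepare results could be folded into one poll timeout. It is only a sketch and not GLib's actual code: ModelSource, model_io_watch_prepare, model_ready_prepare and model_context_timeout are hypothetical names, and the assumptions are that a source reporting "ready" forces the timeout to 0 and that the underlying I/O watch prepare asks for no deadline (-1).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GSource; these names exist in neither GLib nor QEMU. */
typedef struct ModelSource ModelSource;
struct ModelSource {
    int can_read;                                   /* like fd_can_read(opaque) */
    bool (*prepare)(ModelSource *src, int *timeout);
};

/* Same shape as io_watch_poll_prepare(): return early without touching *timeout. */
static bool model_io_watch_prepare(ModelSource *src, int *timeout)
{
    if (src->can_read == 0) {
        return false;               /* *timeout is left exactly as it was */
    }
    *timeout = -1;                  /* assumption: plain fd watch, no deadline */
    return false;
}

/* Some other source that reports itself ready immediately. */
static bool model_ready_prepare(ModelSource *src, int *timeout)
{
    (void)src;
    (void)timeout;
    return true;
}

/* Assumed aggregation: a ready source means 0, otherwise take the smallest
 * non-negative timeout requested; -1 means "block indefinitely". */
static int model_context_timeout(ModelSource *sources, size_t n)
{
    int timeout = -1;

    for (size_t i = 0; i < n; i++) {
        int t = -1;

        if (sources[i].prepare(&sources[i], &t)) {
            t = 0;
        }
        if (t >= 0 && (timeout < 0 || t < timeout)) {
            timeout = t;
        }
    }
    return timeout;
}
```

In this model a mux-like source with can_read == 0 leaves the aggregate at -1 when it is the only source; the aggregate drops to 0 only once a second, ready source is added.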

> - Timeout means no unlock of IOthread. Device land never sees any more
>       cycles so the serial port never progresses - no flushing of
>       buffer

Still, this is plausible, so the patch looks correct.

Paolo

> - Deadlock
> 
> Tested on petalogix_ml605 microblazeel machine model, which was faulty
> due to 1154328.
> 
> Fix by removing the conditions on unlocking the iothread. Don't know
> what else this will break but the timeout is certainly the wrong
> condition for the unlock. Probably the real solution is to have a more
> selective unlock policy.
> 
> I'm happy for someone to take this patch off my hands, or educate me on
> the correct implementation. For the peeps doing automated testing on
> nographic platforms this will get your build working again.
> 
> Signed-off-by: Peter Crosthwaite <address@hidden>
> ---
>  main-loop.c |    8 ++------
>  1 files changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/main-loop.c b/main-loop.c
> index eb80ff3..a376898 100644
> --- a/main-loop.c
> +++ b/main-loop.c
> @@ -194,15 +194,11 @@ static int os_host_main_loop_wait(uint32_t timeout)
>  
>      glib_pollfds_fill(&timeout);
>  
> -    if (timeout > 0) {
> -        qemu_mutex_unlock_iothread();
> -    }
> +    qemu_mutex_unlock_iothread();
>  
>      ret = g_poll((GPollFD *)gpollfds->data, gpollfds->len, timeout);
>  
> -    if (timeout > 0) {
> -        qemu_mutex_lock_iothread();
> -    }
> +    qemu_mutex_lock_iothread();
>  
>      glib_pollfds_poll();
>      return ret;
> 
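
For what it's worth, the effect of the two unlock policies can be sketched with a toy model. This is not QEMU code: ModelLoop, model_loop_iteration and model_run are hypothetical names, the lock is a plain flag, and "device progress" is just a counter that advances only while the flag is released around the poll.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy state: a flag standing in for the iothread mutex, plus a progress counter. */
typedef struct ModelLoop {
    bool lock_held;
    unsigned device_steps;   /* work other threads managed to do */
} ModelLoop;

/* One iteration shaped like os_host_main_loop_wait().
 * unconditional == false models the old "if (timeout > 0)" guard,
 * unconditional == true models the patched behaviour. */
static void model_loop_iteration(ModelLoop *loop, uint32_t timeout,
                                 bool unconditional)
{
    bool release = unconditional || timeout > 0;

    if (release) {
        loop->lock_held = false;
    }

    /* g_poll() would run here; other threads progress only if the lock is free. */
    if (!loop->lock_held) {
        loop->device_steps++;
    }

    if (release) {
        loop->lock_held = true;
    }
}

/* Run several iterations with a fixed timeout and report the progress made. */
static unsigned model_run(uint32_t timeout, bool unconditional,
                          unsigned iterations)
{
    ModelLoop loop = { .lock_held = true, .device_steps = 0 };

    for (unsigned i = 0; i < iterations; i++) {
        model_loop_iteration(&loop, timeout, unconditional);
    }
    return loop.device_steps;
}
```

With the timeout stuck at 0, the guarded variant never releases the flag and the counter stays at 0, which corresponds to the starvation described in the report; the unconditional variant advances on every iteration.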
