qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH RFC] monitor: temporary fix for dead-lock on eve


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH RFC] monitor: temporary fix for dead-lock on event recursion
Date: Tue, 31 Jul 2018 09:05:03 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

Marc-André Lureau <address@hidden> writes:

> With a Spice port chardev, it is possible to reenter
> monitor_qapi_event_queue() (when the client disconnects for
> example). This will dead-lock on monitor_lock.
>
> Instead, use some TLS variables to check for recursion and queue the
> events.
[...]
>
> Note that error report is now moved to the first caller, which may
> receive an error for a recursed event. This is probably fine (95% of
> callers use &error_abort, the rest have NULL error and ignore it)

I'm not 100% sure I understand this paragraph, but it might be moot; see
[*] below.

>
> Signed-off-by: Marc-André Lureau <address@hidden>
> ---
>  monitor.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 50 insertions(+), 1 deletion(-)
>
> diff --git a/monitor.c b/monitor.c
> index d8d8211ae4..d580c5a79c 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -633,7 +633,7 @@ static void monitor_qapi_event_handler(void *opaque);
>   * applying any rate limiting if required.
>   */
>  static void
> -monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
> +monitor_qapi_event_queue_no_recurse(QAPIEvent event, QDict *qdict, Error 
> **errp)
>  {
>      MonitorQAPIEventConf *evconf;
>      MonitorQAPIEventState *evstate;
> @@ -688,6 +688,55 @@ monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, 
> Error **errp)
>      qemu_mutex_unlock(&monitor_lock);
>  }
>  
> +static void
> +monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
> +{
> +    Error *local_err = NULL;
> +    /*
> +     * If the function recurse, monitor_lock will dead-lock.
> +     * Instead, queue pending events in TLS.

recurses

The claim is correct before the patch.  But the comment needs to explain
current code.  Perhaps:

        * monitor_qapi_event_queue_no_recurse() is not reentrant: it
        * would deadlock on monitor_lock.  Work around by queueing
        * events in thread-local storage.

> +     * TODO: remove this, make it re-enter safe.
> +     */
> +    static __thread bool recurse;
> +    typedef struct MonitorQapiEvent {
> +        QAPIEvent event;
> +        QDict *qdict;
> +        QSIMPLEQ_ENTRY(MonitorQapiEvent) entry;
> +    } MonitorQapiEvent;
> +    MonitorQapiEvent *ev;
> +    static __thread QSIMPLEQ_HEAD(, MonitorQapiEvent) event_queue;

Let's put the static variables first.

> +
> +    if (!recurse) {
> +        QSIMPLEQ_INIT(&event_queue);
> +    }
> +
> +    ev = g_new(MonitorQapiEvent, 1);
> +    ev->qdict = qobject_ref(qdict);
> +    ev->event = event;
> +    QSIMPLEQ_INSERT_TAIL(&event_queue, ev, entry);
> +    if (recurse) {
> +        return;
> +    }
> +
> +    recurse = true;
> +
> +    while ((ev = QSIMPLEQ_FIRST(&event_queue)) != NULL) {
> +        QSIMPLEQ_REMOVE_HEAD(&event_queue, entry);

Could we use QSIMPLEQ_FOREACH_SAFE()?

> +        if (!local_err) {
> +            monitor_qapi_event_queue_no_recurse(ev->event, ev->qdict,
> +                                                &local_err);
> +        }

[*] This looks scary: we silently throw away events after event queuing
fails.

Fortunately, monitor_qapi_event_queue_no_recurse() cannot fail.  It
takes an Error ** parameter only so you can put it into @qmp_emit.

Aside: I wish we'd get rid of the indirection through @qmp_emit.

Let's pass &error_abort and drop @local_err.

> +        qobject_unref(ev->qdict);
> +        g_free(ev);
> +    }
> +
> +    recurse = false;

Aha: @recurse means we've reentered the function.

"Recurse" is imperative mood.  Misleading, as it's not an order to
recurse.

Rename to @recursed?  @reentered?

> +
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +    }
> +}
> +
>  /*
>   * This function runs evconf->rate ns after sending a throttled
>   * event.

monitor_lock clearly needs a rethink.  Your TODO comment suggests you
agree.  But that's something for 3.1.

I hate messing with locks at -rc3, but I also hate shipping known
deadlocks.  Your patch isn't pretty, but probably as simple as we can
make it for 3.0.  A few more review eyeballs would be nice.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]