qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v3 2/6] monitor: flush qmp responses when CLOSED


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v3 2/6] monitor: flush qmp responses when CLOSED
Date: Fri, 15 Jun 2018 10:11:34 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

Peter Xu <address@hidden> writes:

> Previously we clean up the queues when we got CLOSED event.  It was used
> to make sure we won't send leftover replies/events of a old client to a
> new client.  Now this patch does an extra operation to flush the
> response queue before cleaning up.
>
> In most cases, a QMP session will be based on a bidirectional channel (a
> TCP port, for example, we read/write to the same socket handle), so in
> port and out port of the backend chardev are fundamentally the same
> port. In these cases, it does not really matter much on whether we'll
> flush the response queue since flushing should possibly fail

s/should possibly fail anyways/will fail anyway/

> anyways (just imagine to write to a socket that has already closed).
> However there can be cases where in & out ports of the QMP monitor's
> backend chardev are separated.  One true example:

Suggest to drop "true", or say "Here's an example".

>   cat $QMP_COMMANDS | qemu -qmp stdio ... | filter_commands
>
> In this case, the backend is fd-typed, and it is connected to stdio
> where in port is stdin and out port is stdout.  Now if we drop all the
> events on the response queue then filter_command process might miss some
> events that it might expect.  The thing is that, when stdin closes,
> stdout might still be there alive!
>
> In practice, I encountered a very strange SHUTDOWN event missing when

The event isn't strange, its missing is (sort of).  Suggest to drop
"very strange".

> running test with iotest 087 with Out-Of-Band enabled. One condition
> could be this (after "quit" command is executed and QEMU quits the main
> loop):

"could be" suggests this may or may not demonstrate the loss of the
event.  Misleading.  "Here's one of the ways this can happen" would be
clearer.

>
> 1. [main thread] QEMU queues one SHUTDOWN event into response queue

s/one/a/

>
> 2. "cat" terminates (to distinguish it from the animal, I quote it).
>    Logically it can terminate even earlier, but let's just assume it's
>    here.

Suggest to drop the second sentence.  You're describing one way for the
bug to bite here, not all possible ways.

>
> 3. [monitor iothread] QEMU reads EOF from stdin, which connects to the
>    "cat" process

Suggest "QEMU's monitor iothread reads".

Suggest to drop ", which connects...", because that should be obvious.

> 4. [monitor iothread] QEMU calls the CLOSED event hook for the monitor,

"QEMU's monitor iothread" again.

>    which will clean up the response queue of the monitor, then the
>    SHUTDOWN event is dropped

Suggest to say "destroy" instead of "clean up", to make perfectly clear
that we're throwing away the queue.
>
> 5. [main thread] clean up the monitors in monitor_cleanup(), when

"QEMU's main thread cleans up the monitors in monitor_cleanup().  When"

>    trying to flush pending responses, it sees nothing.  SHUTDOWN is
>    lost forever

Missing a period.

> Note that before the monitor iothread was introduced, step [4]/[5] could
> never happen since the main loop was the only place to detect the EOF
> event of stdin and run the CLOSED event hooks.  Now things can happen in
> parallel in the iothread.
>
> Without this patch, iotest 087 will have ~10% chance to miss the
> SHUTDOWN event and fail when with Out-Of-Band enabled:
>
> 087 8s ... - output mismatch (see 087.out.bad)
> --- /home/peterx/git/qemu/tests/qemu-iotests/087.out    2018-06-01 
> 18:44:22.378982462 +0800
> +++ /home/peterx/git/qemu/bin/tests/qemu-iotests/087.out.bad    2018-06-01 
> 18:53:44.267840928 +0800
> @@ -8,7 +8,6 @@
>  {"return": {}}
>  {"error": {"class": "GenericError", "desc": "'node-name' must be specified 
> for the root node"}}
>  {"return": {}}
> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "SHUTDOWN", "data": {"guest": false}}
>
>  === Duplicate ID ===
> @@ -53,7 +52,6 @@
>  {"return": {}}
>  {"return": {}}
>  {"return": {}}
> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "SHUTDOWN", "data": {"guest": false}}

Suggest to indent the quoted diff a bit, so make it more obviously not
part of the patch.  In fact, git-am chokes on it for me.

> This patch fixes the problem.
>
> Fixes: 6d2d563f8c ("qmp: cleanup qmp queues properly", 2018-03-27)
> Suggested-by: Markus Armbruster <address@hidden>
> Signed-off-by: Peter Xu <address@hidden>
>
> Signed-off-by: Peter Xu <address@hidden>

Don't let my nitpicking deceive you; this is a really nice commit
message.

One more suggestion.  Your first paragraph describes the solution.  I
like to start with the problem instead.  More so when the problem
description is long, as it is here, because by the time I'm done reading
the problem, I'm prone to have forgotten the solution, and go "okay, and
how's the patch addressing the problem?"  But please use your judgement.

> ---
>  monitor.c | 30 +++++++++++++++++++++++++-----
>  1 file changed, 25 insertions(+), 5 deletions(-)
>
> diff --git a/monitor.c b/monitor.c
> index 6d0cec552e..e59d4f09ac 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -512,20 +512,39 @@ struct QMPResponse {
>  };
>  typedef struct QMPResponse QMPResponse;
>  
> +static QObject *monitor_qmp_response_pop_one(Monitor *mon)
> +{
> +    QObject *data;
> +
> +    qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
> +    data = g_queue_pop_head(mon->qmp.qmp_responses);
> +    qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
> +
> +    return data;
> +}

Note for later: returns null only when the queue is empty (we never push
null onto this queue).

> +
> +static void monitor_qmp_response_flush(Monitor *mon)
> +{
> +    QObject *data;
> +
> +    while ((data = monitor_qmp_response_pop_one(mon))) {
> +        monitor_json_emitter_raw(mon, data);
> +        qobject_unref(data);
> +    }
> +}

This flushes the queue to @mon.  Good.

> +
>  /*
>   * Return one QMPResponse.  The response is only valid if
>   * response.data is not NULL.
>   */
> -static QMPResponse monitor_qmp_response_pop_one(void)
> +static QMPResponse monitor_qmp_response_pop(void)
>  {
>      Monitor *mon;
>      QObject *data = NULL;
>  
>      qemu_mutex_lock(&monitor_lock);
>      QTAILQ_FOREACH(mon, &mon_list, entry) {
> -        qemu_mutex_lock(&mon->qmp.qmp_queue_lock);
> -        data = g_queue_pop_head(mon->qmp.qmp_responses);
> -        qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
> +        data = monitor_qmp_response_pop_one(mon);
>          if (data) {
>              break;
>          }
       }
       qemu_mutex_unlock(&monitor_lock);
       return (QMPResponse) { .mon = mon, .data = data };
   }

Not this patch's fault, but this function gave me pause until I got the
loop's purpose: the function returns a response from *any* monitor.

Moreover, your naming creates an inconsistency with
monitor_qmp_requests_pop_one().

I append my attempt to make it easier to understand, and restore
consistency.  I guess I'd split it into a preparatory patch and a fixup
for this one.

> @@ -539,7 +558,7 @@ static void monitor_qmp_bh_responder(void *opaque)
>      QMPResponse response;
>  
>      while (true) {
> -        response = monitor_qmp_response_pop_one();
> +        response = monitor_qmp_response_pop();
>          if (!response.data) {
>              break;
>          }
> @@ -4366,6 +4385,7 @@ static void monitor_qmp_event(void *opaque, int event)
>          mon_refcount++;
>          break;
>      case CHR_EVENT_CLOSED:
> +        monitor_qmp_response_flush(mon);

This wastes work in the common case when @mon's output fd has been
closed already.  I think that's just fine.

>          monitor_qmp_cleanup_queues(mon);
>          json_message_parser_destroy(&mon->qmp.parser);
>          json_message_parser_init(&mon->qmp.parser, handle_qmp_command);


diff --git a/monitor.c b/monitor.c
index e59d4f09ac..af28977549 100644
--- a/monitor.c
+++ b/monitor.c
@@ -512,7 +512,7 @@ struct QMPResponse {
 };
 typedef struct QMPResponse QMPResponse;
 
-static QObject *monitor_qmp_response_pop_one(Monitor *mon)
+static QObject *monitor_qmp_requests_pop(Monitor *mon)
 {
     QObject *data;
 
@@ -527,41 +527,39 @@ static void monitor_qmp_response_flush(Monitor *mon)
 {
     QObject *data;
 
-    while ((data = monitor_qmp_response_pop_one(mon))) {
+    while ((data = monitor_qmp_requests_pop(mon))) {
         monitor_json_emitter_raw(mon, data);
         qobject_unref(data);
     }
 }
 
 /*
- * Return one QMPResponse.  The response is only valid if
- * response.data is not NULL.
+ * Pop a QMPResponse from any monitor's response queue into @response.
+ * Return NULL when all queues are empty, else @response.
  */
-static QMPResponse monitor_qmp_response_pop(void)
+static QMPResponse *monitor_qmp_response_pop_any(QMPResponse *response)
 {
     Monitor *mon;
     QObject *data = NULL;
 
     qemu_mutex_lock(&monitor_lock);
     QTAILQ_FOREACH(mon, &mon_list, entry) {
-        data = monitor_qmp_response_pop_one(mon);
+        data = monitor_qmp_requests_pop(mon);
         if (data) {
+            response->mon = mon;
+            response->data = data;
             break;
         }
     }
     qemu_mutex_unlock(&monitor_lock);
-    return (QMPResponse) { .mon = mon, .data = data };
+    return data ? response : NULL;
 }
 
 static void monitor_qmp_bh_responder(void *opaque)
 {
     QMPResponse response;
 
-    while (true) {
-        response = monitor_qmp_response_pop();
-        if (!response.data) {
-            break;
-        }
+    while (monitor_qmp_response_pop_any(&response)) {
         monitor_json_emitter_raw(response.mon, response.data);
         qobject_unref(response.data);
     }
@@ -4126,7 +4124,7 @@ static void monitor_qmp_dispatch_one(QMPRequest *req_obj)
  * when we process one request on a specific monitor, we put that
  * monitor to the end of mon_list queue.
  */
-static QMPRequest *monitor_qmp_requests_pop_one(void)
+static QMPRequest *monitor_qmp_requests_pop_any(void)
 {
     QMPRequest *req_obj = NULL;
     Monitor *mon;
@@ -4158,7 +4156,7 @@ static QMPRequest *monitor_qmp_requests_pop_one(void)
 
 static void monitor_qmp_bh_dispatcher(void *data)
 {
-    QMPRequest *req_obj = monitor_qmp_requests_pop_one();
+    QMPRequest *req_obj = monitor_qmp_requests_pop_any();
 
     if (req_obj) {
         trace_monitor_qmp_cmd_in_band(qobject_get_try_str(req_obj->id) ?: "");



reply via email to

[Prev in Thread] Current Thread [Next in Thread]