qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Possible bug in monitor code


From: Luiz Capitulino
Subject: Re: [Qemu-devel] Possible bug in monitor code
Date: Thu, 23 Jan 2014 13:28:58 -0500

On Thu, 23 Jan 2014 17:33:33 +0200
Stratos Psomadakis <address@hidden> wrote:

> On 01/23/2014 03:54 PM, Luiz Capitulino wrote:
> > On Thu, 23 Jan 2014 08:44:02 -0500
> > Luiz Capitulino <address@hidden> wrote:
> >
> >> On Thu, 23 Jan 2014 19:23:51 +0800
> >> Fam Zheng <address@hidden> wrote:
> >>
> >>> Bcc: 
> >>> Subject: Re: [Qemu-devel] Possible bug in monitor code
> >>> Reply-To: 
> >>> In-Reply-To: <address@hidden>
> >>>
> >>> On Thu, 01/23 12:17, Stratos Psomadakis wrote:
> >>>> On 01/23/2014 05:07 AM, Fam Zheng wrote:
> >>>>> On Wed, 01/22 17:53, Stratos Psomadakis wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> we've encountered a weird issue regarding monitor (qmp and hmp) 
> >>>>>> behavior
> >>>>>> with qemu-1.7 (and qemu-1.5). The following steps will reproduce the 
> >>>>>> issue:
> >>>>>>
> >>>>>>     1) Client A connects to qmp socket with socat
> >>>>>>     2) Client A gets greeting message {"QMP": {"version": ..}
> >>>>>>     3) Client A waits (select on the socket's fd)
> >>>>>>     4) Client B tries to connect to the *same* qmp socket with socat
> >> QMP/HMP can only handle a single client per connection, so this is
> >> not supported. What you could do is to create multiple QMP/HMP instances
> >> on the command-line, but this has never been fully tested so we don't
> >> officially support this either (although it may work).
> >>
> >> Now, not gracefully failing on step 4 is a real bug here. I think the
> >> best thing to do would be to close client's B connection. Patches are
> >> welcome :)
> > Thinking about this again, I think this should already be the case
> > (as I don't believe chardev code is sitting on accept()). So maybe
> > you triggered a race on chardev code or broke something there.
> 
> Yes, the chardev code doesn't accept any more connections until the
> currently active connection is closed.
> 
> However, when a new client tries to connect (while there's another qmp /
> hmp connection active) the kernel, as long as there's room in the socket
> backlog, will return to the client that the connection has been
> successfully established and will queue the connection to be accepted
> later, when qmp actually finishes with its active connection and
> re-executes accept().
> 
> The problem arises if the new client closes the connection before the
> older one is finished. When qmp runs accept() again, it will get a
> socket fd for a client who has now closed the connection. At this point,
> the monitor control event handler will get triggered and try to write /
> flush the greeting message to the client. The client however has closed
> its socket so the write will fail with SIGPIPE / EPIPE. Neither the
> qemu-char nor the monitor code seem to handle this situation.
> 
> Btw, have you tried to reproduce it?

Not yet, I may have some time tomorrow. How reproducible is it for you?

Maybe you could try to reproduce with a different subsystem so that we
can rule out or confirm monitor's involvement? Like -serial?

Another question: have you tried to reproduce with an old qemu version
(say v1.0) to see if this bug always existed? If the bug was introduced
in some recent QEMU version you could try to bisect it.

> 
> Thanks,
> Stratos
> 
> >
> >>>>>>     5) Client B does *NOT* get any greating message
> >>>>>>     6) Client B waits (select on the socket's fd)
> >>>>>>     7) Client B closes connection (kill socat)
> >>>>>>     8) Client A quits too
> >>>>>>     9) Client C connects to qmp socket
> >>>>>>     10) Client C gets *two* greeting messages!!!
> >>>>> Hi Stratos, thank you for debugging and reporting this.
> >>>>>
> >>>>> I tested this sequence but can't fully reproduce this. What I see is 5) 
> >>>>> but no
> >>>>> 10). Client C acts normally. And your patch below doesn't solve it for 
> >>>>> me.
> >>>> Hm, which qemu version (or repo branch / tag) did you use? We did a
> >>>> quick scan of the master branch code / commits, but we didn't find
> >>>> anything that might fix the issue.
> >>> I tried on qemu.git master, and also 1.7. I think it is a bug: in my 
> >>> test, step
> >>> 5), B not getting any greeting message.
> >>>
> >>> But I get only one greeting message in step 10), which is a bit different 
> >>> from
> >>> what you reported.
> >>>
> >>> And no difference with your patch applied.
> >>>
> >>> Cc'ing Luiz who maintains the monitor code and may have more ideas.
> >>>
> >>> Thanks,
> >>>
> >>> Fam
> >>>
> >>>>> To submit a patch, please follow instructions as described in
> >>>>> http://wiki.qemu.org/Contribute/SubmitAPatch
> >>>>> so it could be picked up by maintainers. Specifically, you need to 
> >>>>> format your
> >>>>> patch email with "git format-patch" and add a "Signed-off-by:" line in 
> >>>>> your
> >>>>> patch email.
> >>>> Ok. If any dev can confirm that this is a bug (and that the patch below
> >>>> is the correct way to fix it) I'll resubmit it properly.
> >>>>
> >>>> Thanks,
> >>>> Stratos
> >>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Fam
> >>>>>
> >>>>>> After some investigation, we traced it down to the monitor_flush()
> >>>>>> function in monitor.c. Specifically, when a second client connects to
> >>>>>> the qmp (client B), while another one is already using it (client A), 
> >>>>>> we
> >>>>>> get the following from stracing the second client (client B):
> >>>>>>
> >>>>>>     connect(3, {sa_family=AF_FILE, path="foo.mon"}, 9) = 0
> >>>>>>     getsockname(3, {sa_family=AF_FILE, NULL}, [2]) = 0
> >>>>>>     select(4, [0 3], [1 3], [], NULL)       = 2 (out [1 3])
> >>>>>>     select(4, [0 3], [], [], NULL
> >>>>>>
> >>>>>> So, the connect() syscall from client B succeeds, although client B
> >>>>>> connection has not yet been accepted by the qmp server (it's still in
> >>>>>> the backlog of the qmp listening socket).
> >>>>>>
> >>>>>> After killing client B and then client A, we see the following when
> >>>>>> stracing the qemu proc:
> >>>>>>
> >>>>>>     22363 accept4(6, {sa_family=AF_FILE, NULL}, [2], SOCK_CLOEXEC) = 9
> >>>>>>     22363 fcntl(9, F_GETFL)                 = 0x2 (flags O_RDWR)
> >>>>>>     22363 fcntl(9, F_SETFL, O_RDWR|O_NONBLOCK) = 0
> >>>>>>     22363 fstat(9, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
> >>>>>>     22363 fcntl(9, F_GETFL)                 = 0x802 (flags
> >>>>>>     O_RDWR|O_NONBLOCK)
> >>>>>>     22363 write(9, "{\"QMP\": {\"version\": {\"qemu\": {\"m"..., 127) =
> >>>>>>     -1 EPIPE (Broken pipe)
> >>>>>>     22363 --- SIGPIPE (Broken pipe) @ 0 (0) ---
> >>>>>>
> >>>>>> The qmp server / qemu accepts the connection from client B (who has now
> >>>>>> closed the connection) and tries to write the greeting message to the
> >>>>>> socket fd. This results in write returning an error (EPIPE).
> >>>>>>
> >>>>>> The monitor_flush() function doesn't seem to handle this case (write
> >>>>>> error). Instead, it adds a watch / handler to retry the write 
> >>>>>> operation.
> >>>>>> Thus, mon->outbuf is not cleaned up properly, which results in 
> >>>>>> duplicate
> >>>>>> greeting messages for the next client to connect.
> >>>>>>
> >>>>>> The following seems to do the trick.
> >>>>>>
> >>>>>> diff --git a/monitor.c b/monitor.c
> >>>>>> index 845f608..5622f20 100644
> >>>>>> --- a/monitor.c
> >>>>>> +++ b/monitor.c
> >>>>>> @@ -288,8 +288,8 @@ void monitor_flush(Monitor *mon)
> >>>>>>  
> >>>>>>      if (len && !mon->mux_out) {
> >>>>>>          rc = qemu_chr_fe_write(mon->chr, (const uint8_t *) buf, len);
> >>>>>> -        if (rc == len) {
> >>>>>> -            /* all flushed */
> >>>>>> +        if ((rc < 0 && errno != EAGAIN) || (rc == len)) {
> >>>>>> +            /* all flushed or error */
> >>>>>>              QDECREF(mon->outbuf);
> >>>>>>              mon->outbuf = qstring_new();
> >>>>>>              return;
> >>>>>>
> >>>>>> Comments?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Stratos
> >>>>>>
> >>>>>> -- 
> >>>>>> Stratos Psomadakis
> >>>>>> <address@hidden>
> >>>>>>
> >>>>
> >>>> -- 
> >>>> Stratos Psomadakis
> >>>> <address@hidden>
> >>>>
> >>>>
> >>>
> 
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]