qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 1/5] monitor: screen_dump async


From: Alon Levy
Subject: Re: [Qemu-devel] [PATCH 1/5] monitor: screen_dump async
Date: Wed, 26 Oct 2011 13:48:39 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, Oct 25, 2011 at 04:41:18PM -0200, Luiz Capitulino wrote:
> On Tue, 25 Oct 2011 15:21:11 +0200
> Alon Levy <address@hidden> wrote:
> 
> > On Tue, Oct 25, 2011 at 10:51:30AM -0200, Luiz Capitulino wrote:
> > > On Tue, 25 Oct 2011 12:13:09 +0200
> > > Alon Levy <address@hidden> wrote:
> > > 
> > > > On Mon, Oct 24, 2011 at 10:31:48PM -0200, Luiz Capitulino wrote:
> > > > > On Mon, 24 Oct 2011 19:29:37 +0200
> > > > > Alon Levy <address@hidden> wrote:
> > > > > 
> > > > > > On Mon, Oct 24, 2011 at 01:45:16PM -0200, Luiz Capitulino wrote:
> > > > > > > On Mon, 24 Oct 2011 17:13:14 +0200
> > > > > > > Gerd Hoffmann <address@hidden> wrote:
> > > > > > > 
> > > > > > > > On 10/24/11 14:02, Alon Levy wrote:
> > > > > > > > > Make screen_dump monitor command an async command to allow 
> > > > > > > > > next for qxl
> > > > > > > > > to implement it as a initiating call to red_worker and 
> > > > > > > > > completion on
> > > > > > > > > callback, to fix a deadlock when issueing a screendump 
> > > > > > > > > command via
> > > > > > > > > libvirt while connected with a libvirt controlled spice-gtk 
> > > > > > > > > client.
> > > > > > > > 
> > > > > > > > Approach looks reasonable to me.  Patch breaks the build 
> > > > > > > > though, you've
> > > > > > > > missed a bunch of screen_dump functions in non-x86 targets.
> > > > > > > 
> > > > > > > There are two problems actually.
> > > > > > > 
> > > > > > > The first one is that changing an existing command from 
> > > > > > > synchronous
> > > > > > > to asynchronous is an incompatible change because asynchronous 
> > > > > > > commands
> > > > > > > semantics is different. For an example of possible problems please
> > > > > > > check: https://bugzilla.redhat.com/show_bug.cgi?id=623903.
> > > > > > > 
> > > > > > > The second problem is that the existing asynchronous interface in 
> > > > > > > the
> > > > > > > monitor is incomplete and has never been used for real. Our plan 
> > > > > > > is to
> > > > > > > use QAPI's async support, but that has not landed in master yet 
> > > > > > > and iirc
> > > > > > > there wasn't consensus about it. I also think it's a bit late for 
> > > > > > > its
> > > > > > > inclusion in 1.0 (and certainly not a candidate for stable).
> > > > > > > 
> > > > > > > If all you need here is to delay sending the response, then maybe 
> > > > > > > the
> > > > > > > current interface could work (although I honestly don't trust it 
> > > > > > > and
> > > > > > > regret not having dropped it). Otherwise our only choice would be 
> > > > > > > to
> > > > > > > work on getting the QAPI async support merged.
> > > > > > 
> > > > > > My problem is that the io thread keeps the global mutex during the 
> > > > > > wait,
> > > > > > that's why the async monitor is perfect for what I want - this is
> > > > > > exactly what it does.
> > > > > 
> > > > > Let's not mix internal implementation details with what we want as
> > > > > an external interface.
> > > > > 
> > > > > Can't you just make a vga_hw_screen_dump() specific callback?
> > > > > 
> > > > 
> > > > I don't understand how that would help - if the monitor command doesn't
> > > > return (normal sync operation) then the mutex is never dropped, and any
> > > > callback won't change that.
> > > 
> > > I'm trying to figure out a different solution.
> > > 
> > > Our primary motivation for making a QMP command asynchronous must be to
> > > give clients the ability to keep issuing commands while "slow" commands
> > > are running. If that's not what you want nor your primary motivation,
> > > then you're probably taking the wrong approach.
> > 
> > That sounds right, and it was what I assumed the async monitor
> > implementation would do, boy was I surprised to discover it doesn't do
> > any such thing, but what it does do is return early, allow *other* io
> > related events (select returns) to be handled, and keeps the serialized
> > only-one-command-ongoing monitor usage.
> 
> Yes, if that turns out to be needed internally then we'll have to add
> such functionality (but probably not as part of QMP's async support iiuc).
> 
> > > If that is what you want, then you'll have to add a new command, because
> > > changing from asynchronous to synchronous is an incompatible change _and_
> > > you shouldn't use the current interface, because it's botched (actually,
> > > I think I'm going to drop it right now as my last series fixed its only 
> > > user).
> > > 
> > 
> > This is not what I want. I understand of course that it is what one
> > would design an async monitor / api to allow (it was after all what I
> > thought async monitor support meant).
> > 
> > > Using a botched interface that doesn't do what's supposed to do but 
> > > happens to
> > > solve a bug as a side effect will very likely end in tears at some point 
> > > in
> > > the future.
> > > 
> > 
> > Right, but it's the opposite of the current case.
> > 
> > > Now, I did some research and found this description of the problem:
> > > 
> > > """
> > > In testing my patches for 'add_client' support with SPICE, I noticed
> > > deadlock in the QEMU/SPICE code. It only happens if I close a SPICE
> > > client and then immediately reconnect within about 1 second. If I
> > > wait a couple of seconds before reconnecting the SPICE client I don't
> > > see the deadlock.
> > > """
> > > (http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg02599.html)
> > > 
> > > Is this accurate? Why does it _work_ after 1 second?
> > >
> > 
> > This is an unrelated bug, I know I said different on the thread but now
> > rereading the callstack it is the channel_event locking workaround - and
> > it is fixed by a spice-server only patch (which stops calling
> > channel_event from the worker thread).
> 
> So, can you try to describe the problem you're having to someone who is
> not familiar with qlx and its locking?
> 

I'm suddenly not sure this patchset even fixes the problem that I said
it does, so I'll update once I made sure what it actually fixes.

> > 
> > > > On the other hand, thinking a bit about the reference to 623903 baloon
> > > > bug, I don't see a problem: the change doesn't affect the semantics for
> > > > any other device except qxl, which I've tested. For any other device,
> > > > the only difference is that instead of:
> > > > 
> > > > do_screen_dump call
> > > >  device specific implementation
> > > >  return
> > > > 
> > > > It becomes
> > > > 
> > > > do_screen_dump call
> > > >  device specific implementation (not qxl)
> > > >   callback called (always - not conditional, no one stores it except
> > > >    qxl)
> > > >  return
> > > 
> > > 
> > > > 
> > > > > > I haven't looked at QAPI async support, but I
> > > > > > understand it's a bit in the future.
> > > > > 
> > > > > Yes, it's not for the immediate term.
> > > > 
> > > 
> > 
> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]