Re: [Qemu-devel] lock-free monitor?
From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] lock-free monitor?
Date: Wed, 10 Feb 2016 15:33:42 +0000
User-agent: Mutt/1.5.24 (2015-08-30)
* Markus Armbruster (address@hidden) wrote:
> "Dr. David Alan Gilbert" <address@hidden> writes:
>
> > * Markus Armbruster (address@hidden) wrote:
> >> "Dr. David Alan Gilbert" <address@hidden> writes:
> >>
> >> > Hi,
> >> > I wondered what it would take to be able to do a lock-free monitor;
> >> > i.e. one that could respond to (some) commands while the qemu big lock
> >> > is held.
> >>
> >> Requires a careful audit of the monitor code.
> >>
> >> For instance, cur_mon needs to be made thread local for running multiple
> >> monitors concurrently.
> >
> > Hmm that's fun; and cur_mon is used all over the place when 'fd' passing
> > is used - although that probably should be cleaned up somehow.
>
> When cur_mon got created, we had no threads. Even now, monitors only
> ever run under the BQL, so cur_mon's still fine.
>
> Having a current monitor is simpler than passing one around, especially
> when you go through layers that don't know about it. Such cases exist,
> and they made us abandon commit 376253e's hope to eliminate cur_mon.
> Unfortunately, I've since forgotten the details, so I can't point you to
> an example.
Yes; I think if we can try to remove the uses of cur_mon outside the
monitor one at a time, that might move it in the right direction.
> >> > My reason for asking is that there are cases in migration and colo
> >> > where a network block device has failed and is blocking, but it fails
> >> > at just the wrong time while the lock is held, and now, you don't have
> >>
> >> Is it okay to block while holding the BQL?
> >
> > I'd hope the simple answer was no; unfortunately the more complex answer
> > is that we keep finding places that do.
>
> Would you call these bugs?
>
> Even if you do, we may want to recover from them.
They're bugs in the end result that need fixing; so for example a
crashing NBD server shouldn't lock you out of the monitor during a migrate,
and I don't think we currently have other solutions.
> > The cases I'm aware of are
> > mostly bdrv_drain_all calls in migration/colo, where one of the block
> > devices (e.g. an NBD network device) fails. Generally these are called
> > at the end of a migration cycle when we're just ensuring that everything
> > really is drained and are normally called with the lock held to ensure
> > that nothing else is generating new traffic as we're clearing everything
> > else.
>
> Using the BQL for that drives a cork into the I/O pipeline with a Really
> Big Hammer. Can we do better?
>
> The answer may be "yes, but don't hold your breath." Then we may need
> means to better cope with the bugs.
Yeah, there are some places where the migration code holds the BQL and
I don't really understand all the things it's guarding against.
> >> > any way to unblock it since you can't do anything on the monitors.
> >> > If I could issue a command then I could have it end up doing a
> >> > shutdown(2)
> >> > on the network connection and free things up.
> >> >
> >> > Maybe this would also help real-time people?
> >> >
> >> > I was thinking then, we could:
> >> > a) Have a monitor that wasn't tied to the main loop at all - each
> >> > instance would be its own thread and have its own poll() (probably
> >> > using the new IO channels?)
> >> > b) For most commands it would take the big lock before executing the
> >> > command and release it afterwards - so there would be no risk in
> >> > those commands.
> >>
> >> Saves you auditing their code, which would probably be impractical.
> >>
> >> > c) Some commands you could mark as 'lock free' - they would be
> >> > required not to take any locks or wait for *anything*, and for those
> >> > the monitor would not take the lock.
> >>
> >> They also must not access shared state without suitable synchronization.
> >
> > Right; I'd expect most of the cases to either be reading simple status
> > information, or having to find whatever they need to do using RCU list
> > walks.
> >
> >> > d) Perhaps have some way of restricting a particular monitor
> >> > session to only allowing non-blocking commands.
> >>
> >> Why? To ensure you always have an emergency monitor available?
> >
> > Yes, mostly to stop screwups of putting a potentially blocking command
> > down your emergency route.
>
> Understand.
>
> >> > e) Then I'd have to have a way of finding the broken device in a
> >> > lock-free way (RCU list walk?) and issuing the shutdown(2).
> >> >
> >> > Bonus points to anyone who can statically check commands in (c).
> >>
> >> Tall order.
> >
> > Yes :-) A (compile-time selected) dynamic check might be possible
> > where the normal global lock functions and RCU block functions check
> > if they're in a monitor thread and assert.
> >
> >> > Does this make sense to everyone else, or does anyone have any better
> >> > suggestions?
> >>
> >> Sounds like a big, hairy task to me.
> >>
> >> Any alternatives?
> >
> > I hope so; although isn't a lock-free monitor a generally good idea
> > that we should pursue anyway?
>
> There's no shortage of good ideas, but our bandwidth to implement,
> review, document and test them has limits.
>
> The general plan is to reduce the scope of the BQL gradually.
> Liberating the monitor from the BQL is one step on the road to complete
> BQL elimination. The question is whether this step needs to be taken
> *now*.
COLO probably needs something; although in its case I'm going to investigate
whether some evil with iptables might be able to force a recovery by causing
the socket to close (after all, we're assuming that case involves the
destination being dead - but I don't want to wait for a full TCP timeout).
Dave
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK