Re: [PATCH 4/6] qemu, vhost-user: Extend protocol to start/stop/flush sl

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 4/6] qemu, vhost-user: Extend protocol to start/stop/flush sl

From:	Stefan Hajnoczi
Subject:	Re: [PATCH 4/6] qemu, vhost-user: Extend protocol to start/stop/flush slave channel
Date:	Mon, 8 Feb 2021 17:41:43 +0000

On Fri, Jan 29, 2021 at 10:11:55AM -0500, Vivek Goyal wrote:
> On Fri, Jan 29, 2021 at 09:16:00AM -0500, Vivek Goyal wrote:
> > On Thu, Jan 28, 2021 at 04:52:34PM +0000, Stefan Hajnoczi wrote:
> > > On Mon, Jan 25, 2021 at 01:01:13PM -0500, Vivek Goyal wrote:
> > > > Currently we don't have a mechanism to flush slave channel while 
> > > > shutting
> > > > down vhost-user device and that can result a deadlock. Consider 
> > > > following
> > > > scenario.
> > > > 
> > > > 1. Slave gets a request from guest on virtqueue (say index 1, vq1), to 
> > > > map
> > > >    a portion of file in qemu address space.
> > > > 
> > > > 2. Thread serving vq1 receives this request and sends a message to qemu 
> > > > on
> > > >    slave channel/fd and gets blocked in waiting for a response from 
> > > > qemu.
> > > > 
> > > > 3. In the mean time, user does "echo b > /proc/sysrq-trigger" in guest. 
> > > > This
> > > >    leads to qemu reset and ultimately in main thread we end up in
> > > >    vhost_dev_stop() which is trying to shutdown virtqueues for the 
> > > > device.
> > > > 
> > > > 4. Slave gets VHOST_USER_GET_VRING_BASE message to shutdown a virtqueue 
> > > > on
> > > >    unix socket being used for communication.
> > > > 
> > > > 5. Slave tries to shutdown the thread serving vq1 and waits for it to
> > > >    terminate. But vq1 thread can't terminate because it is waiting for
> > > >    a response from qemu on slave_fd. And qemu is not processing slave_fd
> > > >    anymore as qemu is ressing (and not running main event loop anymore)
> > > >    and is waiting for a response to VHOST_USER_GET_VRING_BASE.
> > > > 
> > > > So we have a deadlock. Qemu is waiting on slave to response to
> > > > VHOST_USER_GET_VRING_BASE message and slave is waiting on qemu to
> > > > respond to request it sent on slave_fd.
> > > > 
> > > > I can reliably reproduce this race with virtio-fs.
> > > > 
> > > > Hence here is the proposal to solve this problem. Enhance vhost-user
> > > > protocol to properly shutdown slave_fd channel. And if there are pending
> > > > requests, flush the channel completely before sending the request to
> > > > shutdown virtqueues.
> > > > 
> > > > New workflow to shutdown slave channel is.
> > > > 
> > > > - Qemu sends VHOST_USER_STOP_SLAVE_CHANNEL request to slave. It waits
> > > >   for an reply from guest.
> > > > 
> > > > - Then qemu sits in a tight loop waiting for
> > > >   VHOST_USER_SLAVE_STOP_CHANNEL_COMPLETE message from slave on slave_fd.
> > > >   And while waiting for this message, qemu continues to process requests
> > > >   on slave_fd to flush any pending requests. This will unblock threads
> > > >   in slave and allow slave to shutdown slave channel.
> > > > 
> > > > - Once qemu gets VHOST_USER_SLAVE_STOP_CHANNEL_COMPLETE message, it 
> > > > knows
> > > >   no more requests will come on slave_fd and it can continue to shutdown
> > > >   virtqueues now.
> > > > 
> > > > - Once device starts again, qemu will send 
> > > > VHOST_USER_START_SLAVE_CHANNEL
> > > >   message to slave to open the slave channel and receive requests.
> > > > 
> > > > IOW, this allows for proper shutdown of slave channel, making sure
> > > > no threads in slave are blocked on sending/receiving message. And
> > > > this in-turn allows for shutting down of virtqueues, hence resolving
> > > > the deadlock.
> > > 
> > > Is the new message necessary?
> > 
> > Hi Stefan,
> > 
> > It probably is not necessary but it feels like a cleaner and simpler
> > solution.
> > 
> > There slave is a separate channel from virtqueues. And if device is
> > stopping, we want to make sure there are no pending requests in
> > slave channel. It is possible that virtqueue shutodwn is successful
> > but other entity could still be sending messages on slave channel. In
> > that case, shall we say device shutdown complete and stop polling
> > slave channel?
> > 
> > IOW, the way we have explicit protocol messages to shutdown each
> > vring, it makes sense to have explicit protocol message to shutdown
> > slave channel as well. Right now there is no mechanism to do that.
> > 
> 
> Thinking more from "slave" point of view. It might have a separate thread
> to send messages on slave channel. It has no idea when to stop sending
> messages on slave channel, as there is no mechanism to shutdown slave.
> Only method seems to be that master will close slave fd and that
> will close connection.
> 
> That means if a device is being stopped (and not removed), then it
> is possible virtqueues got stopped but a thread in slave is blocked
> on slave channel (recvmsg()) and expecting a response back.
> 
> That's why I think enhancing protocol to explicitly shutdown slave
> channel makes sense. Instead of assuming that stopping a virtuqueue
> should automatically mean slave channel is stopped.

This deadlock can occur during any vhost-user protocol message, not just
shutdown:

When QEMU's vhost-user master implementation sends a vhost-user protocol
message, vhost_user_read() does a "blocking" read during which slave_fd
is not monitored by QEMU.

So the problem is more general and is not fully solved by an explicit
shutdown message. If the slave needs slave channel reply in order to
process a vhost-user protocol message then a deadlock can occur.

This can be solved by monitoring both the vhost-user fd and slave_fd in
vhost_user_read(). I don't think there is a need to shutdown the slave
channel BTW.

Stefan

signature.asc
Description: PGP signature

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [PATCH 4/6] qemu, vhost-user: Extend protocol to start/stop/flush slave channel, Stefan Hajnoczi <=

Prev by Date: Re: [PATCH v2 63/93] tcg/tci: Use ffi for calls
Next by Date: Re: [PATCH v2 01/15] tcg: Change parameters for tcg_target_const_match
Previous by thread: Re: [Virtio-fs] [PATCH v2 2/2] viriofsd: Add support for FUSE_HANDLE_KILLPRIV_V2
Next by thread: Re: [PATCH] hw/block/nvme: add broadcast nsid support flush command
Index(es):
- Date
- Thread