qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 11/18] vhost-user: add shutdown support


From: Yuanhan Liu
Subject: Re: [Qemu-devel] [PATCH 11/18] vhost-user: add shutdown support
Date: Mon, 2 May 2016 10:37:51 -0700
User-agent: Mutt/1.5.23 (2014-03-12)

On Mon, May 02, 2016 at 01:45:31PM +0300, Michael S. Tsirkin wrote:
> On Sun, May 01, 2016 at 02:04:42PM -0700, Yuanhan Liu wrote:
> > On Sun, May 01, 2016 at 02:37:19PM +0300, Michael S. Tsirkin wrote:
> > > On Fri, Apr 29, 2016 at 10:48:35AM -0700, Yuanhan Liu wrote:
> > > > > But, I
> > > > > would worry first about a backend that crashes that it may corrupt the
> > > > > VM memory too...
> > > > 
> > > > Not quite sure I follow this. But, backend just touches the virtio
> > > > related memory, so it will do no harm to the VM?
> > > 
> > > It crashed so apparently it's not behaving as designed -
> > > how can we be sure it does not harm the VM?
> > 
> > Hi Michael,
> > 
> > What I know so far, a crash might could corrupt the virtio memory in two
> > ways:
> > 
> > - vring_used_elem might be in stale state when we are in the middle of
> >   updating used vring. Say, we might have updated the "id" field, but
> >   leaving "len" untouched.
> > 
> > - vring desc buff might also be in stale state, when we are copying
> >   data into it.
> 
> 
> - a random write into VM memory due to backend bug corrupts it.
> 
> > However, the two issues will not be real issue, as used->idx is not
> > updated in both case. Thefore, those corrupted memory will not be
> > processed by guest. So, no harm I'd say. Or, am I missing anything?
> 
> Why is backend crashing? It shouldn't so maybe this means it's
> memory is corrupt. That is the claim.

As far as virtio memory is not corrupted (or even corrupt in above
ways), there would be no issue. But, yes, you made a good point: we
make no guarantees that it's not the virtio memory corruption caused
the crash.

> > BTW, Michael, what's your thoughts on the whole reconnect stuff?
> > 
> >     --yliu
> 
> My thoughts are that we need to split these patchsets up.
> 
> There are several issues here:
> 
> 
> 1. Graceful disconnect
> - One should be able to do vmstop, disconnect, connect then vm start
>   This looks like a nice intermediate step.
> - Why would we always assume it's always remote initiating the disconnect?
>   Monitor commands for disconnecting seem called for.

A monitor command is a good suggestion. But I'm thinking why vmstop is
necessary. Basically, a disconnect is as to a cable unplug to physical
NIC; we don't need pause it before unplugging.

> 
> 2. Unexpected disconnect
> - If buffers are processed in order or flushed before socket close,
>   then on disconnect status can be recovered
>   from ring alone. If that is of interest, we might want to add
>   a protocol flag for that. F_DISCONNECT_FLUSH ?

Sorry, what does this flag supposed to work here?

> Without this,
>   only graceful disconnect or reset with guest's help can be supported.
> - As Marc-André said, without graceful shutdown it is not enough to
>   handle ring state though.  We must handle socket getting disconnected
>   in the middle of send/receive.  While it's more work,
>   it does seem like a nice interface to support.

Same as above, what the backend (or QEMU) can do for this case without
the guest's (reset) help?


> - I understand the concern that existing guests do not handle NEED_RESET
>   status. One way to fix this would be a new feature bit. If NEED_RESET not
>   handled,

I could check the code by myself, but I'm thinking it might be trivial
for you to answer my question: how do we know that NEED_RESET is not
handled?

> request hot-unplug instead.

Same as above, may I know how to request a hot-unplug?

> 
> 3. Running while disconnected
> - At the moment, we wait on vm start for remote to connect,
>   if we support disconnecting backend without stopping
>   we probably should also support starting it without waiting
>   for connection

Agreed. I guess that would anaswer my first question: we don't actually
need to do vmstop before disconnect.

        --yliu

> - Guests expect tx buffers to be used in a timely manner, thus:
> - If vm is running we must use them in qemu, maybe discarding packets
>   in the process.
>   There already are commands for link down, I'm not sure there's value
>   in doing this automatically in qemu.
> - Alternatively, we should also have an option to stop vm automatically (like 
> we do on
>   disk errors) to reduce number of dropped packets.
> 
> 4. Reconnecting
> - When acting as a server, we might want an option to go back to
>   listening state, but it should not be the only option,
>   there should be a monitor command for moving it back to
>   listening state.
> - When acting as a client, auto-reconnect might be handy sometimes, but 
> should not be the only
>   option, there should be a way to manually request connect, possibly to
>   a different target, so a monitor command for re-connecting seems called for.
> - We'll also need monitor events for disconnects so management knows it
>   must re-connect/restart listening.
> - If we stopped VM, there could be an option to restart on reconnect.
> 
> 
> -- 
> MST



reply via email to

[Prev in Thread] Current Thread [Next in Thread]