Re: [Qemu-devel] [PATCH RFC 00/14] vhost-user: shutdown and reconnection

From: Tetsuya Mukawa
Subject: Re: [Qemu-devel] [PATCH RFC 00/14] vhost-user: shutdown and reconnection
Date: Mon, 28 Mar 2016 10:53:11 +0900
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

On 2016/03/26 3:00, Marc-André Lureau wrote:
> Hi
> On Thu, Mar 24, 2016 at 8:10 AM, Yuanhan Liu
> <address@hidden> wrote:
>>>> The following series starts from the idea that the slave can request a
>>>> "managed" shutdown instead and later recover (I guess the use case for
>>>> this is to allow for example to update static dispatching/filter rules
>>>> etc)
>> What if the backend crashes, so that no such request is ever sent? And
>> I'm wondering why this request is needed, as we are able to detect
>> the disconnect now (with your patches).
> I don't think trying to handle backend crashes is really a thing we
> need to take care of. If the backend is bad enough to crash, it may as
> well corrupt the guest memory (mst: my understanding of vhost-user is
> that backend must be trusted, or it could just throw garbage in the
> queue descriptors with surprising consequences or elsewhere in the
> guest memory actually, right?).
>> BTW, you meant to have QEMU act as the server and the backend as the
>> client here, right? Honestly, that's what we thought of at first, too.
>> However, I'm wondering whether we could still go with QEMU as the client
>> and the backend as the server (the default and the only way DPDK
>> supports), and let QEMU try to reconnect when the backend crashes
>> and restarts. In that case, we need to enable the "reconnect" option
>> for vhost-user, and once I have done that, it basically works in my
>> test:
> Conceptually, I think if we allow the backend to disconnect, it makes
> sense that qemu is actually the socket server. But it doesn't matter
> much, it's simple to teach qemu to reconnect on a timer... So we should
> probably allow both cases anyway.
>> - start DPDK vhost-switch example
>> - start QEMU, which will connect to DPDK vhost-user
>>   link is good now.
>> - kill DPDK vhost-switch
>>   link is broken at this stage
>> - start DPDK vhost-switch again
>>   you will find that the link is back again.
>> Does that make sense to you? If so, we may need no (or very few)
>> changes to DPDK to get reconnect working.
> The main issue with handling crashes (gone at any time) is that the
> backend may not have time to sync the used idx (at the least). It may
> already have processed incoming packets, so on reconnect, it may
> duplicate the receiving/dispatching work. Similarly, on the backend
> receiving end, some packets may be lost, never received by the VM, and
> later overwritten by the backend after reconnect (for the same used
> idx update reason). This may not be a big deal for unreliable
> protocols, but I am not familiar enough with network usage to know if
> that's fine in all cases. It may be fine for some packets, such as
> udp.
> However, in general, vhost-user should not be specific to network
> transmission, and it would be nice to have a reliable way for the
> backend to reconnect. That's what I try to do in this series. I'll
> repost it after I have done more testing.
> thanks

Hi Yuanhan,

I think we have 2 options here.
One is to use DEVICE_NEEDS_RESET, or to add a new status such as
QUEUE_NEEDS_RESET to the virtio specification.
In this case, we would need to change the virtio-net drivers and the
virtio-net device of QEMU, so it might touch a lot of code, but we could
handle an unexpected shutdown of the vhost-user backend.
The other option is Marc's simpler solution. In this case, we don't need
to change the virtio-net drivers, but we cannot handle an unexpected
shutdown.
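
To make the used idx problem quoted above concrete, here is a minimal
sketch in Python. It is only a toy model, not QEMU or DPDK code: the
names Ring, Backend, process and run are hypothetical, and a real
virtqueue has a more complex layout. It assumes a restarted backend
resumes from the last used idx it finds in shared memory.

```python
from dataclasses import dataclass, field


@dataclass
class Ring:
    """Toy model of one virtqueue living in shared guest memory."""
    avail: list        # buffers the guest driver has made available
    used_idx: int = 0  # last index the backend published to the guest


@dataclass
class Backend:
    """Toy vhost-user backend; all names here are hypothetical."""
    ring: Ring
    delivered: list = field(default_factory=list)

    def __post_init__(self):
        # A (re)started backend only knows what is in shared memory,
        # so it resumes from the published used idx.
        self.next_idx = self.ring.used_idx

    def process(self, count, sync=True):
        """Handle up to `count` buffers, then optionally publish used idx."""
        for _ in range(count):
            if self.next_idx >= len(self.ring.avail):
                break
            self.delivered.append(self.ring.avail[self.next_idx])
            self.next_idx += 1
        if sync:
            self.ring.used_idx = self.next_idx


def run(crash_before_sync):
    """Return every packet dispatched across one backend restart."""
    ring = Ring(avail=["pkt0", "pkt1", "pkt2", "pkt3"])
    first = Backend(ring)
    # With crash_before_sync=True the backend dies after doing the work
    # but before publishing the used idx.
    first.process(2, sync=not crash_before_sync)
    second = Backend(ring)  # the restarted backend
    second.process(4)
    return first.delivered + second.delivered
```

With a managed shutdown (run(False)) each packet is dispatched once.
With a crash before the sync (run(True)) the restarted backend starts
again from index 0, so pkt0 and pkt1 are dispatched twice.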

