Re: [Qemu-devel] [PATCH 0/3] vhost-user reconnect


From: Yury Kotov
Subject: Re: [Qemu-devel] [PATCH 0/3] vhost-user reconnect
Date: Mon, 20 Aug 2018 16:39:15 +0300


20.08.2018, 16:11, "Marc-André Lureau" <address@hidden>:
> Hi
>
> On Mon, Aug 20, 2018 at 2:51 PM, Yury Kotov <address@hidden> wrote:
>>  16.08.2018, 18:36, "Marc-André Lureau" <address@hidden>:
>>>  On Thu, Aug 16, 2018 at 5:32 PM, Yury Kotov <address@hidden> wrote:
>>>>   We are using QEMU (2.12.0) with SPDK (18.04.1) over vhost-user to emulate
>>>>   block devices. One of our use cases is to restart SPDK without restarting
>>>>   the VM (for example, after an update). We tried to use the 'reconnect'
>>>>   option for the '-chardev' device:
>>>>     -object memory-backend-file,id=mem0,size=1G,mem-path=/dev/hugepages,share=on \
>>>>     -numa node,memdev=mem0 \
>>>>     -chardev socket,id=spdk_vhost_blk1,path=/var/tmp/vhost.1,reconnect=10 \
>>>>     -device vhost-user-blk-pci,chardev=spdk_vhost_blk1,num-queues=4
>>>>
>>>>   After this, vhost-user-blk initialization fails with the error below:
>>>>     qemu-system-x86_64: -device ...: Failed to set msg fds.
>>>>     qemu-system-x86_64: -device ...: vhost-user-blk: vhost initialization failed:
>>>>                                      Operation not permitted
>>>>
>>>>   We got the same error with the latest QEMU (c542a9f9794ec8e0bc3f).
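
For readers not familiar with the option: reconnect=10 asks the socket chardev
to wait 10 seconds and try again whenever the remote end is not available. The
idea can be sketched as a plain retry loop over a UNIX socket. This is only an
illustration, not QEMU code; connect_with_retry() and its parameters are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper, not QEMU code: try to connect to a UNIX socket,
 * sleeping delay_sec seconds between attempts. Returns a connected fd,
 * or -1 once max_attempts attempts have failed.
 */
static int connect_with_retry(const char *path, int max_attempts,
                              unsigned int delay_sec)
{
    struct sockaddr_un addr;

    assert(path != NULL && max_attempts > 0);

    if (strlen(path) >= sizeof(addr.sun_path)) {
        return -1; /* path does not fit into sockaddr_un */
    }
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0) {
            return -1;
        }
        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
            return fd; /* backend is up */
        }
        close(fd);
        if (attempt + 1 < max_attempts) {
            sleep(delay_sec); /* backend still down, wait and retry */
        }
    }
    return -1;
}
```

A caller would use something like connect_with_retry("/var/tmp/vhost.1", 5, 10)
and treat -1 as "backend still down". Getting the socket back is only the first
half of a reconnect; the backend also has to be given the device state again.
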
>>>>
>>>>   We investigated and found two issues:
>>>>
>>>>   1. The reconnect option postpones the first connection until the machine
>>>>      init done event. But we need this connection during vhost blk device
>>>>      initialization, which happens before the machine init done handling.
>>>>
>>>>   2. If the connection is forced, then reconnection succeeds after an SPDK
>>>>      restart, but the virtual queues do not start, because the virtual
>>>>      queue initialization commands have to be resent:
>>>>      * VHOST_USER_SET_FEATURES
>>>>      * VHOST_USER_SET_MEM_TABLE
>>>>      * VHOST_USER_SET_VRING_NUM
>>>>      * VHOST_USER_SET_VRING_BASE
>>>>      * VHOST_USER_SET_VRING_ADDR
>>>>      * VHOST_USER_SET_VRING_KICK
>>>>      * VHOST_USER_SET_VRING_CALL
>>>>
>>>>   The patch set resolves both of these issues.
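
To make the replay in (2) concrete, here is a minimal, self-contained sketch
of the order in which the commands listed above could be resent after a
reconnect: device-wide state once, then the vring state for every queue. It is
not the code from the patches. struct replay_log, replay_send() and
replay_after_reconnect() are hypothetical names that only record what would be
sent; the request codes are the ones assigned in the vhost-user specification.

```c
#include <assert.h>
#include <stddef.h>

/* Request codes as numbered in the vhost-user protocol specification. */
enum vhost_user_request {
    VHOST_USER_SET_FEATURES   = 2,
    VHOST_USER_SET_MEM_TABLE  = 5,
    VHOST_USER_SET_VRING_NUM  = 8,
    VHOST_USER_SET_VRING_ADDR = 9,
    VHOST_USER_SET_VRING_BASE = 10,
    VHOST_USER_SET_VRING_KICK = 12,
    VHOST_USER_SET_VRING_CALL = 13,
};

#define REPLAY_LOG_MAX 64

/* Hypothetical record of one replayed message; queue is -1 if device-wide. */
struct replay_entry {
    enum vhost_user_request req;
    int queue;
};

struct replay_log {
    struct replay_entry entries[REPLAY_LOG_MAX];
    size_t len;
};

/* Stand-in for writing a message to the socket: just append it to the log. */
static void replay_send(struct replay_log *log,
                        enum vhost_user_request req, int queue)
{
    assert(log->len < REPLAY_LOG_MAX);
    log->entries[log->len].req = req;
    log->entries[log->len].queue = queue;
    log->len++;
}

/* Resend device-wide state once, then per-queue state for each virtqueue. */
static void replay_after_reconnect(struct replay_log *log, int num_queues)
{
    static const enum vhost_user_request per_queue[] = {
        VHOST_USER_SET_VRING_NUM,
        VHOST_USER_SET_VRING_BASE,
        VHOST_USER_SET_VRING_ADDR,
        VHOST_USER_SET_VRING_KICK,
        VHOST_USER_SET_VRING_CALL,
    };
    const size_t n = sizeof(per_queue) / sizeof(per_queue[0]);

    replay_send(log, VHOST_USER_SET_FEATURES, -1);
    replay_send(log, VHOST_USER_SET_MEM_TABLE, -1);

    for (int q = 0; q < num_queues; q++) {
        for (size_t i = 0; i < n; i++) {
            replay_send(log, per_queue[i], q);
        }
    }
}
```

With num_queues = 4, as in the command line above, the log ends up with
2 + 5 * 4 = 22 entries. The per-queue order simply follows the list above; it
is an assumption of this sketch, not a statement about what a backend requires.
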
>>>>
>>>>   Test case:
>>>>
>>>>   1. Start fio process (inside VM):
>>>>        fio --name test --ioengine=libaio --iodepth=64 --bs=4096 \
>>>>            --rw=randrw --direct=1 --sync=1 --verify=md5 \
>>>>            --size=64M --filename=/dev/vda --loops=100
>>>>
>>>>   2. Restart SPDK many times.
>>>>      We expect fio to pause during each SPDK restart and to continue
>>>>      working once the restart completes.
>>>>
>>>>   3. fio process completed successfully without any error.
>>>
>>>  Can you write a test case in vhost-user-test.c ? (perhaps under
>>>  QTEST_VHOST_USER_FIXME scope...)
>>
>>  This is a great idea, and we were definitely going to do that during the
>>  coming couple of weeks. We thought we could make a follow-up commit with the
>>  necessary tests a bit later, though, since we currently need to figure out
>>  the state of the vhost-user tests in general before we can add anything new,
>>  and that will take some time. So far we have stress-tested these fixes
>>  manually.
>
> Yes, some vhost-user tests are disabled by default (sadly for Travis
> CI reasons, not really a bug), and it's easy to introduce regressions.
>
> I sent a related series "[PATCH 0/4] Fix socket chardev regression" to
> make it work again.
>
>>  Do you suggest we wait with this series as well until we have all the tests
>>  ready? Or do we proceed now and make a follow-up series with vhost-user
>>  tests later, as we suggested?
>
> I would rather have the tests with the series.
>

Sounds good. We will resend v2 with tests. However, while we do that, we would
be grateful for more comments on the current implementation as well, since it
at least passes our internal functional tests.

>>>>   Yury Kotov (3):
>>>>     chardev: prevent extra connection attempt in tcp_chr_machine_done_hook
>>>>     vhost: refactor vhost_dev_start and vhost_virtqueue_start
>>>>     vhost-user: add reconnect support for vhost-user
>>>>
>>>>    chardev/char-socket.c     |   5 +-
>>>>    hw/virtio/vhost-user.c    |  65 ++++++++++++--
>>>>    hw/virtio/vhost.c         | 223 +++++++++++++++++++++++++++++++---------------
>>>>    include/hw/virtio/vhost.h |   2 +
>>>>    4 files changed, 215 insertions(+), 80 deletions(-)
>>>>
>>>>   --
>>>>   2.7.4
