[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 0/3] virtio-net: graceful drop of vhost for TAP

From: Jason Wang
Subject: Re: [PATCH 0/3] virtio-net: graceful drop of vhost for TAP
Date: Thu, 18 Feb 2021 11:02:17 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0

On 2021/2/10 下午4:38, Michael S. Tsirkin wrote:
On Wed, Feb 10, 2021 at 02:19:59PM +0800, Jason Wang wrote:
On 2021/2/9 下午11:04, Michael S. Tsirkin wrote:
On Tue, Feb 09, 2021 at 02:51:05PM +0000, Daniel P. Berrangé wrote:
On Tue, Feb 09, 2021 at 09:34:20AM -0500, Michael S. Tsirkin wrote:
On Thu, Feb 04, 2021 at 10:29:12PM +0200, Yuri Benditovich wrote:
This set of patches introduces graceful switch from tap-vhost to
tap-no-vhost depending on guest features. Before that the features
that vhost does not support were silently cleared in get_features.
This creates potential problem of migration from the machine where
some of virtio-net features are supported by the vhost kernel to the
machine where they are not supported (packed ring as an example).
I still worry that adding new features will silently disable vhost for people.
Can we limit the change to when a VM is migrated in?
Some management applications expect bi-directional live migration to
work, so taking specific actions on incoming migration only feels
Could you be more specific?

Bi-directional migration is currently broken
when migrating new kernel->old kernel.

This seems to be the motivation for this patch, though I wish
it was spelled out more explicitly.

People don't complain much, but I'm fine with fixing that
with a userspace fallback.

I'd rather not force the fallback on others though: vhost is generally
specified explicitly by user while features are generally set
automatically, so this patch will make us override what user specified,
not nice.

IMHO if the features we're adding cannot be expected to exist in
host kernels in general, then the feature should defualt to off
and require explicit user config to enable.
Downstream distros which can guarantee newer kernels can flip the
default in their custom machine types if they desire.

Unfortunately that will basically mean we are stuck with no new features
for years. We did what this patch is trying to change for years now, in
particular KVM also seems to happily disable CPU features not supported
by kernel so I wonder why we can't keep doing it, with tweaks for some
corner cases.

It's probably not the corner case.

So my understanding is when a feature is turned on via command line,

Most people do not play with these flags on command line, the
main path to turn features on if when they default to on.

The problem is qemu may choose to clear feature whose default are on.

should not be cleared silently otherwise we may break migration for sure.
Not if we are careful. Setting flags is more dangerous.

The problem is that user is expected to enable the feature for the device. If Qemu fail to do that, shouldn't we fail the device initialization?

Clearing is
generally ok.

E.g when packed=on is specified, we should disable vhost instead of clear it
from the device.

 From usability POV, consider packed as an example, people only enable it
to get better performance. libvirt says:

                The attribute packed controls if QEMU should try to use packed
        virtqueues. Compared to regular split queues, packed queues consist of
        only a single descriptor ring replacing available and used ring, index
        and descriptor buffer. This can result in better cache utilization and
        performance. If packed virtqueues are actually used depends on the
        feature negotiation between QEMU, vhost backends and guest drivers.
        Possible values are on or off.

Switching to a completely different backend clearly isn't what user intended.

If we don't do migration, all should be fine. But the problem is the migration compatibility. My understanding is that libvirt will assume the compatibility if the same user-visible feature were set through cli.  So if any feature were cleared the migration compatibility will break. This will also be a burden to support cross vendor live migration in the future for vDPA.


userspace and kernel not being in 100% sync wrt features is not
a corner case though, and switching backends seems like too big
a hammer.

|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

reply via email to

[Prev in Thread] Current Thread [Next in Thread]