On Mon, 6 Feb 2023 at 07:36, Hanna Czenczek <hreitz@redhat.com> wrote:
> Hi Stefan,
>
> For true virtio-fs migration, we need to migrate the daemon’s (back
> end’s) state somehow. I’m addressing you because you had a talk on this
> topic at KVM Forum 2021. :)
>
> As far as I understood your talk, the only standardized way to migrate a
> vhost-user back end’s state is via dbus-vmstate. I believe that
> interface is unsuitable for our use case, because we will need to
> migrate more than 1 MB of state. Now, that 1 MB limit has supposedly
> been chosen arbitrarily, but the introducing commit’s message says that
> it’s based on the idea that the data must be supplied basically
> immediately anyway (due to both dbus and qemu migration requirements),
> and I don’t think we can meet that requirement.
Yes, dbus-vmstate is what is available today. It's independent of
vhost-user and VIRTIO.
> Has there been progress on the topic of standardizing a vhost-user back
> end state migration channel besides dbus-vmstate? I’ve looked around
> but didn’t find anything. If there isn’t anything yet, is there still
> interest in the topic?
Not that I'm aware of. There are two parts to the topic of VIRTIO
device state migration:
1. Defining an interface for migrating VIRTIO/vDPA/vhost/vhost-user
devices. It doesn't need to be implemented in all these places
immediately, but the design should consider that each of these
standards will need to participate in migration sooner or later. It
makes sense to choose an interface that works for all or most of these
interfaces instead of inventing something vhost-user-specific.
2. Defining standard device state formats so VIRTIO implementations
can interoperate.
> Of course, we could use a channel that completely bypasses qemu, but I
> think we’d like to avoid that if possible. First, this would require
> adding functionality to virtiofsd to configure this channel. Second,
> not storing the state in the central VM state means that migrating to
> file doesn’t work (well, we could migrate to a dedicated state file,
> but...). Third, setting up such a channel after virtiofsd has sandboxed
> itself is hard. I guess we should set up the migration channel before
> sandboxing, which constrains runtime configuration (basically this would
> only allow us to set up a listening server, I believe). Well, and
> finally, it isn’t a standard way, which won’t be great if we’re planning
> to add a standard way anyway.
Yes, live migration is hard enough. Duplicating it is probably not
going to make things better. It would still be necessary to support
saving to file as well as live migration.
There are two high-level approaches to the migration interface:
1. The whitebox approach where the vhost-user back-end implements
device-specific messages to get/set migration state (e.g.
VIRTIO_FS_GET_DEVICE_STATE with a struct virtio_fs_device_state
containing the state of the FUSE session or multiple fine-grained
messages that extract pieces of state). The hypervisor is responsible
for the actual device state serialization.
2. The blackbox approach where the vhost-user back-end implements the
device state serialization itself and just produces a blob of data.
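To make the difference between the two approaches concrete, here is a minimal Python sketch. Every class, method, and field name in it is hypothetical and chosen only for illustration; none of them exist in the vhost-user protocol, QEMU, or virtiofsd, and the "state" is a toy stand-in for a FUSE session.

```python
import json
import struct
from dataclasses import dataclass, asdict


@dataclass
class FuseSessionState:
    """Hypothetical structured state (stand-in for a struct virtio_fs_device_state)."""
    next_inode: int
    open_handles: list


class WhiteboxBackend:
    """Whitebox: the back end hands out structured, device-specific state."""

    def __init__(self):
        self._state = FuseSessionState(next_inode=42, open_handles=[3, 7])

    def get_device_state(self) -> FuseSessionState:
        return self._state


def hypervisor_serialize(state: FuseSessionState) -> bytes:
    # Whitebox: the hypervisor must know every field in order to serialize it.
    return json.dumps(asdict(state), sort_keys=True).encode()


class BlackboxBackend:
    """Blackbox: the back end serializes itself; everyone else sees only bytes."""

    def __init__(self, next_inode=42, open_handles=(3, 7)):
        self._next_inode = next_inode
        self._open_handles = list(open_handles)

    def save_state(self) -> bytes:
        # Private format: handle count, next_inode, then the handles (little-endian u64).
        n = len(self._open_handles)
        return struct.pack(f"<QQ{n}Q", n, self._next_inode, *self._open_handles)

    def load_state(self, blob: bytes) -> None:
        n, next_inode = struct.unpack_from("<QQ", blob, 0)
        self._open_handles = list(struct.unpack_from(f"<{n}Q", blob, 16))
        self._next_inode = next_inode
```

In the whitebox half, adding a field to the state means touching both the back end and `hypervisor_serialize`. In the blackbox half, the format inside `save_state`/`load_state` can change without anything outside the back end noticing.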
My suggestion is blackbox migration with a full iterative interface.
The reason I like the blackbox approach is that a device's state is
encapsulated in the device implementation and does not require
coordinating changes across other codebases (e.g. the vDPA and vhost
kernel interfaces, the vhost-user protocol, QEMU, etc.). A blackbox
interface only needs to be defined and implemented once. After that,
device implementations can evolve without constant changes at various
layers.
So basically, something like VFIO v2 migration but for vhost-user
(with an eye towards vDPA and VIRTIO support in the future).
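As a rough illustration of what "iterative" means here, the following Python sketch models a back end that emits opaque chunks while the device is still running, followed by a final pass once it is stopped. It is only loosely analogous to a pre-copy phase followed by a stop-and-copy phase. All names (`SourceBackend`, `save_chunk`, `migrate`, and so on) are hypothetical, and the JSON chunk encoding is an arbitrary choice for the example, not a proposed format.

```python
import json


class SourceBackend:
    """Toy back end whose state is a dict; it tracks which keys are dirty."""

    def __init__(self, state):
        self._state = dict(state)
        self._dirty = set(self._state)  # nothing has been sent yet
        self._stopped = False

    def guest_write(self, key, value):
        # The device keeps running (and dirtying state) during pre-copy.
        assert not self._stopped, "device is stopped"
        self._state[key] = value
        self._dirty.add(key)

    def pending(self) -> int:
        return len(self._dirty)

    def save_chunk(self, max_keys: int) -> bytes:
        # Emit an opaque blob covering up to max_keys dirty entries.
        keys = sorted(self._dirty)[:max_keys]
        self._dirty.difference_update(keys)
        return json.dumps({k: self._state[k] for k in keys}).encode()

    def stop(self):
        self._stopped = True


class DestinationBackend:
    def __init__(self):
        self.state = {}

    def load_chunk(self, blob: bytes) -> None:
        # Later chunks overwrite earlier ones, so order matters.
        self.state.update(json.loads(blob))


def migrate(src, dst, channel, threshold=1, chunk_keys=2):
    # Pre-copy: keep sending while a lot of state is still pending.
    while src.pending() > threshold:
        channel.append(src.save_chunk(chunk_keys))
    # Stop the device, then drain whatever is left.
    src.stop()
    while src.pending() > 0:
        channel.append(src.save_chunk(chunk_keys))
    for blob in channel:
        dst.load_chunk(blob)
```

The `channel` is just a list of byte strings, so the same stream could be written to a file or sent over a socket; the party in the middle never needs to interpret it.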
Should we schedule a call with Jason, Michael, Juan, David, etc. to
discuss further? That way there's less chance of spending weeks
working on something only to be asked to change the approach later.