qemu-devel

Re: [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support


From: Jason Gunthorpe
Subject: Re: [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support
Date: Wed, 1 Feb 2023 16:10:50 -0400

On Wed, Feb 01, 2023 at 11:42:46AM -0700, Alex Williamson wrote:

> > 'p2p off' is a valuable option in its own right because this stuff
> > doesn't work reliably and is actively dangerous. Did you know you can
> > hard crash the bare metal from a guest on some platforms with P2P
> > operations? Yikes. If you don't need to use it turn it off and don't
> > take the risk.
> 
> If we're honest, there are a number of cases of non-exceptional faults
> that an assigned device can generate that the platform might escalate
> to fatal errors.

My understanding is that this is true on some commodity hardware, but
systems engineered to run as cloud hypervisors have these problems
solved, and VFIO is made safe there.

Unfortunately there is no way to know from the OS whether you have a
safe or an unsafe system.

> > Arguably for this reason 'p2p off' should trend toward the default as
> > it is much safer.
> 
> Safety in the hands of the userspace to protect the host though?
> Shouldn't the opt-in be at the kernel level whether to allow p2p
> mappings?  

I haven't seen anyone interested in doing this kind of work. The
expectation seems to be that places seriously concerned about security
either don't include VFIO at all in their environments or have
engineered their platforms to make it safe.

Where this leaves the enterprise space, I don't know. I think they end
up with systems that functionally work but possibly have DoS problems.

So, given this landscape I think a user option in qemu is the best we
can do at the moment.

> I don't have an issue if QEMU were to mirror this by
> creating a RAM-only AddressSpace for devices which would be used when
> p2p is disabled (it'd save us some headaches for various unaligned
> devices as well), but we shouldn't pretend that actually protects the
> host.  OTOH, QEMU could feel confident supporting migration of devices
> w/o support of the migration P2P states with that restriction.

It protects the host from a hostile VM; it does not fully protect the
host from a compromised qemu. That is still an improvement.

> > I think multi-device will likely have some use cases, so I'd like to
> > see a path to have support for them. For this series I think it is
> > probably fine since it is already 18 patches.
> 
> It might be fine for this series because it hasn't yet proposed to make
> migration non-experimental, but it's unclear where the goal post is
> that we can actually make that transition.

IMHO non-experimental just means the solution works with whatever
configuration limitations it comes with. It shouldn't mean every
device or every configuration combination works.

So if you want to do single device, or just hard require P2P for now,
those are both reasonable temporary choices IMHO.

But they are temporary, and we should come up with a remedy to allow
non-P2P migration devices to work as well.

Given we merged a non-P2P kernel driver I prefer the single device
option as it is more logically consistent with the kernel situation.

Jason


