Re: [Qemu-devel] [PATCH v8 11/12] vfio: register aer resume notification


From: Chen Fan
Subject: Re: [Qemu-devel] [PATCH v8 11/12] vfio: register aer resume notification handler for aer resume
Date: Tue, 21 Jun 2016 20:41:32 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0

On 2016-06-21 11:13, Alex Williamson wrote:
On Tue, 21 Jun 2016 10:16:25 +0800
Zhou Jie <address@hidden> wrote:

Hi, Alex

I was really hoping to hear your opinion, or at least some further
discussion of pros and cons rather than simply parroting back my idea.
I understand.

My current thinking is that a resume notifier to userspace is poorly
defined, it's not clear what the user can and cannot do between an
error notification and the resume notification.
Yes, doing nothing during that time is better.

One approach to solve
that might be that the kernel internally handles the resume
notifications.  Maybe that means blocking the ioctl (interruptible
timeout) until the internal resume occurs, or maybe that means
returning -EAGAIN.
I don't think it is a good idea.
The kernel gives the error and resume notifications; that is enough.
It's up to the user how to use them.
Well that's exactly why it's poorly defined.  What does a resume
notification signal a user that they're allowed to do?  What can they
not do between error and resume notification?  Clearly you had issues
attempting to perform a reset during this time period since it was
racing with the kernel reset, so is a user allowed to do a hot reset
between error and resume?  Where do we define it?  Do we prevent it if
they try?  Why?  What about the reset ioctl?  How and why is that
different from a hot reset?  (hint, they can be the same)  Do we define
that resets are not allowed between error and resume, but other
operations like read/write or interrupt setup ioctls are allowed? Why?
Clearly we can't do anything that manipulates the device between error
and resume since it might be lost or ineffective, but where do we
define it and do we need to actively enforce those rules?  I'm arguing
that it's poorly defined, so "it's up to the user how to use them"
doesn't give me any additional confidence in that approach.  We
can't trust the user to be polite, we can't even trust the user not to
be malicious.
Hi Alex,

On the kernel side, I think that if we don't trust the user's behavior,
we should disable access to the vfio-pci interface once the vfio-pci
driver gets error_detected. We should disable all access to the vfio fd
regardless of whether the vfio-pci device is assigned to a VM. We could
also return an EAGAIN error if the user tries to access it during the
reset period, until the host reset has finished.

On the QEMU side, when we get error_detected, we pass the AER error
through to the guest directly and ignore all access to vfio-pci during
this time. When QEMU needs to do a hot reset, we can retry the get info
ioctl until it reports that the vfio-pci reset has finished, then do
the hot_reset ioctl if needed. The kernel should ensure the ioctl
becomes accessible again after the host reset has completed.
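
To illustrate the retry idea above, here is a minimal Python sketch of a
userspace loop that keeps retrying an operation while it fails with
EAGAIN, and gives up after a timeout. It is only a model under stated
assumptions: FakeDevice, get_info() and the "reset_done" field are
hypothetical stand-ins, not the vfio API, and no such EAGAIN behavior
exists in the kernel at the time of this thread.

```python
import errno
import time


def retry_while_eagain(op, timeout_s=1.0, interval_s=0.01):
    """Call op() until it stops failing with EAGAIN or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            return op()
        except OSError as e:
            if e.errno != errno.EAGAIN:
                raise  # any other error is not part of the reset window
            if time.monotonic() >= deadline:
                raise TimeoutError("device still in reset") from e
        time.sleep(interval_s)


class FakeDevice:
    """Hypothetical stand-in for a vfio device fd (assumption, not a real API)."""

    def __init__(self, busy_calls):
        # Number of calls that land inside the host reset window.
        self.busy_calls = busy_calls

    def get_info(self):
        if self.busy_calls > 0:
            self.busy_calls -= 1
            raise OSError(errno.EAGAIN, "host reset in progress")
        return {"reset_done": True}
```

A caller would wrap the get info call first, for example
retry_while_eagain(dev.get_info), and only issue the hot reset once it
returns. The timeout matters because, as discussed above, the user
cannot assume the host reset will ever complete.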

Thanks,
Chen


Probably implementations of each need to be worked
through to determine which is better.  We don't want to add complexity
to the kernel simply to make things easier for userspace, but we also
don't want a poorly specified interface that is difficult for
userspace to use correctly.  Thanks,
In QEMU, the AER recovery process:
    1. Detect support for resume notification.
       If the host vfio driver does not support resume notification,
       fail to boot the VM when AER is enabled.
    2. Immediately notify the VM when an error is detected.
    3. Disable the device.
       Unmap the config space and BAR regions.
    4. Delay the guest directed bus reset.
    5. Wait for the resume notification.
       If we don't get the resume notification from the host after
       some timeout, we abort the guest directed bus reset
       altogether and unplug the device to prevent it from further
       interacting with the VM.
    6. After getting the resume notification, reset the bus and
       enable the device.
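
The six steps above can be modelled as a small state machine. The
following Python sketch is illustrative only: the class, method and log
entry names are hypothetical and do not correspond to QEMU code, and it
assumes the bus reset in step 6 is performed only if the guest actually
requested one while the device was disabled.

```python
from enum import Enum, auto


class State(Enum):
    RUNNING = auto()
    WAIT_RESUME = auto()  # device disabled, guest reset deferred
    UNPLUGGED = auto()


class AerRecovery:
    """Toy model of the proposed recovery flow (names are hypothetical)."""

    def __init__(self, host_supports_resume, resume_timeout_s):
        # Step 1: refuse to start without resume notification support.
        if not host_supports_resume:
            raise RuntimeError("host vfio driver lacks resume notification")
        self.timeout = resume_timeout_s
        self.state = State.RUNNING
        self.reset_pending = False
        self.error_time = None
        self.log = []  # records the actions a real implementation would take

    def on_error(self, now):
        self.log.append("notify_guest")    # step 2
        self.log.append("disable_device")  # step 3
        self.error_time = now
        self.state = State.WAIT_RESUME

    def on_guest_bus_reset(self):
        if self.state is State.WAIT_RESUME:
            self.reset_pending = True      # step 4: defer the reset
        elif self.state is State.RUNNING:
            self.log.append("reset_bus")

    def on_tick(self, now):
        # Step 5: give up and unplug if the host never resumes.
        if (self.state is State.WAIT_RESUME
                and now - self.error_time >= self.timeout):
            self.reset_pending = False
            self.log.append("unplug_device")
            self.state = State.UNPLUGGED

    def on_resume(self):
        if self.state is not State.WAIT_RESUME:
            return
        if self.reset_pending:             # step 6
            self.log.append("reset_bus")
            self.reset_pending = False
        self.log.append("enable_device")
        self.state = State.RUNNING
```

Note that this only models a cooperative user; it does nothing to stop
a user who ignores the WAIT_RESUME state, which is the concern raised
about relying on userspace to be polite.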

I think we only need to make sure the disabled device
   will not interact with the VM.
Should interrupt irqfds then also be disabled so they trap into QEMU
and we can prevent that interaction?  Also, QEMU can be polite, but as
above, QEMU is just one user, the API is open to anyone and QEMU might
be exploited to not be so polite.  So if there are points where the
user can interfere with the kernel or exploit the knowledge that the
device is going through a reset, the kernel can't rely on a friendly
user.  Thanks,

Alex


--
Sincerely,
Chen Fan


