qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [Qemu-devel] [PATCH RFC v2 1/2] Add param Error** to ms


From: Markus Armbruster
Subject: Re: [Qemu-block] [Qemu-devel] [PATCH RFC v2 1/2] Add param Error** to msi_init() & modify the callers
Date: Fri, 04 Mar 2016 09:42:02 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

"Michael S. Tsirkin" <address@hidden> writes:

> On Thu, Mar 03, 2016 at 04:03:16PM +0100, Markus Armbruster wrote:
>> "Michael S. Tsirkin" <address@hidden> writes:
>> 
>> > On Thu, Mar 03, 2016 at 01:19:09PM +0200, Marcel Apfelbaum wrote:
>> >> On 03/03/2016 12:45 PM, Michael S. Tsirkin wrote:
>> >> >On Thu, Mar 03, 2016 at 12:12:27PM +0200, Marcel Apfelbaum wrote:
>> >> >>>>+int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int 
>> >> >>>>nr_vectors,
>> >> >>>>+             bool msi64bit, bool msi_per_vector_mask, Error **errp)
>> >> >>>>  {
>> >> >>>>      unsigned int vectors_order;
>> >> >>>>-    uint16_t flags;
>> >> >>>>+    uint16_t flags; /* Message Control register value */
>> >> >>>>      uint8_t cap_size;
>> >> >>>>      int config_offset;
>> >> >>>>
>> >> >>>>      if (!msi_supported) {
>> >> >>>>+        error_setg(errp, "MSI is not supported by interrupt 
>> >> >>>>controller");
>> >> >>>>          return -ENOTSUP;
>> >> >>>
>> >> >>>First failure mode: board does not support MSI (-ENOTSUP).
>> >> >>>
>> >> >>>Question to the PCI guys: why is this even an error?  A device with
>> >> >>>capability MSI should work just fine in such a board.
>> >> >>
>> >> >>Hi Markus,
>> >> >>
>> >> >>Adding Jan Kiszka, maybe he can help.
>> >> >>
>> >> >>That's a fair question. Is there any history for this decision?
>> >> >>The board not supporting MSI has nothing to do with the capability 
>> >> >>being there.
>> >> >>The HW should not change because the board doe not support it.
>> >> >>
>> >> >>The capability should be present but not active.
>> >> >
>> >> >Digging in git log will tell you why we have the msi_supported flag:
>> >> >
>> >> >commit 02eb84d0ec97f183ac23ee939403a139e8849b1d ("qemu/pci: MSI-X 
>> >> >support functions")
>> >> >
>> >> > This is a safety measure to avoid breaking platforms which should 
>> >> > support
>> >> > MSI-X but currently lack this in the interrupt controller emulation.
>> >> >
>> >> >in other words, on some platforms we must hide msi support from devices
>> >> >because otherwise guests will try to use it, and our emulation is
>> >> >incomplete.
>> >> 
>> >> 
>> >> OK, thanks. So the flag should be "msi_broken" or 
>> >> "msi_present_but_not_implemented" and not
>> >> "msi_supported" that leads (at least me) to the assumption that some 
>> >> platform *does not support msi*
>> >> rather than it supports it, but we don't emulate it.
>> 
>> I agree the name is badly misleading for this role.
>> 
>> Now let me see how this contraption actually works.  msi_supported is
>> global, initialized to false, and becomes globally true when
>> 
>> 1. certain MSI-capable interrupt controllers realize: "apic",
>>   "kvm-apic" if kvm_has_gsi_routing(), "xen-apic", "arm-gicv2m",
>>   "openpic" models 1 and 2, "kvm-openpic" models 1 and 2
>> 
>> 2. "s390-pcihost" class-initializes
>> 
>> 3. machine "spapr-machine" initializes
>> 
>> Issues:
>> 
>> * "Global is problematic.  What if a board has more than one interrupt
>>   controller?  What if one of them sets msi_supported, but the other one
>>   is of the kind Michael described, i.e. guests know it has MSI, but our
>>   emulation doesn't actually work?
>> 
>> * "Initialize to false" is problematic.  We don't clear msi_supported
>>   when we have a broken interrupt controler, we set it when we have a
>>   working one.  The consequence is that boards with non-MSI interrupt
>>   controllers are treated just like boards with broken interrupt
>>   controllers.
>> 
>>   Here's  how msi_supported is documented:
>> 
>>     /* Flag for interrupt controller to declare MSI/MSI-X support */
>>     bool msi_supported;
>> 
>>   This is matches how the code works.  However, it contradicts the
>>   commit message Michael quoted.  The most plausible explanation is that
>>   the commit is flawed.
>> 
>> * Class-initialize (2.) looks wrong to me.  msi_supported becomes true
>>   when QOM type "s390-pcihost" is created, regardless of whether
>>   instances get created, let alone used.
>> 
>> * I'm not sure about 3., but the spapr guys can worry about that.
>> 
>> >> >And the conclusion from that is that for msi_init to fail silently is
>> >> >at the moment the right thing to do.
>> >> 
>> >> But this is not the only thing we do, we are modifying the PCI devices. 
>> >> We could fail starting the VM
>> >> if a device supporting MSI is added on a platform with broken msi, but 
>> >> this will prevent us to use
>> >> assigned devices. Emulated devices should be created with a specific 
>> >> "msi=off" flag.
>> >> 
>> >> Thanks,
>> >> Marcel
>> >
>> > That will just break a bunch of valid configurations, for no real
>> > benefit to users.
>> 
>> I disagree, strongly.
>> 
>> If I ask for msi=on, then QEMU should either give it to me or fail, not
>> silently "correct" my configuration.
>> 
>> Of course, if we have msi_supported = false for everybody and its dog
>> whether it's really needed or not, "or fail" will indeed reject "a bunch
>> of valid configurations".  We "compensate" by silently messing with the
>> user's configuration.  That's shoddy workmanship, sorry.
>> 
>> We should instead have msi_supported = false only when it's actually
>> needed, and thus reject exactly the configurations that won't work.
>
> Most people don't care one way or the other.  We started without msi.
> We later added msi for some boards only.  Asking everyone to add msi=off
> for all other boards at that point would have been silly.

I'm afraid you're missing my point entirely.

Yes, we started without MSI.  We then added MSI support to some devices
and to some boards.  Exactly the same history as for physical hardware.

Plugging an MSI-capable device into an MSI-incapable board works just
fine, both for physical and for virtual hardware.  What doesn't work is
plugging an MSI-capable device into an MSI-capable board with *broken*
MSI support.

As a convenience feature, we summarily and silently remove a device's
MSI capability when we detect such a broken board.  At least that's what
the commit message you quoted claims.

In reality, we remove it not just for broken boards, but even for
MSI-incapable boards.

I take issue with "summarily and silently", and "even for MSI-incapable
boards".

When multiple variants of a device exist, and the user didn't ask for a
specific one, then picking the one that works best with the board is
just fine.

It's absolutely not fine when the user did ask for a specific one.  When
I ask for msi=on, I mean it.  If it can't work with this board, tell me.
But silently ignoring my orders is a bug.

It's fine to have emulations of MSI-capable boards where MSI doesn't yet
work.  Even if that means we have to reject MSI-capable devices.

It's absolutely not fine to reject them for MSI-incapable boards, where
they'd work just fine.

Furthermore, I doubt the wisdom of creating virtual devices that don't
exist physically just to have something that works in our broken boards.
If the physical device exists in MSI-capable and MSI-incapable variants,
emulating both is fine.  But if it only ever existed MSI-capable, having
the PCI core(!) drop the MSI capability is a questionable idea.  I
suspect that the need for this dubious hack would be much smaller if we
didn't foolishly treat every MSI-incapable board as broken MSI-capable
board.

If you conclude that cleaning up this carelessly made mess is not worth
the bother now, then at least explain the mess in the code, please.
It's not obvious, and figuring out what's going on and why it is the way
it is has taken me several hours, and Marcel's help.

[...]



reply via email to

[Prev in Thread] Current Thread [Next in Thread]