qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] hw/pci: do not update the PCI mappings while De


From: Laszlo Ersek
Subject: Re: [Qemu-devel] [PATCH] hw/pci: do not update the PCI mappings while Decode (I/O or memory) bit is not set in the Command register
Date: Mon, 11 Jan 2016 19:44:18 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0

On 01/11/16 19:01, Marcel Apfelbaum wrote:
> On 01/11/2016 07:15 PM, Laszlo Ersek wrote:
>> On 01/11/16 17:34, Marcel Apfelbaum wrote:
>>> On 01/11/2016 06:11 PM, Laszlo Ersek wrote:
>>>> On 01/11/16 13:24, Marcel Apfelbaum wrote:
>>>>> Two reasons:
>>>>>    - PCI Spec indicates that while the bit is not set
>>>>>      the memory sizing is not finished.
>>>>>    - pci_bar_address will return PCI_BAR_UNMAPPED
>>>>>      and a previous value can be accidentally overridden
>>>>>      if the command register is modified (and not the BAR).
>>>>>
>>>>> Signed-off-by: Marcel Apfelbaum <address@hidden>
>>>>> ---
>>>>>
>>>>> Hi,
>>>>>
>>>>> I found this when trying to use multiple root complexes with OVMF.
>>>>>
>>>>> When trying to attach a device to the pxb-pcie device as Integrated
>>>>> Device it did not receive the IO/MEM resources.
>>>>>
>>>>> The reason is that OVMF is working like that:
>>>>>    1. It disables the Decode (I/O or memory) bit in the Command
>>>>> register
>>>>>    2. It configures the device BARS
>>>>>    3. Makes some tests on the Command register
>>>>>    4. ...
>>>>>    5. Enables the Decode (I/O or memory) at some point.
>>>>>
>>>>> On step 3 all the BARS are overridden to 0xffffffff by QEMU.
>>>>>
>>>>> Since QEMU uses the device BARs to compute the new host bridge
>>>>> resources
>>>>> it now gets garbage.
>>>>>
>>>>> Laszlo, this also solves the SHPC problem for the pci-2-pci bridge
>>>>> inside the pxb.
>>>>> Now we can enable the SHPC for it too.
>>>>
>>>> I encountered the exact same problem months ago. I posted patches for
>>>> it; you were CC'd. :)
>>>>
>>>> http://thread.gmane.org/gmane.comp.emulators.qemu/342206/focus=342209
>>>> http://thread.gmane.org/gmane.comp.emulators.qemu/342206/focus=342210
>>>>
>>>> As you can see under the second link above, I made the same analysis &
>>>> observations as you do now. (It took me quite long to track down the
>>>> "inexplicable" behavior of edk2's generic PCI bus driver / enumerator
>>>> that is built into OVMF.)
>>>
>>> Wow, I just re-worked this issue again from 0! I wish I have remembered
>>> those threads :(
>>> This was another symptom of the exact problem! And I remembered
>>> something about
>>> SHPC, I should have looked at those mail threads again...
>>>
>>>>
>>>> I proposed to change pci_bar_address() so that it could return, to
>>>> distinguished callers, the BAR values "under programming", even if the
>>>> command bits were clear. Then the ACPI generator would utilize this
>>>> special exception.
>>>>
>>>> Michael disagreed; in
>>>>
>>>> http://thread.gmane.org/gmane.comp.emulators.qemu/342206/focus=342242
>>>>
>>>> he wrote "[t]his is problematic - disabled BAR values have no meaning
>>>> according to the PCI spec".
>>>>
>>>
>>> Yes... because it looked like a hook for our case only,
>>> the good news is that this patch is based exactly on the fact that
>>> the BARs have no meaning if the bit is not set.
>>>
>>>> The current  solution to the problem (= we disable the SHPC) was
>>>> recommended by Michael in that message: "It might be best to add a
>>>> property to just disable shpc in the bridge so no devices reside
>>>> directly behind the pxb?"
>>>>
>>>
>>> I confess I don't exactly understand what the SHPC of the pci-2-pci
>>> bridge
>>> has to do with sibling devices on the pxb's root bus (SHPC is the
>>> hot-plug controller
>>> for the devices behind the pci-2-pci bridge).
>>>
>>> The second part I do understand, the pxb design was to not have devices
>>> directly behind
>>> the pxb, so maybe he meant that SHPC is the part of the pci-bridge that
>>> behaves like
>>> a device in the sense it requires IO/MEM resources.
>>>
>>> Bottom line, your solution for the PXB was just fine :)
>>>
>>>
>>>> In comparison, your patch doesn't change pci_bar_address(). Instead, it
>>>> modifies pci_update_mappings() *not to call* pci_bar_address(), if the
>>>> respective command bits are clear.
>>>>
>>>> I guess that could have about the same effect.
>>>>
>>>> If, unlike my patch, yours actually improves QEMU's compliance with the
>>>> PCI specs, then it's likely a good patch. (And apparently more general
>>>> than the SHPC-specific solution we have now.)
>>>
>>>
>>> Exactly! Why should a pci write to the command  register *delete*
>>> previously set resources? I am looking at it as a bug.
>>>
>>> And also updating the mappings while the Decoding bit is not enables
>>> is at least not necessary.
>>>
>>>>
>>>> I just don't know if it's a good idea to leave any old mappings active
>>>> while the BARs are being reprogrammed (with the command bits clear).
>>>>
>>>
>>> First, because the OS can't use the IO/MEM anyway, secondly the guest
>>> OS/firmware
>>> is the one that disabled the bit... (in order to program resources)
>>
>> I have something like the following in mind. Do you think it is a valid
>> (although contrived) use case?
>>
>> - guest programs some BAR and uses it (as designed / intended)
>> - guest disables command bit, modifies BAR location
>> - guest accesses *old* BAR location
>>
>> What should a guest *expect* in such a case? Is this invalid guest
>> behavior?
> 
> Yes, this is indeed invalid behavior, from the device point of view
> it is disabled. Best case scenario - the guest will see 0xffffffff,
> worst case - garbage.
> 
>>
>> If it is not invalid, then will QEMU comply with the guest's
>> expectations if your patch is applied? Pre-patch, the guest would likely
>> access a "hole" in the host bridge MMIO aperture, whereas with your
>> patch (I guess?) it still might access the device through the old (still
>> active) BAR?
>>
> 
> Since the IO is disabled, pci_bar_address will return PCI_BAR_UNMAPPED
> and *no updates will be made* pre or post this patch. It will behave the
> same
> from the guest point of view. It will still access the memory region
> of the device.
> 
> 
>> Or would QEMU prevent that just by virtue of the command bit being clear?
> 
> Again, this patch only changes the behavior in a specific case:
> when the device is disabled and the guest writes to the command register
> without
> enabling IO/MEM.

Ah, right! That's exactly what the edk2 PCI bus driver / enumerator
does. Massages the command register without touching those two bits, and...

> 
> Pre-patch -> the old BAR addresses are overridden with 0xffffffff
>              (I really think this is a bug, nobody asked for this!!)

the previously programmed BAR address gets lost. It's great that you
have the PCI knowledge to state that this is actually a bug! It had
looked fishy to me as well, but I couldn't make the same argument.

> Post-Patch -> the old BAR values are returned to the prev state ->
> correct behavior IMHO.

I agree.

> Please continue to ask questions until we get to the bottom of it. :)

Okay, I think I can now try to review this patch. See below.

> 
> Thanks,
> Marcel
> 
>>
>> Thanks
>> Laszlo
>>
>>>> In other words, what guarantees that this change will not regress
>>>> anything? (I'm not doubting -- I'm asking; I honestly don't know.)
>>>>
>>>> So I guess I'll defer to Michael on this one.
>>>
>>> Michael, do you agree with the above?
>>>
>>>>
>>>> In any case, I fully agree with your analysis of OVMF's behavior.
>>>
>>> Thanks! I looked for this bug in OVMF for some time now :)
>>> Marcel
>>>
>>>>
>>>> Thanks!
>>>> Laszlo
>>>>
>>>>> Thanks,
>>>>> Marcel
>>>>>
>>>>>    hw/pci/pci.c | 17 +++++++++++++++++
>>>>>    1 file changed, 17 insertions(+)
>>>>>
>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>>> index 168b9cc..f9127dc 100644
>>>>> --- a/hw/pci/pci.c
>>>>> +++ b/hw/pci/pci.c
>>>>> @@ -1148,6 +1148,7 @@ static void pci_update_mappings(PCIDevice *d)
>>>>>        PCIIORegion *r;
>>>>>        int i;
>>>>>        pcibus_t new_addr;
>>>>> +    uint16_t cmd = pci_get_word(d->config + PCI_COMMAND);
>>>>>
>>>>>        for(i = 0; i < PCI_NUM_REGIONS; i++) {
>>>>>            r = &d->io_regions[i];
>>>>> @@ -1156,6 +1157,22 @@ static void pci_update_mappings(PCIDevice *d)
>>>>>            if (!r->size)
>>>>>                continue;
>>>>>
>>>>> +        /*
>>>>> +         * Do not update the mappings until the command register's
>>>>> +         * Decode (I/O or memory) bit is not set. Two reasons:

I propose the following wording change (for noob's like myself :))

    Do not update this BAR's mapping if the command register's decode
    bit (I/O or memory, matching the BAR's type) is clear. Two
    reasons: ...

Spelling out "this BAR's mapping" is clearer to me -- we're looping over
the BAR's. (The end result is the same of course.)

>>>>> +         * - PCI Spec indicates that while the bit is not set
>>>>> +         *   the memory sizing is not finished.

I propose "BAR sizing".

>>>>> +         * - pci_bar_address will return PCI_BAR_UNMAPPED

I propose pci_bar_address() -- i.e., parens.

>>>>> +         *   and a previous value can be accidentally overridden

I recommend "may be unintentionally" over "can be accidentally".

>>>>> +         *   if the command register is modified (and not the BAR).
>>>>> +         * */

The last line should simply terminate the comment block -- runaway
asterisk I think.

>>>>> +        if (((r->type & PCI_BASE_ADDRESS_SPACE_IO) &&
>>>>> +             !(cmd & PCI_COMMAND_IO)) ||
>>>>> +            ((r->type != PCI_BASE_ADDRESS_SPACE_IO) &&
>>>>> +             !(cmd & PCI_COMMAND_MEMORY))) {
>>>>> +            continue;
>>>>> +        }
>>>>> +

It might be equivalent, but in the second part, I'd feel better about

    !(r->type & PCI_BASE_ADDRESS_SPACE_IO)

than

    (r->type != PCI_BASE_ADDRESS_SPACE_IO)

Or even better:

    uint16_t decode_bit = (r->type & PCI_BASE_ADDRESS_SPACE_IO) ?
                          PCI_COMMAND_IO :
                          PCI_COMMAND_MEMORY;
    if (!(cmd & decode_bit)) {
        continue;
    }


... The placement of the "continue" statement looks good.

Just some thoughts; I won't complain if the patch is committed as-is. :)

Thanks
Laszlo

>>>>>            new_addr = pci_bar_address(d, i, r->type, r->size);
>>>>>
>>>>>            /* This bar isn't changed */
>>>>>
>>>>
>>>>
>>>
>>
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]