qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2] hw/arm/virt-acpi - reserve ECAM space as PNP


From: Laszlo Ersek
Subject: Re: [Qemu-devel] [PATCH v2] hw/arm/virt-acpi - reserve ECAM space as PNP0C02 device
Date: Mon, 16 Jan 2017 23:35:04 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.6.0

On 01/16/17 22:23, Ard Biesheuvel wrote:
> On 16 January 2017 at 21:13, Laszlo Ersek <address@hidden> wrote:
>> On 01/16/17 20:31, Ard Biesheuvel wrote:
>>> On 16 January 2017 at 18:20, Peter Maydell <address@hidden> wrote:
>>>> On 16 January 2017 at 17:30, Ard Biesheuvel <address@hidden> wrote:
>>>>> On 16 January 2017 at 17:25, Peter Maydell <address@hidden> wrote:
>>>>>> On 13 January 2017 at 17:32, Ard Biesheuvel <address@hidden> wrote:
>>>>>>> Linux for arm64 v4.10 and later will complain if the ECAM config space 
>>>>>>> is
>>>>>>> not reserved in the ACPI namespace:
>>>>>>>
>>>>>>>   acpi PNP0A08:00: [Firmware Bug]: ECAM area [mem 
>>>>>>> 0x3f000000-0x3fffffff] not reserved in ACPI namespace
>>>>>>>
>>>>>>> The rationale is that OSes that don't consume the MCFG table should 
>>>>>>> still
>>>>>>> be able to infer that the PCI config space MMIO region is occupied.
>>>>>>>
>>>>>>> So update the ACPI table generation routine to add this reservation.
>>>>>>>
>>>>>>> Signed-off-by: Ard Biesheuvel <address@hidden>
>>>>>>> ---
>>>>>>>  hw/arm/virt-acpi-build.c | 7 +++++++
>>>>>>>  1 file changed, 7 insertions(+)
>>>>>>>
>>>>>>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>>>>>>> index 085a61117378..50d52f685f68 100644
>>>>>>> --- a/hw/arm/virt-acpi-build.c
>>>>>>> +++ b/hw/arm/virt-acpi-build.c
>>>>>>> @@ -310,6 +310,13 @@ static void acpi_dsdt_add_pci(Aml *scope, const 
>>>>>>> MemMapEntry *memmap,
>>>>>>>      Aml *dev_rp0 = aml_device("%s", "RP0");
>>>>>>>      aml_append(dev_rp0, aml_name_decl("_ADR", aml_int(0)));
>>>>>>>      aml_append(dev, dev_rp0);
>>>>>>> +
>>>>>>> +    Aml *dev_res0 = aml_device("%s", "RES0");
>>>>>>> +    aml_append(dev_res0, aml_name_decl("_HID", aml_string("PNP0C02")));
>>>>>>> +    crs = aml_resource_template();
>>>>>>> +    aml_append(crs, aml_memory32_fixed(base_ecam, size_ecam, 
>>>>>>> AML_READ_WRITE));
>>>>>>> +    aml_append(dev_res0, aml_name_decl("_CRS", crs));
>>>>>>> +    aml_append(dev, dev_res0);
>>>>>>>      aml_append(scope, dev);
>>>>>>>  }
>>>>>>
>>>>>> This needs to be controlled via the machine class back-compat
>>>>>> machinery in hw/arm/virt.c so that it only happens for virt-2.9
>>>>>> and later.
>>>>>>
>>>>>
>>>>> Why exactly?
>>>>
>>>> Because the "virt-2.8" machine has to present to the guest
>>>> exactly what "virt" did as of the QEMU 2.8 release, including
>>>> any bugs or missing things we happened to have in our ACPI
>>>> tables. This allows cross-version compatibility (including
>>>> VM migration). Drew will have a more detailed explanation
>>>> if you need it.
>>>>
>>>
>>> I suspected as much.
>>>
>>> But in this case, I am not sure if it is worth the trouble: the
>>> generated data is only consumed at boot time by the firmware, and I
>>> suppose migration involves freezing a VM, including whatever resident
>>> firmware image was used to boot the OS, and so this is unlikely to
>>> affect migration.
>>>
>>> But I will let Drew explain ...
>>
>> The PCI Firmware Specification (rev 3.1) says in 4.1.2. "MCFG Table
>> Description": "The resources can *optionally* be returned in [...]
>> EFIGetMemoryMap as reserved memory [...]". (Emphasis mine.) Linux seems
>> to *insist* on this kind of reservation however.
>>
> 
> No, not at the UEFI level but at the ACPI level. Reservations in the
> UEFI memory map describe memory not MMIO space
> 
>> PNP0C02 is "General ID for reserving resources required by PnP
>> motherboard registers. (Not device specific.)", according to
>> <http://www.plasma-online.de/english/identify/serial/pnp_id_pnp.html>.
>> So what this patch does is reserve a memory area through ACPI,
>> practically as an unspecified "platform resource".
>>
> 
> This has been discussed at great length on the linux mailing lists
> 
> https://patchwork.kernel.org/patch/9453149/
> 
>> There's an alternative that's contained entirely in the firmware. You
>> can cover the MMCONFIG area in ArmVirtQemu with an EfiReservedMemoryType
>> memory map entry (by producing an appropriate memalloc HOB in PEI, or by
>> calling the appropriate gDS memory space map functions in DXE). OVMF
>> does the former (memalloc HOB).
>>
>> In ArmVirtQemu, we grab the MMCONFIG range from "pci-host-ecam-generic",
>> from QEMU's DTB. If you don't dislike the idea, we could cover the range
>> as well, right in "ArmVirtPkg/Library/FdtPciPcdProducerLib". That lib
>> instance already sets the base address PCD, and makes sure that the
>> relevant code is executed only once (in whatever driver module the
>> library instance was built into). You could call the gDS functions
>> mentioned above from that spot. (The library instance is already
>> restricted to DXE_DRIVER and UEFI_DRIVER modules.)
>>
> 
> In general, I think describing MMIO in the UEFI memory map is not very
> useful, and counter to the spec, which mentions that the memory map
> describes memory ("however it is used"), not memory *space* (unless
> UEFI itself needs to access it to implement runtime services)
> 

The UEFI memory map will reflect allocations from the GCD memory space,
for the Reserved and MMIO types. See "Figure 2. GCD Memory State
Transitions" in "7.2.2 GCD Memory Resources", Vol2 of the PI spec.

See also "9.7.1 UEFI Boot Services Dependencies" in the same,

  9.7.1.8 GetMemoryMap()

  The GetMemoryMap() implementation must include into the UEFI memory
  map all GCD map entries of types EfiGcdMemoryTypeReserved and
  EfiPersistentMemory, and all GCD map entries of type
  EfiGcdMemoryTypeMemoryMappedIo that have EFI_MEMORY_RUNTIME attribute
  set.

(Note that I wrote Reserved earlier, not MMIO.)

However, you are right that *just* the UEFI memmap entry is not
sufficient, according to the PCI firmware spec. (Regardless of the fact
that in practice, just the memmap entry does keep Linux happy. Or is it
about to change?)

Namely, looking again at the spot I quoted above (and it's also quoted
in the kernel docs patch that you linked above, under ref [6]), we find

    If the operating system does not natively comprehend reserving the
    MMCFG region, the MMCFG region must be reserved by firmware. The
    address range reported in the MCFG table or by _CBA method (see
    Section 4.1.3) must be reserved by declaring a motherboard resource.
    For most systems, the motherboard resource would appear at the root
    of the ACPI namespace (under \_SB) in a node with a _HID of EISAID
    (PNP0C02), and the resources in this case should not be claimed in
    the root PCI bus’s _CRS. The resources can optionally be returned in
    Int15 E820 or EFIGetMemoryMap as reserved memory but must always be
    reported through ACPI as a motherboard resource.

Therefore I agree that reserving the MMCONFIG area via a PNP0C02 object
in QEMU's ACPI payload improves spec conformance.

(Actually, the argument can be made for x86/Q35 as well. Adding Marcel
and MST.)

... Beyond the machine-type dependency raised by Peter (which I gather
is still being discussed), I suggest that the commit message of this
patch quote the relevant passage from the PCI fw spec in full (see
above, or in the kernel docs patch).

Thanks!
Laszlo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]