qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] Re: [PATCH] [Seabios] Over 4GB address ranges for 64bit PCI


From: Isaku Yamahata
Subject: [Qemu-devel] Re: [PATCH] [Seabios] Over 4GB address ranges for 64bit PCI BARs
Date: Fri, 5 Nov 2010 13:16:32 +0900
User-agent: Mutt/1.5.19 (2009-01-05)

Hi.
The current BAR allocation doesn't check overflow and some patches
are floating around which aren't merged yet.
There are several issues.

- overflow check
  This should be fixed.
  Some patches are proposed. None hasn't been merged yet.
  Your patch also addresses this issue.
  http://www.seabios.org/pipermail/seabios/2010-July/000794.html
  http://www.seabios.org/pipermail/seabios/2010-October/001089.html

- abstraction of pci bar allocation 
  This is very coding detail. This should be addressed.
  The current PCI BAR allocation uses primitives. So overflow check becomes
  ugly. So it would desirable to abstract it.
  Kevin suggested to use the existing PMM framework. But I concluded PMM
  doesn't suit for it, so introduced new api. But I haven't got any feedback
  from him yet.

- >4GB 64bit bar allocation
  Your patche tries to address this issue. But it breaks PCI-to-PCI
  bridge filtering support.

  If the BAR size is huge (or there are too many BARs), the bar can't
  be allocated under 4G. So several persons want seabios to allocate
  such BARs at >4GB area complaining that OS can't use BARs that seabios
  didn't assigned.

  Others think such BAR can be left unallocated.
  Seabios role is to setup minimal basic environment for bootloader
  to boot OS, 64bit bar allocation is beyond it's role.
  bootloader/rombios usually doesn't handle BARs that is allocated
  beyond 4GB, and Modern OSes can re-arrange PCI bar allocation itself.
  So 64bit bar allocation support wouldn't be needed.

  I'm not sure if there is enough demand to support 64bit BAR allocation
  and if Kevin will accept it or not. Consensus is needed.
  What OS are you using?

- bridge filtering
  64bit BAR allocation must be aware of PCI-to-PCI bridge filtering.
  I suppose one more pci bus walk (or two) would be necessary.
  i.e. make pci bar allocation two (or three) pass.  

thanks,

On Fri, Nov 05, 2010 at 03:13:38PM +1300, Alexey Korolev wrote:
> Hi,
> 
> We have seen some issues with 64bit PCI address and large BAR region support 
> in 
> seabios. 
> 
> On attempting to register a 64bit BAR of 1GB size we had some strange failures
> in qemu. After some debugging we find out that source of the issue was in 
> seabios PCI enumeration code. 
> 
> The issue takes place because of u32 overflow in pciinit.c. 
> ….........
>              if (ALIGN(*paddr, size) + size >= BUILD_PCIPREFMEM_END) {
>                  dprintf(1,
>                          "prefmem region of (bdf 0x%x bar %d) can't be 
> mapped. "
>                          "decrease BUILD_PCIMEM_SIZE and recompile. size 
> %x\n",
>                          bdf, region_num, BUILD_PCIPREFMEM_SIZE);
>                  size = 0;
>              }
> …...........
>        if (size > 0) {
>             *paddr = ALIGN(*paddr, size);
>             pci_set_io_region_addr(bdf, region_num, *paddr);
>             *paddr += size;
>        }
> 
> If size is greater than 0xFFFFFFFF ??? BUILD_PCIPREFMEM_START (256MB), 
> this call ALIGN(*paddr, size) will always return 0. The protection test 
> fails, size remains > 0, and guest memory mapping will be corrupted.
> 
> We have found that a very similar problem was discussed previously:
> 
> http://www.mail-archive.com/address@hidden/msg38090.html
> 
> The discussion also touches on the question of 64bit addressing for PCI BARs.
> In the current implementation regardless of whether the PCI BAR type is 64bit 
> or not
> the addressable range will be limited to the first 4GB. We also want to have 
> "true" 64bit
> addressable range for PCI BARs.
> 
> So to solve these issues some changes were made. 
> Tracking for overflows has been added as well as support for 64bit range. 
> To support the full 64bit address range a pci_bios_64bit_addr variable has 
> been added. 
> The 64bit range can be used only if BAR has PCI_BASE_ADDRESS_MEM_TYPE_64
> attribute and the given region doesn't fit in first 4GB.  The patch has been
> tested on Linux and BSD guest OS's and it appears to work fine. 
> 
> 
> 
> Signed-off-by Alexey Korolev <address@hidden>
> Signed-off-by Stephen Donelly <address@hidden>
> ---
>  pciinit.c |   83 
> ++++++++++++++++++++++++++++++++------------------------------
>  1 file changed, 43 insertions(+), 40 deletions(-)
> 
> diff --git a/src/pciinit.c b/src/pciinit.c
> index 0346423..86c8137 100644
> --- a/src/pciinit.c
> +++ b/src/pciinit.c
> @@ -20,6 +20,7 @@ static void pci_bios_init_device_in_bus(int bus);
>  static u32 pci_bios_io_addr;
>  static u32 pci_bios_mem_addr;
>  static u32 pci_bios_prefmem_addr;
> +static u64 pci_bios_64bit_addr;
>  /* host irqs corresponding to PCI irqs A-D */
>  const u8 pci_irqs[4] = {
>      10, 10, 11, 11
> @@ -56,9 +57,11 @@ static int pci_bios_allocate_region(u16 bdf, int 
> region_num)
>  {
>      u32 *paddr;
>      u32 ofs = pci_bar(bdf, region_num);
> -
>      u32 old = pci_config_readl(bdf, ofs);
>      u32 mask;
> +    u32 limit_addr;
> +    int is_64bit;
> +
>      if (region_num == PCI_ROM_SLOT) {
>          mask = PCI_ROM_ADDRESS_MASK;
>          pci_config_writel(bdf, ofs, mask);
> @@ -73,50 +76,49 @@ static int pci_bios_allocate_region(u16 bdf, int 
> region_num)
>      pci_config_writel(bdf, ofs, old);
>  
>      u32 size = (~(val & mask)) + 1;
> -    if (val != 0) {
> -        if (val & PCI_BASE_ADDRESS_SPACE_IO) {
> -            paddr = &pci_bios_io_addr;
> -            if (ALIGN(*paddr, size) + size >= 64 * 1024) {
> -                dprintf(1,
> -                        "io region of (bdf 0x%x bar %d) can't be mapped.\n",
> -                        bdf, region_num);
> -                size = 0;
> -            }
> -        } else if ((val & PCI_BASE_ADDRESS_MEM_PREFETCH) &&
> -                 /* keep behaviour on bus = 0 */
> -                 pci_bdf_to_bus(bdf) != 0 &&
> -                 /* If pci_bios_prefmem_addr == 0, keep old behaviour */
> -                 pci_bios_prefmem_addr != 0) {
> -            paddr = &pci_bios_prefmem_addr;
> -            if (ALIGN(*paddr, size) + size >= BUILD_PCIPREFMEM_END) {
> -                dprintf(1,
> -                        "prefmem region of (bdf 0x%x bar %d) can't be 
> mapped. "
> -                        "decrease BUILD_PCIMEM_SIZE and recompile. size 
> %x\n",
> -                        bdf, region_num, BUILD_PCIPREFMEM_SIZE);
> -                size = 0;
> -            }
> -        } else {
> -            paddr = &pci_bios_mem_addr;
> -            if (ALIGN(*paddr, size) + size >= BUILD_PCIMEM_END) {
> +    if (!val || !size)
> +        return 0;
> +
> +    is_64bit = !(val & PCI_BASE_ADDRESS_SPACE_IO) &&
> +        (val & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == 
> PCI_BASE_ADDRESS_MEM_TYPE_64;
> +
> +    if (val & PCI_BASE_ADDRESS_SPACE_IO) {
> +        paddr = &pci_bios_io_addr;
> +        limit_addr = 64 * 1024;
> +    } else if ((val & PCI_BASE_ADDRESS_MEM_PREFETCH) &&
> +             /* keep behaviour on bus = 0 */
> +             pci_bdf_to_bus(bdf) != 0 &&
> +             /* If pci_bios_prefmem_addr == 0, keep old behaviour */
> +             pci_bios_prefmem_addr != 0) {
> +        paddr = &pci_bios_prefmem_addr;
> +        limit_addr = BUILD_PCIPREFMEM_END;
> +    } else {
> +        paddr = &pci_bios_mem_addr;
> +        limit_addr = BUILD_PCIMEM_END;
> +    }
> +
> +    /* The region is out of allowed 32bit mapping range */
> +    if ((ALIGN(*paddr, size) + size >= limit_addr) ||
> +        (ALIGN(*paddr, size) + size <= *paddr)) {
> +            if (!is_64bit) {
>                  dprintf(1,
>                          "mem region of (bdf 0x%x bar %d) can't be mapped. "
> -                        "increase BUILD_PCIMEM_SIZE and recompile. size 
> %x\n",
> -                        bdf, region_num, BUILD_PCIMEM_SIZE);
> -                size = 0;
> +                        "Region size %x\n", bdf, region_num, size);
> +                return 0;
>              }
> -        }
> -        if (size > 0) {
> -            *paddr = ALIGN(*paddr, size);
> -            pci_set_io_region_addr(bdf, region_num, *paddr);
> -            *paddr += size;
> -        }
> +            pci_bios_64bit_addr = ALIGN(pci_bios_64bit_addr, size);
> +            pci_set_io_region_addr(bdf, region_num, 
> (u32)pci_bios_64bit_addr);
> +            pci_set_io_region_addr(bdf, region_num + 1,
> +                                   (u32)(pci_bios_64bit_addr >> 32));
> +            pci_bios_64bit_addr += size;
> +            return is_64bit;
>      }
>  
> -    int is_64bit = !(val & PCI_BASE_ADDRESS_SPACE_IO) &&
> -        (val & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == 
> PCI_BASE_ADDRESS_MEM_TYPE_64;
> -    if (is_64bit && size > 0) {
> -        pci_config_writel(bdf, ofs + 4, 0);
> -    }
> +    *paddr = ALIGN(*paddr, size);
> +    pci_set_io_region_addr(bdf, region_num, *paddr);
> +    if (is_64bit)
> +        pci_set_io_region_addr(bdf, region_num + 1, 0);
> +    *paddr += size;
>      return is_64bit;
>  }
>  
> @@ -409,6 +411,7 @@ pci_setup(void)
>      pci_bios_io_addr = 0xc000;
>      pci_bios_mem_addr = BUILD_PCIMEM_START;
>      pci_bios_prefmem_addr = BUILD_PCIPREFMEM_START;
> +    pci_bios_64bit_addr = ALIGN(RamSizeOver4G + 0x100000000ull, 
> 0x100000000ull);
>  
>      pci_bios_init_bus();
>  
> 
> 

-- 
yamahata



reply via email to

[Prev in Thread] Current Thread [Next in Thread]