[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [BUG] Guest OS hangs on boot when 64bit BAR present (kv
From: |
Michael S. Tsirkin |
Subject: |
Re: [Qemu-devel] [BUG] Guest OS hangs on boot when 64bit BAR present (kvm-apic -msi resource conflict) |
Date: |
Wed, 13 Feb 2013 12:48:24 +0200 |
On Wed, Feb 13, 2013 at 11:34:52AM +0100, Jan Kiszka wrote:
> On 2013-02-13 11:24, Michael S. Tsirkin wrote:
> > On Wed, Feb 13, 2013 at 06:06:37PM +1300, Alexey Korolev wrote:
> >> Sometime ago I reported an issue about guest OS hang when 64bit BAR
> >> present.
> >> http://lists.gnu.org/archive/html/qemu-devel/2012-01/msg03189.html
> >> http://lists.gnu.org/archive/html/qemu-devel/2012-12/msg00413.html
> >>
> >> Some more investigation has been done, so in this post I'll try to explain
> >> why it happens and offer possible solutions:
> >>
> >> *When the issue happens*
> >> The issue occurs on Linux guest OS if kernel version <2.6.36
> >> A Guest OS hangs on boot when a 64bit PCI BAR is present in a system (if
> >> we use ivshmem driver for example) and occupies range within first
> >> 4 GB.
> >>
> >> *How to reproduce*
> >> I used the following qemu command to reproduce the case:
> >> /usr/local/bin/qemu-system-x86_64 -M pc-1.3 -enable-kvm -m 2000 -smp
> >> 1,sockets=1,cores=1,threads=1 -name Rh5332 -chardev
> >> socket,id=charmonitor,path=/var/lib/libvirt/qemu/Rh5332.monitor,server,nowait
> >> -mon chardev=charmonitor,id=monitor,mode=readline -rtc
> >> base=utc -boot cd -drive
> >> file=/home/akorolev/rh5332.img,if=none,id=drive-ide0-0-0,format=raw -device
> >> ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -chardev
> >> file,id=charserial0,path=/home/akorolev/serial.log -device
> >> isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 -k en-us
> >> -vga cirrus -device ivshmem,shm,size=32M-device
> >> virtio-balloon-pci,id=balloon0
> >>
> >> Tried different guests: Centos 5.8 64bit, RHEL 5.3 32bit, FC 12 64bit on
> >> all machines hang occurs in 100% cases
> >>
> >> *Why it happens*
> >> The issue basically comes from Linux PCI enumeration code.
> >>
> >> The OS enumerates 64BIT bars when device is enabled using the following
> >> procedure.
> >> 1. Write all FF's to lower half of 64bit BAR
> >> 2. Write address back to lower half of 64bit BAR
> >> 3. Write all FF's to higher half of 64bit BAR
> >> 4. Write address back to higher half of 64bit BAR
> >>
> >> For qemu it means that qemu pci_default_write_config() recevies all FFs
> >> for lower part of the 64bit BAR.
> >> Then it applies the mask and converts the value to "All FF's - size + 1"
> >> (FE000000 if size is 32MB).
> >>
> >> So for short period of time the range [0xFE000000 - 0xFFFFFFFF] will be
> >> occupied by ivshmem resource.
> >> For some reason it is lethal for further boot process.
> >>
> >> We have found that boot process screws up completely if kvm-apic-msi range
> >> is overlapped even for short period of time. (We still don't
> >> know why it happens, hope that the qemu maintainers can answer?)
> >>
> >> If we look at kvm-apic-msi memory region it is a non-overlapable memory
> >> region with hardcoded address range [0xFEE00000 - 0xFEF00000].
> >
> > Thanks for looking into this!
> >
> >> Here is a log we collected from render_memory_regions:
> >>
> >> system overlap 0 pri 0 [0x0 - 0x7fffffffffffffff]
> >> kvmvapic-rom overlap 1 pri 1000 [0xca000 - 0xcd000]
> >> pc.ram overlap 0 pri 0 [0xca000 - 0xcd000]
> >> ++ pc.ram [0xca000 - 0xcd000] is added to view
> >> ....................
> >> smram-region overlap 1 pri 1 [0xa0000 - 0xc0000]
> >> pci overlap 0 pri 0 [0xa0000 - 0xc0000]
> >> cirrus-lowmem-container overlap 1 pri 1 [0xa0000 - 0xc0000]
> >> cirrus-low-memory overlap 0 pri 0 [0xa0000 - 0xc0000]
> >> ++cirrus-low-memory [0xa0000 - 0xc0000] is added to view
> >> kvm-ioapic overlap 0 pri 0 [0xfec00000 - 0xfec01000]
> >> ++kvm-ioapic [0xfec00000 - 0xfec01000] is added to view
> >> pci-hole64 overlap 0 pri 0 [0x100000000 - 0x4000000100000000]
> >> pci overlap 0 pri 0 [0x100000000 - 0x4000000100000000]
> >> pci-hole overlap 0 pri 0 [0x7d000000 - 0x100000000]
> >
> > So we have ioapic and pci-hole which should be non-overlap,
> > actually overlap each other.
> > Isn't this a problem?
> >
> >> pci overlap 0 pri 0 [0x7d000000 - 0x100000000]
> >> ivshmem-bar2-container overlap 1 pri 1 [0xfe000000 -
> >> 0x100000000]
> >> ivshmem.bar2 overlap 0 pri 0 [0xfe000000 - 0x100000000]
> >> ++ivshmem.bar2 [0xfe000000 - 0xfec00000] is added to view
> >> ++ivshmem.bar2 [0xfec01000 - 0x100000000] is added to view
> >> ivshmem-mmio overlap 1 pri 1 [0xfebf1000 - 0xfebf1100]
> >> e1000-mmio overlap 1 pri 1 [0xfeba0000 - 0xfebc0000]
> >> cirrus-mmio overlap 1 pri 1 [0xfebf0000 - 0xfebf1000]
> >> cirrus-pci-bar0 overlap 1 pri 1 [0xfa000000 - 0xfc000000]
> >> vga.vram overlap 1 pri 1 [0xfa000000 - 0xfa800000]
> >> ++vga.vram [0xfa000000 - 0xfa800000] is added to view
> >> cirrus-bitblt-mmio overlap 0 pri 0 [0xfb000000 -
> >> 0xfb400000]
> >> ++cirrus-bitblt-mmio [0xfb000000 - 0xfb400000] is added to
> >> view
> >> cirrus-linear-io overlap 0 pri 0 [0xfa000000 - 0xfa800000]
> >> pc.bios overlap 0 pri 0 [0xfffe0000 - 0x100000000]
> >> ram-below-4g overlap 0 pri 0 [0x0 - 0x7d000000]
> >> pc.ram overlap 0 pri 0 [0x0 - 0x7d000000]
> >> ++pc.ram [0x0 - 0xa0000] is added to view
> >> ++pc.ram [0x100000 - 0x7d000000] is added to view
> >> kvm-apic-msi overlap 0 pri 0 [0xfee00000 - 0xfef00000]
> >>
> >> As you can see from log the kvm-apic-msi is enumarated last when range
> >> [0xfee00000 - 0xfef00000] is already occupied by ivshmem.bar2
> >> [0xfec01000 - 0x100000000].
> >>
> >>
> >> *Possible solutions*
> >> Solution 1. Probably the best would be adding the rule that regions which
> >> may not be overlapped are added to view first (In in other words
> >> regions which must not be overlapped have the highest priority). Please
> >> find patch in the following message.
> >>
> >> Solution 2. Raise priority of kvm-apic-msi resource. This is a bit
> >> misleading solution, as priority is only applicable for overlap-able
> >> regions, but this region must not be overlapped.
> >>
> >> Solution 3. Fix the issue at PCI level. Track if the resource is 64bit and
> >> apply changes if both parts of 64bit BAR are programmed. (It
> >> appears that real PCI bus controllers are smart enough to track 64bit BAR
> >> writes on PC, so qemu could do the same? Drawbacks are that
> >> tracking PCI writes is bit cumbersome, and such tracking may appear to
> >> somebody as a hack)
> >>
> >>
> >> Alexey
> >
> > I have to say I don't understand what does the overlap attribute
> > supposed to do, exactly.
> >
> > In practice it currently seems to be ignored.
> >
> > How about we drop it and rely exclusively on priorities?
>
> Isn't it a guest bug if it maps a PCI resource over the APIC window? How
> should the guest access that region then? With in-kernel APIC, it will
> remain unaccessible, even if you raise the prio of that PCI region over
> the APIC as the kernel will prioritize the latter.
Yes that would be fine (it's a temporary condition while sizing the
BARs). The problem, I think, is that currently apic has default priority
(0) so PCI regions can mask it during sizing.
> > It's probably easier to just give the apic high priority.
> > there's precedent - kvmvapic does:
> >
> > memory_region_add_subregion_overlap(as, rom_paddr, &s->rom, 1000);
> >
> > Jan, could you please clarify where did the value 1000 come from?
>
> To ensure that the VAPIC region is always accessible.
>
> >
> > Maybe we need some predefined priority values in memory.h
> >
>
> That makes some sense, sure.
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
> Corporate Competence Center Embedded Linux
- [Qemu-devel] [BUG] Guest OS hangs on boot when 64bit BAR present (kvm-apic -msi resource conflict), Alexey Korolev, 2013/02/13
- [Qemu-devel] [PATCH for-1.4] pc: tag apic as overlap region, Michael S. Tsirkin, 2013/02/19
- Re: [Qemu-devel] [PATCH for-1.4] pc: tag apic as overlap region, Jan Kiszka, 2013/02/19
- Re: [Qemu-devel] [PATCH for-1.4] pc: tag apic as overlap region, Peter Maydell, 2013/02/19
- Re: [Qemu-devel] [PATCH for-1.4] pc: tag apic as overlap region, Michael S. Tsirkin, 2013/02/19
- Re: [Qemu-devel] [PATCH for-1.4] pc: tag apic as overlap region, Jan Kiszka, 2013/02/19
- Re: [Qemu-devel] [PATCH for-1.4] pc: tag apic as overlap region, Peter Maydell, 2013/02/19
- Re: [Qemu-devel] [PATCH for-1.4] pc: tag apic as overlap region, Jan Kiszka, 2013/02/19
Re: [Qemu-devel] [PATCH for-1.4] pc: tag apic as overlap region, Alexey Korolev, 2013/02/21