[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [Bug 1297651] [NEW] KVM create a win7 guest with Qemu,
From: |
Michael S. Tsirkin |
Subject: |
Re: [Qemu-devel] [Bug 1297651] [NEW] KVM create a win7 guest with Qemu, it boots up fail |
Date: |
Wed, 26 Mar 2014 16:54:31 +0200 |
On Wed, Mar 26, 2014 at 02:48:29PM +0100, Igor Mammedov wrote:
> On Wed, 26 Mar 2014 14:58:28 +0200
> "Michael S. Tsirkin" <address@hidden> wrote:
>
> > On Wed, Mar 26, 2014 at 01:28:02PM +0100, Laszlo Ersek wrote:
> > > On 03/26/14 11:31, Michael S. Tsirkin wrote:
> > >
> > > > On Wed, Mar 26, 2014 at 06:45:10AM -0000, Robert Hu wrote:
> > >
> > > >> Date: Mon Mar 17 17:05:16 2014 +0100
> > > >>
> > > >> i386/acpi-build: allow more than 255 elements in CPON
> > > >>
> > > >> The build_ssdt() function builds a number of AML objects that are
> > > >> related
> > > >> to CPU hotplug, and whose IDs form a contiguous sequence of APIC
> > > >> IDs.
> > > >> (APIC IDs are in fact discontiguous, but this is the traditional
> > > >> interface: build a contiguous sequence from zero up that covers all
> > > >> possible APIC IDs.) These objects are:
> > > >>
> > > >> - a Processor() object for each VCPU,
> > > >> - a NTFY method, with one branch for each VCPU,
> > > >> - a CPON package with one element (hotplug status byte) for each
> > > >> VCPU.
> > > >>
> > > >> The build_ssdt() function currently limits the *count* of processor
> > > >> objects, and NTFY branches, and CPON elements, in 0xFF (see the
> > > >> assignment
> > > >> to "acpi_cpus"). This allows for an inclusive APIC ID range of
> > > >> [0..254].
> > > >> This is incorrect, because the highest APIC ID that we otherwise
> > > >> allow a
> > > >> VCPU to take is 255.
> > > >>
> > > >> In order to extend the maximum count to 256, and the traversed
> > > >> APIC ID
> > > >> range correspondingly to [0..255]:
> > > >> - the Processor() objects need no change,
> > > >> - the NTFY method also needs no change,
> > > >> - the CPON package must be updated, because it is defined with a
> > > >> DefPackage, and the number of elements in such a package can be
> > > >> at most
> > > >> 255. We pick a DefVarPackage instead.
> > > >>
> > > >> We replace the Op byte, and the encoding of the number of elements.
> > > >> Compare:
> > > >>
> > > >> DefPackage := PackageOp PkgLength NumElements
> > > >> PackageElementList
> > > >> DefVarPackage := VarPackageOp PkgLength VarNumElements
> > > >> PackageElementList
> > > >>
> > > >> PackageOp := 0x12
> > > >> VarPackageOp := 0x13
> > > >
> > > >
> > > > I think I know what's going on here: the specification says:
> > > >
> > > > The ASL compiler can emit two different AML opcodes for a Package
> > > > declaration, either PackageOp or VarPackageOp. For small, fixed-length
> > > > packages, the PackageOp is used and this
> > > >
> > > > opcode is compatible with ACPI 1.0. A VarPackageOp will be emitted if
> > > > any of the following conditions are true:
> > > > •
> > > > The NumElements argument is a TermArg that can only be resolved at
> > > > runtime.
> > > > •
> > > > At compile time, NumElements resolves to a constant that is larger than
> > > > 255.
> > > > •
> > > > The PackageList contains more than 255 initializer elements.
> > > >
> > > >
> > > > So we clearly violate this rule.
> > >
> > > I did see this passage of the spec, but it is not relevant. It is about
> > > what the ASL compiler does. It comes from:
> > >
> > > 19 ACPI Source Language (ASL)Reference
> > > 19.5 ASL Operator Reference
> > > 19.5.98 Package (Declare Package Object)
> > >
> > > We do not have an ASL compiler at hand.
> >
> > True. But I think the spec and guests simply didn't envision writing AML by
> > hand :)
> >
> > > The specification nowhere restricts VarPackageOp to > 255.
> > >
> > > However, what I *did* mess up is compatibility with ACPI 1.0. Just below
> > > the quoted part, there's also this:
> > >
> > > Note: The ability to create variable-sized packages was first
> > > introduced in ACPI 2.0. ACPI 1.0 only allowed fixed-size
> > > packages with up to 255 elements.
> > >
> > > I forgot that the header of the containing table stated the ACPI version:
> > >
> > > /* Copy header and patch values in the S3_ / S4_ / S5_ packages */
> > > ssdt_ptr = acpi_data_push(table_data, sizeof(ssdp_misc_aml));
> > >
> > > and
> > >
> > > DefinitionBlock ("ssdt-misc.aml", "SSDT", 0x01, "BXPC", "BXSSDTSUSP",
> > > 0x1)
> > > ^^^^
> > > ComplianceRevision
> > >
> > > So my patch generates ACPI 2.0+ contents for an 1.0 table.
> > >
> > > > The following seems to fix the issue - still testing. Can you confirm
> > > > please?
> > >
> > > This patch only restricts the bug to a subset of cases, but it doesn't
> > > fix it.
> > >
> > > > However the question we should ask is whether
> > > > it's a good idea to allow hotplug ID values that might
> > > > make guests fail to boot.
> > > >
> > > > How about limiting ACPI_CPU_HOTPLUG_ID_LIMIT to 255?
> > >
> > > I think it's not the package size / APIC ID value per se that breaks the
> > > boot, but the incompatibility between the ACPI revision stated in the
> > > SSDT header, and the construct in the table.
> >
> >
> > It would be interesting to try tweaking the table version and seeing
> > what happens. Does it help any guests?
> It won't help XP based guests since they support 1.0 revision only.
XP seems to be happy even if I set DSDT revision to 2.
Having said that, it still crashes if you try to use VarPackageOp.
> >
> > > >
> > > > We never allowed > 255 in the past, is it worth the
> > > > maintainance headaches?
> > > >
> > > >
> > > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > > > index f1054dd..7597517 100644
> > > > --- a/hw/i386/acpi-build.c
> > > > +++ b/hw/i386/acpi-build.c
> > > > @@ -1055,9 +1055,21 @@ build_ssdt(GArray *table_data, GArray *linker,
> > > >
> > > > {
> > > > GArray *package = build_alloc_array();
> > > > - uint8_t op = 0x13; /* VarPackageOp */
> > > > + uint8_t op;
> > > > +
> > > > + /*
> > > > + * Note: The ability to create variable-sized packages was
> > > > first introduced in ACPI 2.0. ACPI 1.0 only
> > > > + * allowed fixed-size packages with up to 255 elements.
> > > > + * Windows guests up to win2k8 fail when VarPackageOp is
> > > > used.
> > > > + */
> > > > + if (acpi_cpus <= 255) {
> > > > + op = 0x12; /* PackageOp */
> > > > + build_append_byte(package, acpi_cpus); /* NumElements
> > > > */
> > > > + } else {
> > > > + op = 0x13; /* VarPackageOp */
> > > > + build_append_int(package, acpi_cpus); /*
> > > > VarNumElements */
> > > > + }
> > > >
> > > > - build_append_int(package, acpi_cpus); /* VarNumElements */
> > > > for (i = 0; i < acpi_cpus; i++) {
> > > > uint8_t b = test_bit(i, cpu->found_cpus) ? 0x01 : 0x00;
> > > > build_append_byte(package, b);
> > > >
> > >
> > > The patch will mask the problem for most of the cases, but when
> > > VarPackageOp is used, it will be broken just the same (because the ACPI
> > > revision in the SSDT header will still mismatch).
> >
> > yes but modern guests don't seem to care, and it was already broken in
> > 1.7, wasn't it?
> >
> > > Here's my proposal:
> > > - I can post a patch that changes the SSDT DSL files, *and* the DSDT
> > > files (q35-acpi-dsdt.dsl acpi-dsdt.dsl), to state an ACPI revision of
> > > 2.0. (The ACPI revision of the DSDT file determines integer sizes for
> > > SSDTs too, so we can't just bump the SSDTs to 2.0)
> >
> > It should not be a problem but I'm not sure I get this comment. Can you
> > explain?
> >
> > > - Or I suggest to revert my patches for 2.0.
> > >
> > > You probably won't like bumping the ACPI rev in DSDT/SSDT, for various
> > > compatibility reasons, so I think I suggest to revert these two patches
> > > of mine. It's now clear to me that this VCPU hotplug limit cannot be
> > > lifted without much more intrusive (and guest visible) changes. Sorry
> > > about missing this fact in my original submission.
> > >
> > > Thanks,
> > > Laszlo
> >
> > I have a problem with both approaches :)
> >
> > If we want to change ACPI rev, I think we should do this
> > conditionally when max_cpus > 255.
> > Would be worth it if this fixes some guests.
> >
> > As for reverting, I think it's a problem that we seem to
> > allow max_cpus = 256 but then it doesn't really work.
> more clean would be to abort if CPON index (i.e. APIC ID)
> is more than 255. That would affect small number of weird
> topologies but sould be fine for most usecases.
Any idea how to trigger this behaviour?
> >
> >
> >
> > I think the patch I posted might be good enough for 2.0.
> > It seems to make things work for new guests, and old guests
> > work as long as you don't specify max_cpus = 255.
> > The config with a high max_cpus value never really worked so
> > not a big deal.
> >
> >
> > Alternatively limit max_cpus to 255, to make it fail cleanly.
> that won't work since max_cpus is not equivalent to index in CPON
> which depends on topology as well.