qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH V14 2/3] pc: add a Virtual Machine Generation ID


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH V14 2/3] pc: add a Virtual Machine Generation ID device
Date: Wed, 4 Mar 2015 20:12:31 +0100

On Wed, Mar 04, 2015 at 05:33:42PM +0100, Igor Mammedov wrote:
> On Wed, 4 Mar 2015 16:31:39 +0100
> "Michael S. Tsirkin" <address@hidden> wrote:
> 
> > On Wed, Mar 04, 2015 at 04:14:44PM +0100, Igor Mammedov wrote:
> > > On Wed, 4 Mar 2015 14:49:00 +0100
> > > "Michael S. Tsirkin" <address@hidden> wrote:
> > > 
> > > > On Wed, Mar 04, 2015 at 02:12:32PM +0100, Igor Mammedov wrote:
> > > > > On Wed, 4 Mar 2015 13:11:48 +0100
> > > > > "Michael S. Tsirkin" <address@hidden> wrote:
> > > > > 
> > > > > > On Tue, Mar 03, 2015 at 09:33:51PM +0100, Igor Mammedov wrote:
> > > > > > > On Tue, 3 Mar 2015 18:35:39 +0100
> > > > > > > "Michael S. Tsirkin" <address@hidden> wrote:
> > > > > > > 
> > > > > > > > On Tue, Mar 03, 2015 at 05:18:14PM +0100, Igor Mammedov wrote:
> > > > > > > > > Based on Microsoft's sepecifications (paper can be dowloaded 
> > > > > > > > > from
> > > > > > > > > http://go.microsoft.com/fwlink/?LinkId=260709), add a device
> > > > > > > > > description to the SSDT ACPI table and its implementation.
> > > > > > > > > 
> > > > > > > > > The GUID is set using "vmgenid.uuid" property.
> > > > > > > > > 
> > > > > > > > > Example of using vmgenid device:
> > > > > > > > >  -device 
> > > > > > > > > vmgenid,id=FOO,uuid="324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87"
> [...]
> 
> > > > 
> > > > > > 
> > > > > > 
> > > > > > BTW, why do we need to stick vmgen_buf_paddr in the info?
> > > > > Because according to MS spec device should have ADDR object
> > > > > with physical buffer address packed in Package(2). So that
> > > > > Windows could read value from there.
> > > > > 
> > > > > [...]
> > > > 
> > > > Yes but why not read the property when and where we
> > > > need it?
> > > It's basically to fit the style used in acpi-build.c
> > > where we collect info by reading properties in
> > >  acpi_get_pm_info(), acpi_get_misc_info(), acpi_get_pci_info() ...
> > > and then just use pm, misc, pci in build_ssdt()
> > > should we drop all above and just inline it in build_ssdt() ?
> > 
> > The issue is you have two items to track here:
> > - addr - you stick that in the info struct
> > - full object address - you don't
> > an inconsistency that I dislike.
> What is "full object address"?

where you look up the vmgen id pci device.

> 
> > > > > > > > > +    name = g_strdup_printf("PCI0%s.S%.02X_", name ? name : 
> > > > > > > > > "",
> > > > > > > > > pdev->devfn);
> > > > > > > > > +    g_free(last);
> > > > > > > > > +    return name;
> > > > > > > > > +}
> > > > > > > > 
> > > > > > > > Looks tricky, and duplicates logic for device naming.
> > > > > > > > All this won't be necessary if you just add this as child
> > > > > > > > of the correct device, without playing with scope.
> > > > > > > > Why not do it?
> > > > > > > since vmgenid PCI device is located somewhere on PCI bus we don't 
> > > > > > > have
> > > > > > > fixed PATH to it and we need full path to it to send Notivy from
> > > > > > > "\\_GPE" scope see "aml_notify(aml_name("\\_SB.%s", vgid_path)" 
> > > > > > > below.
> > > > > > 
> > > > > > I see. Still - can't this function return the full aml_name?
> > > > > it's possible but I'd prefer to return back to 2 ACPI devices as it 
> > > > > was
> > > > > in v13 since Windows sees 2 devices anyway, even if they merged into 
> > > > > one
> > > > > PCI device description (which probably wrong but windows handles it 
> > > > > because
> > > > > PCI Standard RAM controller is driver less) and get rid of
> > > > > acpi_get_pci_dev_scope_name() thing.
> > > > 
> > > > OK but I think it should be under PCI0 at least,
> > > > since that one claims the relevant resource in its CRS.
> > > vmgenid device doesn't claim any resource if we use PCI for its
> > > implementation since corresponding PCI device claims its BAR.
> > > But I don't see any problem in putting VGID device into PCI0 scope.
> > > 
> > > > 
> > > > > It will also help if vmgenid will be a part of multifunction device,
> > > > > which current build_append_pci_bus_devices() ignores for now (i.e. it
> > > > > describes only function 0 devices on slot).
> > > > > 
> > > > > [...]
> > > > 
> > > > OK, though we might need to add the description for the pci device 
> > > > anyway
> > > > e.g. in order to mark it hidden.
> > > Experiments show that Windows ignores _STA for PCI devices,
> > > it looks like it completely ignores SXX devices in ACPI for present at 
> > > boot
> > > devices except of _EJ().
> > > BTW: I've already tried it, it doesn't hide anything.
> > >  
> > > [...]
> > 
> > So it boils down to the fact that windows thinks it's RAM,
> It thinks it's PCI Standard RAM Controller not RAM itself.
> 
> > so it binds a generic driver to it, but then we get
> According to device manager no driver is bound to it and neither needed.
> 
> > lucky and it does not try to use it as RAM.
> Video cards also use a bunch of "PCI Standard RAM Controller"
> devices I guess to expose additional VRAM,
> That doesn't mean that BARs are to be used by OS as conventional RAM
> it's rather for usage by vendor's driver.
> Same goes for ivshmem which is also expose RAM bar and has the same CLASS ID,
> BAR's RAM is used only by means of ivshmem driver.
> 
> But yes we get lucky that Windows has stub device description.

OK. So if you are going to rely on this,
I think it's a good idea to get ack from David to confirm
this is solvable for pseries.


> > Is that a fair summary? If yes, to me, it looks exactly like the reverse
> > side of the pseries issue.
> > 
> > 
> > > > > > > > > +static void vmgenid_get_vmgid_addr(Object *obj, Visitor *v, 
> > > > > > > > > void
> > > > > > > > > *opaque,
> > > > > > > > > +                                   const char *name, Error 
> > > > > > > > > **errp)
> > > > > > > > > +{
> > > > > > > > > +    VmGenIdState *s = VMGENID(obj);
> > > > > > > > 
> > > > > > > > Why cast to VMGENID here?
> > > > > > > Yep, there is no need to do it, I'll clean it up.
> > > > > > > 
> > > > > > > > 
> > > > > > > > > +    int64_t value = pci_get_bar_addr(PCI_DEVICE(s), 0);
> > > > > > > > > +
> > > > > > > > > +    if (value == PCI_BAR_UNMAPPED) {
> > > > > > > > > +        error_setg(errp, "'%s." VMGENID_VMGID_ADDR "': not
> > > > > > > > > initialized",
> > > > > > > > > +                   object_get_typename(OBJECT(s)));
> > > > > > > > 
> > > > > > > > This is guest error. Pls don't print these to monitor by 
> > > > > > > > default.
> > > > > > > Then how test case querying this property via QOM could get to 
> > > > > > > know
> > > > > > > that property is in wrong state yet?
> > > > > > 
> > > > > > Maybe leave this around for tests (with a comment)
> > > > > > but use plain pci_get_bar_addr internally?
> > > > > Accessing it internally as property will also allow to
> > > > > prevent guest starting if BIOS failed to initialize BAR
> > > > > (not implemented but shouldn't be hard to do)
> > > > > 
> > > > > [...]
> > > > 
> > > > I don't think it's a good idea. It's just a device,
> > > > it's not the most important thing for guests,
> > > > it's a policy question whether to initialize it.
> > > I don't think it's just policy, BIOS and ACPI in QEMU are tightly coupled
> > > and if BIOS is unable initialize devices as it's supposed to
> > > then I'd rather abort machine
> > 
> > Right, less work for us :)
> > But I'm guessing *users* would rather have a debuggable guest.
> > 
> > > with error message pointing to source
> > > of problem then silently continue boot and allow guest OS or even
> > > some guest application to guess what went wrong if they will be able
> > > to do so.
> > 
> > That's still up to guest.  BIOS can abort boot if it wants to.
> > 
> > 
> > > In this case Windows will continue to work just without using VGID
> > > possibly leading to duplicate keys on to 2 different VMs
> > > or something else (it's used not only for crypto).
> > 
> > windows will detect that it does not have VGID.
> > whether it's worth crashing in that case is up to windows, not us.
> > 
> > > Alternatively lets map page directly in QEMU before PCI hole
> > > without any PCI BARs and pass reservation via E820,
> > >  - it would solve issue with selecting PCI CLASS ID, it would be just 
> > > plain QEMU device
> > >  - no consuming of slot and/or addrX.functionY
> > >  - we would know immediately if device is correctly initialized
> > >    even before BIOS runs. i.e. no guest involved and with
> > >    clear end result.
> > 
> > Been there, done that.
> > Each time we try to steal memory, we get pain.
> We are stealing it any way just by using more complex means.

Not really.  guest reserves memory and gives us the address.
With linker this is standard DMA that happens with a ton of devices.
With pci this is standard BAR mapping.

> Is there any technical reasons/concerns why direct stealing
> from QEMU is bad and why is it better to used PCI/linker?
> 
> I'd really know answer instead of getting "just because it's bad".
> Gal was also curios regarding this question.

Simply put reserving RAM by hardware is not something
that happens on real hardware.
Rather it's bios that reserves RAM.


> 
> > Guests should allocate memory.
> > Either via PCI, or linker like Gal's patches do.
> > 
> > 
> > > > > > > > > +    k->vendor_id = PCI_VENDOR_ID_REDHAT;
> > > > > > > > > +    k->device_id = PCI_DEVICE_ID_REDHAT_VMGENID;
> > > > > > > > > +    k->class_id = PCI_CLASS_MEMORY_RAM;
> > > > > > > > 
> > > > > > > > Still looks scary.
> > > > > > > Can do nothing about it,
> > > > > > > it's closest class id to what this device is 
> > > > > > > (i.e. it exposes page of RAM) that works with Windows
> > > > > > > without asking for drivers.
> > > > > > > If that class id is not acceptable then let's drop PCI
> > > > > > > approach altogether.
> > > > > > >
> > > > > > > More over it's limited to target-i386 only and possibly
> > > > > > > could apply to ARM in the future when Windows comes there,
> > > > > > > so in this  case I'm not very concerned about pseries guests
> > > > > > 
> > > > > > I don't think we should treat this as a windows only device,
> > > > > > the function seems generally useful.
> > > > > > 
> > > > > > > especially with buggy kernel as it was reported in
> > > > > > > http://lists.gnu.org/archive/html/qemu-devel/2012-03/msg04704.html
> > > > > > 
> > > > > > I think it's firmware that's confused, not the guest kernel.
> > > > > maybe both, should we care about faulty guest pieces when they
> > > > > don't use this device. If the pseries would need to use it then
> > > > > they should fix guest size instead of poking soldering iron
> > > > > in HW.
> > > > 
> > > > It's better if we can avoid the issue altogether.
> > > > Assuming that we can't,
> > > > I'd like some confirmation from David Gibson on this.
> > > > 
> > > > 
> > > > > > 
> > > > > > > 
> > > > > > > [...]
> > > > > > 
> > > > > > 
> > > > > > Some options to think about/try
> > > > > > 1. PCI_CLASS_MEMORY_OTHER (or some other class?)
> > > > > I've already tried this and a number of others,
> > > > > Windows asks for driver.
> > > > > 
> > > > > > 2. Name(_HID, "PNP0A06") (or some other id)
> > > > > experiment on Windows shows that _HID doesn't influence PCI devices 
> > > > > described
> > > > > in ACPI in any way. In this version _HID = QEMU0003 and its required
> > > > > by VMGID spec to have unique vendor specific HID for VMGID device.
> > > > > It looks like PCI driver mostly ignores PCI slots described in ACPI
> > > > > and as result there are devices in device manager "PCI standard RAM 
> > > > > Ctrl"
> > > > > and "VM Gen ID" despite the fact that it's one Device(SXXX) {} in ACPI
> > > > > tables.
> > > > 
> > > > I see. Interesting.
> > > > And VM Gen ID isn't using the resources of the pci
> > > > device?
> > > Nope, resources are claimed by respective PCI device regardless of its
> > > presence in ACPI tables. VGID device just exposes ADDR so that Windows 
> > > could
> > > poke into that buffer
> > > 
> > > > Any other ideas? Mark it hidden?
> > > Gal's already checked, Windows doesn't hide VGID device.
> > > I think we should just leave 2 devices as is (i.e. shown), no harm in it.
> > > (I'd hide PCI device but it seems to be impossible, and it's anyway just 
> > > cosmetic)
> > >  
> > > [...]
> > 
> > I agree, what worries me is the driver prompt if we set class to
> If we use PCI_CLASS_MEMORY_RAM it won't ask for driver, even XP.
> Lets wait for Davids answer you've asked above.
> 
> > something generic, and guest confusion if we set it to RAM.
> I don't see guest confusion because it's RAM controller not a RAM.
> I've checked Linux sources as well, it also doesn't use this
> class id in any way so BAR just sits there.
> 
> Perhaps PCI_CLASS_MEMORY_RAM is just bad naming, it really should be
> PCI_CLASS_GENERIC_RAM_CONTROLLER if we are to be close to spec.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]