qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v6 0/7] Add support for VM Generation ID


From: Ben Warren
Subject: Re: [Qemu-devel] [PATCH v6 0/7] Add support for VM Generation ID
Date: Wed, 15 Feb 2017 12:15:42 -0800

> On Feb 15, 2017, at 12:09 PM, Michael S. Tsirkin <address@hidden> wrote:
> 
> On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote:
>> On 02/15/17 07:15, address@hidden wrote:
>>> From: Ben Warren <address@hidden>
>>> 
>>> This patch set adds support for passing a GUID to Windows guests.  It
>>> is a re-implementation of previous patch sets written by Igor Mammedov
>>> et al, but this time passing the GUID data as a fw_cfg blob.
>>> 
>>> This patch set has dependencies on new guest functionality, in
>>> particular the support for a new linker-loader command and the ability
>>> to write back data to QEMU over a DMA link.  Work is in flight in both
>>> SeaBIOS and OVMF to support this.
>>> 
>>> v5->v6:
>>>    - Rebased to top of tree.
>>>    - Changed device from sysbus to a simple device.  This removed the need 
>>> for
>>>      adding dynamic sysbus support to pc_piix boards.
>>>    - Removed patch that introduced QWORD patching of AML.
>>>    - Removed ability to set GUID via QMP/HMP.
>>>    - Improved comments/documentation in code.
>> 
>> So here's my testing with a RHEL-7 guest:
>> 
>> (1) The command line option passed to QEMU is
>> 
>>  -device vmgenid,guid=00112233-4455-6677-8899-AABBCCDDEEFF
>> 
>> This is the example GUID provided in the SMBIOS spec v3.0.0 (DSP0134),
>> section 7.2.1 "System -- UUID". (SMBIOS is only relevant here because it
>> codifies the fact that Microsoft consumes UUID in little-endian order.)
>> The expected representation, according to the SMBIOS spec, is
>> 
>>  33 22 11 00 55 44 77 66 88 99 AA BB CC DD EE FF
>> 
>> (2) Here's an excerpt from the OVMF log:
>> 
>>> ProcessCmdAllocate: File="etc/vmgenid_guid" Alignment=0x1000 Zone=1 
>>> Size=0x1000 Address=0x7FE5C000
>> 
>> This is where "etc/vmgenid_guid" is allocated and downloaded, the
>> allocation address is 0x7FE5C000.
>> 
>>> Select Item: 0x19
>>> Select Item: 0x22
>>> ProcessCmdAllocate: File="etc/acpi/tables" Alignment=0x40 Zone=1 
>>> Size=0x20000 Address=0x7E7AB000
>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x49 Start=0x40 
>>> Length=0x1403
>>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" 
>>> PointeeFile="etc/acpi/tables" PointerOffset=0x1467 PointerSize=4
>>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" 
>>> PointeeFile="etc/acpi/tables" PointerOffset=0x146B PointerSize=4
>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x144C 
>>> Start=0x1443 Length=0x74
>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x14C0 
>>> Start=0x14B7 Length=0x80
>>> Select Item: 0x19
>>> SaveCondensedWritePointerToS3Context: 0x002B/[0x00000000+8] := 0x7FE5C000 
>>> (0)
>> 
>> This is where OVMF stashes the WRITE_POINTER command in "condensed"
>> form, for S3. The fw_cfg selector value is 0x2B (for the fw_cfg file to
>> be rewritten), the pointer is located at offset 0, has size 0, and the
>> value to assign is 0x7FE5C000. And, this is #0 of the saved / condensed
>> WRITE_POINTER commands.
>> 
>>> Select Item: 0x2B
>>> ProcessCmdWritePointer: PointerFile="etc/vmgenid_addr" 
>>> PointeeFile="etc/vmgenid_guid" PointerOffset=0x0 PointerSize=8
>> 
>> This is where the WRITE_POINTER command is actually executed, during
>> normal boot.
>> 
>>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" 
>>> PointeeFile="etc/vmgenid_guid" PointerOffset=0x1561 PointerSize=4
>> 
>> This is where we link "etc/vmgenid_guid" into VGIA.
>> 
>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x1540 
>>> Start=0x1537 Length=0xCA
>>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" 
>>> PointeeFile="etc/acpi/tables" PointerOffset=0x1625 PointerSize=4
>>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" 
>>> PointeeFile="etc/acpi/tables" PointerOffset=0x1629 PointerSize=4
>>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" 
>>> PointeeFile="etc/acpi/tables" PointerOffset=0x162D PointerSize=4
>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x160A 
>>> Start=0x1601 Length=0x30
>>> ProcessCmdAddPointer: PointerFile="etc/acpi/rsdp" 
>>> PointeeFile="etc/acpi/tables" PointerOffset=0x10 PointerSize=4
>>> ProcessCmdAddChecksum: File="etc/acpi/rsdp" ResultOffset=0x8 Start=0x0 
>>> Length=0x24
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" 
>>> at 0x7E7AB000 (remaining: 0x20000): found "FACS" size 0x40
>>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" 
>>> at 0x7E7AB040 (remaining: 0x1FFC0): found "DSDT" size 0x1403
>>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/vmgenid_guid" 
>>> at 0x7FE5C000 (remaining: 0x1000): not found; marking fw_cfg blob as opaque
>> 
>> This is where the OVMF SDT Header Probe Suppressor does its job. (NB,
>> the "opaque marking" has happened already in ProcessCmdWritePointer()
>> too, above.)
>> 
>>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" 
>>> at 0x7E7AC443 (remaining: 0x1EBBD): found "FACP" size 0x74
>>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" 
>>> at 0x7E7AC4B7 (remaining: 0x1EB49): found "APIC" size 0x80
>>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" 
>>> at 0x7E7AC537 (remaining: 0x1EAC9): found "SSDT" size 0xCA
>>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" 
>>> at 0x7E7AC601 (remaining: 0x1E9FF): found "RSDT" size 0x30
>>> TransferS3ContextToBootScript: boot script fragment saved, 
>>> ScratchBuffer=7FE4F018
>> 
>> This is where the WRITE_POINTER commands, stashed earlier in condensed
>> form, are translated to S3 Boot Script opcodes.
>> 
>>> InstallQemuFwCfgTables: installed 5 tables
>> 
>> Such as: FACS, DSDT, FACP, APIC, SSDT. OVMF recognizes RSDT and ignores
>> it (it's handled by edk2 automatically).
>> 
>>> InstallQemuFwCfgTables: freeing "etc/acpi/rsdp"
>>> InstallQemuFwCfgTables: freeing "etc/acpi/tables"
>> 
>> OVMF sees that the above two blobs have not been marked as "opaque" --
>> they only contained ACPI tables, judged from the ADD_POINTER commands
>> that pointed into them. So these two blobs are freed.
>> 
>> Note that "etc/vmgenid_guid" is not freed.
>> 
>> So, from the firmware log, everything looks OK.
>> 
>> (3) I dumped the SSDT in the RHEL-7 guest:
>> 
>>> /*
>>> * Intel ACPI Component Architecture
>>> * AML/ASL+ Disassembler version 20160527-64
>>> * Copyright (c) 2000 - 2016 Intel Corporation
>>> *
>>> * Disassembling to symbolic ASL+ operators
>>> *
>>> * Disassembly of ssdt.dat, Wed Feb 15 19:21:11 2017
>>> *
>>> * Original Table Header:
>>> *     Signature        "SSDT"
>>> *     Length           0x000000CA (202)
>>> *     Revision         0x01
>>> *     Checksum         0x1D
>>> *     OEM ID           "BOCHS "
>>> *     OEM Table ID     "VMGENID"
>>> *     OEM Revision     0x00000001 (1)
>>> *     Compiler ID      "BXPC"
>>> *     Compiler Version 0x00000001 (1)
>>> */
>>> DefinitionBlock ("", "SSDT", 1, "BOCHS ", "VMGENID", 0x00000001)
>>> {
>>>    Name (VGIA, 0x7FE5C000)
>> 
>> Note that the value matches the value logged by the firmware in (2).
>> 
>>>    Scope (\_SB)
>>>    {
>>>        Device (VGEN)
>>>        {
>>>            Name (_HID, "QEMUVGID")  // _HID: Hardware ID
>>>            Name (_CID, "VM_Gen_Counter")  // _CID: Compatible ID
>>>            Name (_DDN, "VM_Gen_Counter")  // _DDN: DOS Device Name
>>>            Method (_STA, 0, NotSerialized)  // _STA: Status
>>>            {
>>>                Local0 = 0x0F
>>>                If (VGIA == Zero)
>>>                {
>>>                    Local0 = Zero
>>>                }
>>> 
>>>                Return (Local0)
>>>            }
>>> 
>>>            Method (ADDR, 0, NotSerialized)
>>>            {
>>>                Local0 = Package (0x02) {}
>>>                Local0 [Zero] = (VGIA + 0x28)
>>>                Local0 [One] = Zero
>>>                Return (Local0)
>>>            }
>>>        }
>>>    }
>>> 
>>>    Method (\_GPE._E05, 0, NotSerialized)  // _Exx: Edge-Triggered GPE
>>>    {
>>>        Notify (\_SB.VGEN, 0x80) // Status Change
>>>    }
>>> }
>> 
>> Looks good and matches the documentation.
>> 
>> (4) To be sure, I checked the address against the guest dmesg, which
>> contains a dump of the UEFI memory map:
>> 
>>> [    0.000000] efi: mem52: type=10, attr=0xf, 
>>> range=[0x000000007fe5a000-0x000000007fe5e000) (0MB)
>> 
>> The page (4096 bytes) at 0x7FE5C000 falls into this range. Type=10 means
>> EfiACPIMemoryNVS.
>> 
>> (5) At this point I dumped the guest RAM with the dump-guest-memory
>> monitor command, opened it with "crash", and listed it:
>> 
>>> crash> rd -p -8 0x7FE5C000 0x40
>>>        7fe5c000:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   
>>> ................
>>>        7fe5c010:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   
>>> ................
>>>        7fe5c020:  00 00 00 00 00 00 00 00 33 22 11 00 55 44 77 66   
>>> ........3"..UDwf
>>>        7fe5c030:  88 99 aa bb cc dd ee ff 00 00 00 00 00 00 00 00   
>>> ................
>> 
>> We can see that the GUID starts at 0x7FE5C000 + 0x28, and also that the
>> byte-level representation matches the little endian one given in (1).
>> 
>> This proves that the initial blob download worked fine.
>> 
>> (6) Here I attached "gdb" to QEMU, set a breakpoint on
>> vmgenid_handle_reset(), allowed the inferior process to continue
>> execution.
>> 
>> Then I suspended and resumed the guest (ACPI S3). The breakpoint was hit
>> during resume:
>> 
>>> Breakpoint 1, vmgenid_handle_reset (opaque=0x7f2bd03c36e0) at 
>>> .../hw/acpi/vmgenid.c:205
>>> 205         VmGenIdState *vms = VMGENID(opaque);
>> 
>> First of all, before allowing QEMU to zero out the address blob, I
>> listed the address and the contents of the address blob (here exploiting
>> that my host is also little endian):
>> 
>>> (gdb) print (void*)vms->vmgenid_addr_le
>>> $2 = (void *) 0x7f2bd03c37b0
>> 
>>> (gdb) print /x *(uint64_t*)vms->vmgenid_addr_le
>>> $4 = 0x7fe5c000
>> 
>> This proves that QEMU has the right address, matching the firmware log
>> from (2), and the ACPI dump from (3).
>> 
>> (7) At this point I allowed the inferior to proceed a bit:
>> 
>>> (gdb) n
>>> 207         memset(vms->vmgenid_addr_le, 0, 
>>> ARRAY_SIZE(vms->vmgenid_addr_le));
>>> (gdb) n
>>> 208     }
>> 
>> I verified that the blob was zeroed:
>> 
>>> (gdb) print /x *(uint64_t*)vms->vmgenid_addr_le
>>> $5 = 0x0
>> 
>> then allowed the inferior to run free.
>> 
>>> (gdb) cont
>>> Continuing.
>> 
>> (8) New messages appeared in the firmware log:
>> 
>>> S3ResumeExecuteBootScript()
>>> PeiS3ResumeState - 7FF92B18
>>> transfer control to Standalone Boot Script Executor
>>> S3BootScriptExecute:
>>> TableHeader - 0x7E7A7000
>>> TableHeader.Version - 0x0001
>>> TableHeader.TableLength - 0x000000ED
>>> ExecuteBootScript - 7E7A700D
>>> EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE
>>> BootScriptExecuteMemoryWrite - 0x7FE4F018, 0x00000010, 0x00000000
>> 
>> Here the ACPI S3 Boot Script, prepared in
>> TransferS3ContextToBootScript() -- see (2) -- creates a DMA access
>> command for fw_cfg. The DMA access command is written to pre-reserved
>> memory (see "ScratchBuffer" above).
>> 
>>> S3BootScriptWidthUint8 - 0x7FE4F018 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F019 (0x2B)
>> 
>> The fw_cfg selector is 0x2B. (See under (2).)
>> 
>>> S3BootScriptWidthUint8 - 0x7FE4F01A (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01B (0x0C)
>> 
>> This is a combined select+skip operation.
>> 
>>> S3BootScriptWidthUint8 - 0x7FE4F01C (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01D (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01E (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01F (0x00)
>> 
>> The skip size is 0 bytes.
>> 
>>> S3BootScriptWidthUint8 - 0x7FE4F020 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F021 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F022 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F023 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F024 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F025 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F026 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F027 (0x00)
>> 
>> The address is irrelevant for skip, so it's just nuleld.
>> 
>>> ExecuteBootScript - 7E7A7030
>>> EFI_BOOT_SCRIPT_IO_WRITE_OPCODE
>>> BootScriptExecuteIoWrite - 0x00000514, 0x00000002, 0x00000002
>>> S3BootScriptWidthUint32 - 0x00000514 (0x00000000)
>>> S3BootScriptWidthUint32 - 0x00000518 (0x18F0E47F)
>> 
>> The Boot Script passes the DMA command to QEMU, by writing the address
>> of the command buffer to IO ports 0x514 and 0x518, in BE byte order.
>> 
>>> ExecuteBootScript - 7E7A704B
>>> EFI_BOOT_SCRIPT_MEM_POLL_OPCODE
>>> BootScriptExecuteMemPoll - 0x7FE4F018, 0x00000000FFFFFFFF, 
>>> 0x0000000000000000
>>> S3BootScriptWidthUint32 - 0x7FE4F018
>>> ExecuteBootScript - 7E7A7072
>> 
>> This waits until the DMA command succeeds (reading back the Control
>> field continuously until it reads as zero).
>> 
>>> EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE
>>> BootScriptExecuteMemoryWrite - 0x7FE4F018, 0x00000018, 0x00000000
>> 
>> This is another DMA access command for fw_cfg, prepared in the same
>> pre-reserved buffer. This time
>> 
>>> S3BootScriptWidthUint8 - 0x7FE4F018 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F019 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01A (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01B (0x10)
>> 
>> we request a write operation,
>> 
>>> S3BootScriptWidthUint8 - 0x7FE4F01C (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01D (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01E (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01F (0x08)
>> 
>> with a length of 8 bytes (big endian), matching the pointer size,
>> 
>>> S3BootScriptWidthUint8 - 0x7FE4F020 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F021 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F022 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F023 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F024 (0x7F)
>>> S3BootScriptWidthUint8 - 0x7FE4F025 (0xE4)
>>> S3BootScriptWidthUint8 - 0x7FE4F026 (0xF0)
>>> S3BootScriptWidthUint8 - 0x7FE4F027 (0x28)
>> 
>> the data to transfer is located at 0x7FE4F028 (just below, tacked to the
>> command buffer itself),
>> 
>>> S3BootScriptWidthUint8 - 0x7FE4F028 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F029 (0xC0)
>>> S3BootScriptWidthUint8 - 0x7FE4F02A (0xE5)
>>> S3BootScriptWidthUint8 - 0x7FE4F02B (0x7F)
>>> S3BootScriptWidthUint8 - 0x7FE4F02C (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F02D (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F02E (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F02F (0x00)
>> 
>> and the data to write is the original allocation address of the blob
>> (0x7fe5c000).
>> 
>>> ExecuteBootScript - 7E7A709D
>>> EFI_BOOT_SCRIPT_IO_WRITE_OPCODE
>>> BootScriptExecuteIoWrite - 0x00000514, 0x00000002, 0x00000002
>>> S3BootScriptWidthUint32 - 0x00000514 (0x00000000)
>>> S3BootScriptWidthUint32 - 0x00000518 (0x18F0E47F)
>>> ExecuteBootScript - 7E7A70B8
>>> EFI_BOOT_SCRIPT_MEM_POLL_OPCODE
>>> BootScriptExecuteMemPoll - 0x7FE4F018, 0x00000000FFFFFFFF, 
>>> 0x0000000000000000
>>> S3BootScriptWidthUint32 - 0x7FE4F018
>>> ExecuteBootScript - 7E7A70DF
>> 
>> Same story as above: fire off the transfer and wait until it completes.
>> 
>>> EFI_BOOT_SCRIPT_INFORMATION_OPCODE
>>> BootScriptExecuteInformation - 0x7E7A70E6
>>> BootScriptInformation: DE AD BE EF
>>> ExecuteBootScript - 7E7A70EA
>>> S3_BOOT_SCRIPT_LIB_TERMINATE_OPCODE
>>> S3BootScriptDone - Success
>>> [...]
>> 
>> The DEADBEEF informational (no-op) opcode is something that OVMF appends
>> to the very end for hysterical raisins.
>> 
>> (9) Okay, so the guest is now resumed and running, let's interrupt it in
>> gdb again, and check the contents of address blob again (we know the
>> address of the address blob from step (6)):
>> 
>>> ^C
>>> Program received signal SIGINT, Interrupt.
>>> 0x00007f2bbf1d1ebf in ppoll () from /lib64/libc.so.6
>>> (gdb) print /x *(uint64_t*)0x7f2bd03c37b0
>>> $6 = 0x7fe5c000
>> 
>> Et voila.
>> 
>> (10) I detached gdb from QEMU, and issued the following monitor command:
>> 
>>> $ virsh qemu-monitor-command ovmf.rhel7 --hmp 'info vm-generation-id'
>>> 00112233-4455-6677-8899-aabbccddeeff
>> 
>> (11) I also booted a Windows Server 2012 R2 guest (Q35, broadcast SMI
>> enabled) with a similar vmgenid device/parameter. According to Device
>> Manager | System devices, "Microsoft Hyper-V Generation Counter" is
>> working properly.
>> 
>> I also tested S3 briefly, it worked okay. (I mentioned the SMI broadcast
>> above because for that, OVMF generates an independent S3 Boot Script
>> fragment.)
>> 
>> 
>> I'll let someone else test live migration.
>> 
>> For patches #1, #3, #4 and #5:
>> 
>> Tested-by: Laszlo Ersek <address@hidden>
>> 
>> I'll soon post the OVMF patches.
>> 
>> Thanks!
>> Laszlo
> 
> 
> How do you feel about Igor's request to change WRITE_POINTER to add
> offset in there, so guest can pass in the address of GUID and
> not start of table? Would that be a lot of work to add?
> 
I know you’re asking Laszlo, but hopefully the answer is “OK”.  I’m finished 
making those changes to QEMU and am coding up SeaBIOS right now. 
> -- 
> MST

Attachment: smime.p7s
Description: S/MIME cryptographic signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]