qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v8 4/8] ACPI: Add Virtual Machine Generation ID


From: Laszlo Ersek
Subject: Re: [Qemu-devel] [PATCH v8 4/8] ACPI: Add Virtual Machine Generation ID support
Date: Tue, 21 Feb 2017 10:58:05 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.1

On 02/21/17 02:43, Michael S. Tsirkin wrote:
> On Mon, Feb 20, 2017 at 09:55:40PM +0100, Laszlo Ersek wrote:
>> On 02/20/17 21:45, Eric Blake wrote:
>>> On 02/20/2017 02:19 PM, Dr. David Alan Gilbert wrote:
>>>> * Eric Blake (address@hidden) wrote:
>>>>> On 02/20/2017 04:23 AM, Dr. David Alan Gilbert wrote:
>>>>>> * Laszlo Ersek (address@hidden) wrote:
>>>>>>> CC Dave
>>>>>>
>>>>>> This isn't an area I really understand; but if I'm
>>>>>> reading this right then 
>>>>>>    vmgenid is stored in fw_cfg?
>>>>>>    fw_cfg isn't migrated
>>>>>>
>>>>>> So why should any changes to it get migrated, except if it's already
>>>>>> been read by the guest (and if the guest reads it again aftwards what's
>>>>>> it expected to read?)
>>>>>
>>>>> Why are we expecting it to change on migration?  You want a new value
>>>>
>>>> I'm not; I was asking why a change made prior to migration would be
>>>> preserved across migration.
>>>
>>> Okay, so you're asking what happens if the source requests the vmgenid
>>> device, and sets an id, but the destination of the migration does not
>>> request anything
>>
>> This should never happen, as it means different QEMU command lines on
>> source vs. target hosts. (Different as in "incorrectly different".)
>>
>> Dave writes, "a change made prior to migration". Change made to what?
>>
>> - the GUID cannot be changed via the monitor once QEMU has been started.
>> We dropped the monitor command for that, due to lack of a good use case,
>> and due to lifecycle complexities. We have figured out a way to make it
>> safe, but until there's a really convincing use case, we shouldn't add
>> that complexity.
> 
> True but we might in the future, and it seems prudent to make
> migration stream future-proof for that.

It is already.

The monitor command, if we add it, can be implemented incrementally. I
described it as "approach (iii)" elsewhere in the thread. This is a more
detailed recap:

- introduce a new device property (internal only), such as
  "x-enable-set-vmgenid". Make it reflect whether a given machine type
  supports the monitor command.

- change the /etc/vmgenid_guid fw_cfg blob from callback-less to one
  with a selection callback

- add a new boolean latch to the vmgenid device, called
  "guid_blob_selected" or something similar

- the reset handler sets the latch to FALSE
  (NB: the reset handler already sets /etc/vmgenid_addr to zero)

- the select callback for /etc/vmgenid_guid sets the latch to TRUE

- the latch is added to the migration stream as a subsection *if*
  x-enable-set-vmgenid is TRUE

- the set-vmgenid monitor command checks all three of:
  x-enable-set-vmgenid, the latch, and the contents of
  /etc/vmgenid_addr:

  - if x-enable-set-vmgenid is FALSE, the monitor command returns
    QERR_UNSUPPORTED (this is a generic error class, with an
    "unsupported" error message). Otherwise,

  - if the latch is TRUE *and* /etc/vmgenid_addr is zero, then the
    guest firmware has executed (or started executing) ALLOCATE for
    /etc/vmgenid_guid, but it has not executed WRITE_POINTER yet.
    In this case updating the VMGENID from the monitor is unsafe
    (we cannot guarantee informing the guest successfully), so in this
    case the monitor command fails with ERROR_CLASS_DEVICE_NOT_ACTIVE.
    The caller should simply try a bit later. (By which time the
    firmware will likely have programmed /etc/vmgenid_addr.)

    Libvirt can recognize this error specifically, because it is not the
    generic error class. ERROR_CLASS_DEVICE_NOT_ACTIVE stands for
    "EAGAIN", practically, in this case.

  - Otherwise -- meaning latch is FALSE *or* /etc/vmgenid_addr is
    nonzero, that is, the guest has either not run ALLOCATE since
    reset, *or* it has, but it has also run WRITE_POINTER):

    - refresh the GUID within the fw_cfg blob for /etc/vmgenid_guid
      in-place -- the guest will see this whenever it runs ALLOCATE for
      /etc/vmgenid_guid, *AND*

    - if /etc/vmgenid_addr is not zero, then update the guest (that is,
      RAM write + SCI)

Thanks
Laszlo

> 
>> - the address of the GUID is changed (the firmware programs it from
>> "zero" to an actual address, in a writeable fw_cfg file), and that piece
>> of info is explicitly migrated, as part of the vmgenid device's vmsd.
>>
>> Thanks
>> Laszlo
>>
>>
>>> - how does the guest on the destination see the same id
>>> as was in place on the source at the time migration started.
>>>
>>>>
>>>>
>>>>> when you load state from disk (you don't know how many times the same
>>>>> state has been loaded previously, so each load is effectively forking
>>>>> the VM and you want a different value), but for a single live migration,
>>>>> you aren't forking the VM and don't need a new generation ID.
>>>>>
>>>>> I guess it all boils down to what command line you're using: if libvirt
>>>>> is driving a live migration, it will request the same UUID in the
>>>>> command line of the destination as what is on the source; while if
>>>>> libvirt is loading from a [managed]save to restore state from a file, it
>>>>> will either request a new UUID directly or request auto to let qemu
>>>>> generate the new id.
>>>>
>>>> Hmm now I've lost it a bit; I thought we would preserve the value
>>>> transmitted from the source, not the value on the command line of the 
>>>> destination.
>>>
>>> I guess I'm trying to figure out whether libvirt MUST read the current
>>> id and explicitly tell the destination of migration to reuse that id, or
>>> if libvirt can omit the id on migration and everything just works
>>> because the id was migrated from the source.
>>>




reply via email to

[Prev in Thread] Current Thread [Next in Thread]