This brings to mind how iterative migration will work. The interface for
iterative migration is basically the same as non-iterative migration
plus a method to query the number of bytes remaining. When the number of
remaining bytes falls below a threshold, the vCPUs are stopped and the
rest of the data is read.
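
To make the flow concrete, here is a rough sketch of the loop I have in mind. Everything in it is hypothetical (the `IterativeSource` trait, `ToyDevice`, `migrate`, the 4096-byte buffer); it only illustrates "read while running until the remaining bytes drop below a threshold, then stop and drain", and the toy device never produces new dirty state, which a real one would.

```rust
/// Hypothetical view of a device that supports iterative migration:
/// the non-iterative interface plus a "bytes remaining" query.
trait IterativeSource {
    /// Bytes of state still waiting to be transferred.
    fn bytes_remaining(&self) -> usize;
    /// Read up to `buf.len()` bytes of state; returns the number read.
    fn read_state(&mut self, buf: &mut [u8]) -> usize;
}

/// Toy device with a fixed amount of state (assumption: nothing is
/// re-dirtied while the guest runs).
struct ToyDevice {
    remaining: usize,
}

impl IterativeSource for ToyDevice {
    fn bytes_remaining(&self) -> usize {
        self.remaining
    }

    fn read_state(&mut self, buf: &mut [u8]) -> usize {
        let n = buf.len().min(self.remaining);
        buf[..n].fill(0); // placeholder payload
        self.remaining -= n;
        n
    }
}

/// Returns (bytes read while running, bytes read after stopping).
fn migrate<D: IterativeSource>(
    dev: &mut D,
    threshold: usize,
    stop_vcpus: impl FnOnce(),
) -> (usize, usize) {
    let mut buf = [0u8; 4096];

    // Iterative phase: the guest keeps running while we copy.
    let mut iterative = 0;
    while dev.bytes_remaining() >= threshold {
        let n = dev.read_state(&mut buf);
        if n == 0 {
            break;
        }
        iterative += n;
    }

    // Below the threshold: stop the vCPUs and read the remainder.
    stop_vcpus();
    let mut stopped = 0;
    loop {
        let n = dev.read_state(&mut buf);
        if n == 0 {
            break;
        }
        stopped += n;
    }
    (iterative, stopped)
}
```

With a real device the threshold would presumably be derived from the downtime budget rather than hard-coded.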

Some details from VFIO migration:
- The VMM must explicitly change the state when transitioning from
iterative to non-iterative migration, but the data transfer fd
remains the same.
- The state of the device (running, stopped, resuming, etc.) doesn't
change asynchronously; it is always driven by the VMM. However, setting
the state can fail, and then the new state may be an error state.
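
A small sketch of the second point, not the real VFIO uAPI: the state changes only when the VMM asks for it, and a failed request can leave the device in an error state. `DeviceState`, `Device` and the `apply` callback are made-up names, and the sketch simplifies by always landing in `Error` on failure, whereas the text above only says the new state may be an error state.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DeviceState {
    Running,
    Stopped,
    Resuming,
    Error,
}

struct Device {
    state: DeviceState,
}

impl Device {
    fn new() -> Self {
        Device { state: DeviceState::Running }
    }

    fn state(&self) -> DeviceState {
        self.state
    }

    /// The only place the state changes, so every transition is driven by
    /// the caller (the VMM). `apply` is a hypothetical stand-in for the
    /// device actually performing the transition.
    fn set_state(
        &mut self,
        requested: DeviceState,
        apply: impl FnOnce() -> Result<(), ()>,
    ) -> Result<(), DeviceState> {
        match apply() {
            Ok(()) => {
                self.state = requested;
                Ok(())
            }
            Err(()) => {
                // Simplification: a failed transition always ends in Error.
                self.state = DeviceState::Error;
                Err(self.state)
            }
        }
    }
}
```

The caller therefore has to check the result of every transition and read back the state instead of assuming the request succeeded.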

Mapping this to SET_DEVICE_STATE_FD:
- VhostDeviceStatePhase is extended with
VHOST_TRANSFER_STATE_PHASE_RUNNING = 1 for iterative migration. When
entering non-iterative migration, the frontend sends
SET_DEVICE_STATE_FD again with VHOST_TRANSFER_STATE_PHASE_STOPPED,
passing the backend the iterative fd from the previous
SET_DEVICE_STATE_FD call. The backend may reply with another fd if
necessary. If the backend changes the fd, the contents of the
previous fd must be fully read and transferred before the contents of
the new fd are migrated. (Maybe this is too complex and we should
forbid changing the fd when going from RUNNING -> STOPPED.)
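
To illustrate the ordering rule for the RUNNING -> STOPPED switch, here is a minimal sketch. The `Fd` alias and `fds_to_drain` helper are hypothetical, and `Stopped = 0` is my assumption about the existing phase value; only `RUNNING = 1` comes from the proposal above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u32)]
enum VhostDeviceStatePhase {
    Stopped = 0, // assumed existing value
    Running = 1, // proposed: iterative migration
}

type Fd = i32;

/// Order in which the frontend must drain fds after re-sending
/// SET_DEVICE_STATE_FD with the STOPPED phase. `backend_reply` is the
/// fd (if any) that the backend returned in its reply.
fn fds_to_drain(iterative_fd: Fd, backend_reply: Option<Fd>) -> Vec<Fd> {
    match backend_reply {
        // Backend switched fds: finish the old one first, then the new one.
        Some(new_fd) if new_fd != iterative_fd => vec![iterative_fd, new_fd],
        // No reply fd (or the same one): keep reading the iterative fd.
        _ => vec![iterative_fd],
    }
}
```

If changing the fd were forbidden for this transition, the first match arm would turn into a protocol error and the frontend logic would collapse to the single-fd case.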
- CHECK_DEVICE_STATE can be extended to report the number of bytes
remaining. The semantics change so that CHECK_DEVICE_STATE can be
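called while the VMM is still reading from the fd. Its result becomes:

    enum CheckDeviceStateResult {
        // Transfer still in progress; bytes left to read from the fd.
        Saving { bytes_remaining: usize },
        // The device failed to save its state.
        Failed { error_code: u64 },
    }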
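
As a sketch of how a frontend might consume this result during the iterative phase (the enum is restated here in compilable Rust; `NextStep`, `next_step` and the threshold parameter are hypothetical):

```rust
#[derive(Debug, PartialEq)]
enum CheckDeviceStateResult {
    Saving { bytes_remaining: usize },
    Failed { error_code: u64 },
}

/// What the VMM does after polling CHECK_DEVICE_STATE.
#[derive(Debug, PartialEq)]
enum NextStep {
    KeepIterating,
    StopAndDrain,
    Abort(u64),
}

fn next_step(result: &CheckDeviceStateResult, threshold: usize) -> NextStep {
    match *result {
        // Little enough left: stop the vCPUs and read the remainder.
        CheckDeviceStateResult::Saving { bytes_remaining } if bytes_remaining < threshold => {
            NextStep::StopAndDrain
        }
        // Still too much pending: keep copying while the guest runs.
        CheckDeviceStateResult::Saving { .. } => NextStep::KeepIterating,
        // The backend reported a failure: give up on this migration.
        CheckDeviceStateResult::Failed { error_code } => NextStep::Abort(error_code),
    }
}
```

Because the check can now be issued while the fd is still being read, the VMM can poll it between reads without disturbing the transfer.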