libvfio-user-devel

RE: support live migration in NVMe/vfio-user


From: Thanos Makatos
Subject: RE: support live migration in NVMe/vfio-user
Date: Mon, 11 Jan 2021 10:08:41 +0000

I'm not concerned with exactly which SPDK release this will land in; for now let's 
focus on the technical details. Have you had a look at the libvfio-user migration 
API?

-----Original Message-----
From: Liu, Changpeng <changpeng.liu@intel.com> 
Sent: 11 January 2021 07:00
To: Thanos Makatos <thanos.makatos@nutanix.com>
Cc: libvfio-user-devel@nongnu.org; John Levon <john.levon@nutanix.com>; Swapnil 
Ingle <swapnil.ingle@nutanix.com>; john.g.johnson@oracle.com; Walker, Benjamin 
<benjamin.walker@intel.com>
Subject: RE: support live migration in NVMe/vfio-user

Hi Thanos,

I can help with this part. I had an internal project that used kernel VFIO + a PF 
NVMe device to do the migration, so the data structures should be quite similar.

Do you have a plan for it? I'm thinking that for the coming SPDK 21.01 release the 
migration work can't make the code freeze, so the migration feature may have to 
go into the next release.
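
(For context, the v1 kernel VFIO migration interface that such a project would have 
used exposes a migration region whose header looks roughly like the sketch below, 
per the linux/vfio.h UAPI of that era; libvfio-user models its migration region on 
the same layout, which is presumably why the data structures end up similar. This 
is a sketch from memory, so check the UAPI header for the authoritative definition.)

    /* Sketch of the v1 VFIO migration region header; see linux/vfio.h
     * (requires <linux/types.h> for __u32/__u64). */
    struct vfio_device_migration_info {
        __u32 device_state;
    #define VFIO_DEVICE_STATE_STOP     (0)
    #define VFIO_DEVICE_STATE_RUNNING  (1 << 0)
    #define VFIO_DEVICE_STATE_SAVING   (1 << 1)
    #define VFIO_DEVICE_STATE_RESUMING (1 << 2)
        __u32 reserved;
        __u64 pending_bytes;  /* how much device state is left to read */
        __u64 data_offset;    /* where the data section starts in the region */
        __u64 data_size;      /* how much data is currently available */
    };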


> -----Original Message-----
> From: Thanos Makatos <thanos.makatos@nutanix.com>
> Sent: Friday, January 8, 2021 7:39 PM
> To: Liu, Changpeng <changpeng.liu@intel.com>
> Cc: libvfio-user-devel@nongnu.org; John Levon <john.levon@nutanix.com>;
> Swapnil Ingle <swapnil.ingle@nutanix.com>; john.g.johnson@oracle.com;
> Walker, Benjamin <benjamin.walker@intel.com>
> Subject: support live migration in NVMe/vfio-user
> 
> Changpeng, I've started looking at how to implement live migration for
> NVMe/vfio-user. Live migration in libvfio-user is implemented using the VFIO
> live migration protocol, but this complexity is largely hidden from the user;
> we present a slightly simpler API. The server needs to implement a set of
> callbacks that handle migration device state (running, pre-copy, stop-and-copy,
> resuming) and another set of callbacks that allow the client to read/write
> device state (as it's the client that is responsible for forwarding device
> state to the destination). In
> https://github.com/nutanix/libvfio-user/blob/master/samples/client.c and
> https://github.com/nutanix/libvfio-user/blob/master/samples/server.c
> I've implemented a simple sample of how migration works in libvfio-user (it
> doesn't use a pre-copy scenario, only a stop-and-copy one).
> 
> Since VFIO live migration is not yet supported in mpqemu, I've been looking at
> implementing a utility to drive the migration (based on
> examples/nvme/identify/identify.c); I've put this code here:
> https://github.com/tmakatos/spdk/tree/migr (it contains lots of hacks, as
> usual ;)). I got as far as the client setting the device state to stop-and-copy
> (the server asserts in the device state change callback). We need to start
> thinking about what state we need to send to the server, how/when we can
> quiesce queues, how the new server resumes from the stored state, etc. We
> should also look at the NVMe spec to see whether there are mechanisms that
> could allow us to simplify the implementation.
> 
> Is there some specific state SPDK stores for a controller? I guess we could
> start with implementing the stop-and-copy phase.
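
For illustration, the server-side hooks described in the quoted message boil down 
to something like the sketch below. The callback names and signatures here are 
approximations, so check libvfio-user.h and samples/server.c for the real API 
before relying on them.

    /* Sketch of the server-side migration hooks; names/signatures approximate. */
    #include <stdint.h>
    #include <sys/types.h>
    #include "libvfio-user.h"

    static int
    migr_transition(vfu_ctx_t *vfu_ctx, vfu_migr_state_t state)
    {
        /* Quiesce or resume the controller depending on the requested state:
         * RUNNING, PRE_COPY, STOP_AND_COPY, RESUME. This is where the sample
         * NVMe server currently asserts. */
        return 0;
    }

    static ssize_t
    migr_read_data(vfu_ctx_t *vfu_ctx, void *buf, uint64_t count)
    {
        /* Saving side: copy up to 'count' bytes of serialized device state
         * into 'buf'; the client forwards it to the destination. */
        return 0;
    }

    static ssize_t
    migr_write_data(vfu_ctx_t *vfu_ctx, void *buf, uint64_t count)
    {
        /* Resuming side: accept serialized device state from the client and
         * restore the controller from it. */
        return (ssize_t)count;
    }

    /* These get wired up through the vfu_setup_device_migration*() call
     * declared in libvfio-user.h; samples/server.c shows the exact usage. */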



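As a straw man for the "what state do we need to send" question, a stop-and-copy 
blob for an NVMe controller might contain roughly the fields below. This is purely 
an illustration derived from the NVMe spec, not a structure SPDK defines.

    /* Hypothetical serialized controller state for stop-and-copy; the field
     * list is illustrative only (taken from the NVMe spec, not from SPDK). */
    #include <stdint.h>

    struct nvme_migr_state {
        /* Controller registers (NVMe spec, section 3.1). */
        uint32_t cc;            /* Controller Configuration */
        uint32_t csts;          /* Controller Status */
        uint32_t aqa;           /* Admin Queue Attributes */
        uint64_t asq;           /* Admin SQ base (guest-physical) */
        uint64_t acq;           /* Admin CQ base (guest-physical) */

        /* Per queue pair metadata; the queues themselves live in guest RAM,
         * which the client migrates, so only pointers/indices travel here. */
        uint16_t num_qpairs;
        struct {
            uint16_t qid;
            uint64_t sq_addr, cq_addr;   /* guest-physical queue addresses */
            uint16_t sq_size, cq_size;
            uint16_t sq_head, sq_tail;   /* submission queue indices */
            uint16_t cq_head;            /* completion queue index */
            uint8_t  cq_phase;           /* current phase tag */
        } qp[64];                        /* arbitrary cap for the sketch */

        /* Plus outstanding async event requests, feature settings, etc. */
    };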