qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v11 08/11] vfio/migration: Implement VFIO migration protocol


From: Avihai Horon
Subject: Re: [PATCH v11 08/11] vfio/migration: Implement VFIO migration protocol v2
Date: Thu, 16 Feb 2023 18:40:58 +0200
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.6.1


On 16/02/2023 17:43, Juan Quintela wrote:
External email: Use caution opening links or attachments


Avihai Horon <avihaih@nvidia.com> wrote:

Reviewed-by: Juan Quintela <quintela@redhat.com>.


1st comment is a bug, but as you just remove it.


Not that it matters a lot in the end (you are removing v1 on the
following patch).

@@ -438,7 +445,13 @@ static bool 
vfio_devices_all_running_and_mig_active(VFIOContainer *container)
                  return false;
              }

-            if (migration->device_state_v1 & VFIO_DEVICE_STATE_V1_RUNNING) {
+            if (!migration->v2 &&
+                migration->device_state_v1 & VFIO_DEVICE_STATE_V1_RUNNING) {
+                continue;
+            }
Are you missing here:


                } else {
                    return false;
                }

Or I am missunderstanding the code.

I think the code is OK:

If the device uses v1 migration and is running, the first if is true and we continue. If the device uses v1 migration and is not running, the first if is false and the second if is false (since the device doesn't use v2 migration) and we return false.

If the device uses v2 migration and is running, the first if is false and the second if is true, so we continue. If the device uses v2 migration and is not running, first if is false and the second if is false, so we return false.


+            if (migration->v2 &&
+                migration->device_state == VFIO_DEVICE_STATE_RUNNING) {
                  continue;
              } else {
                  return false;

[...]



+/*
+ * This is an arbitrary size based on migration of mlx5 devices, where 
typically
+ * total device migration size is on the order of 100s of MB. Testing with
+ * larger values, e.g. 128MB and 1GB, did not show a performance improvement.
+ */
+#define VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE (1 * MiB)
Just in case that it makes a difference.  You are using here a 1MB
buffer.
But qemu_file_get_to_fd() uses a 32KB buffer.

Yes, I'm aware of that.
Down the road we will address the performance subject more thoroughly.
In the meantime, it seems like a reasonable size according to the tests we did.



      }
  }

+static int vfio_query_stop_copy_size(VFIODevice *vbasedev,
+                                     uint64_t *stop_copy_size)
+{
+    uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature) +
+                              sizeof(struct vfio_device_feature_mig_data_size),
+                              sizeof(uint64_t))] = {};
+    struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
suggestion:

        size_t size = DIV_ROUND_UP(sizeof(struct vfio_device_feature) +
                                  sizeof(struct 
vfio_device_feature_mig_data_size),
                                  sizeof(uint64_t));
        g_autofree struct vfio_device_feature *feature = g_malloc0(size * 
sizeof(uint64_t));

I have to reread several times to see what was going on there.


And call it a day?

I don't have a strong opinion here.
We can do whatever you/Alex like more.

Thanks for reviewing!




reply via email to

[Prev in Thread] Current Thread [Next in Thread]