qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] migration: check magic value for deciding the mapping of cha


From: manish.mishra
Subject: Re: [PATCH] migration: check magic value for deciding the mapping of channels
Date: Thu, 3 Nov 2022 14:50:25 +0530
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.4.1


On 01/11/22 9:15 pm, Daniel P. Berrangé wrote:
On Tue, Nov 01, 2022 at 09:10:14PM +0530, manish.mishra wrote:
On 01/11/22 8:21 pm, Daniel P. Berrangé wrote:
On Tue, Nov 01, 2022 at 02:30:29PM +0000, manish.mishra wrote:
Current logic assumes that channel connections on the destination side are
always established in the same order as the source and the first one will
always be the default channel followed by the multifid or post-copy
preemption channel. This may not be always true, as even if a channel has a
connection established on the source side it can be in the pending state on
the destination side and a newer connection can be established first.
Basically causing out of order mapping of channels on the destination side.
Currently, all channels except post-copy preempt send a magic number, this
patch uses that magic number to decide the type of channel. This logic is
applicable only for precopy(multifd) live migration, as mentioned, the
post-copy preempt channel does not send any magic number. Also, this patch
uses MSG_PEEK to check the magic number of channels so that current
data/control stream management remains un-effected.

Signed-off-by: manish.mishra <manish.mishra@nutanix.com>
---
   include/io/channel.h     | 25 +++++++++++++++++++++++++
   io/channel-socket.c      | 27 +++++++++++++++++++++++++++
   io/channel.c             | 39 +++++++++++++++++++++++++++++++++++++++
   migration/migration.c    | 33 +++++++++++++++++++++------------
   migration/multifd.c      | 12 ++++--------
   migration/multifd.h      |  2 +-
   migration/postcopy-ram.c |  5 +----
   migration/postcopy-ram.h |  2 +-
   8 files changed, 119 insertions(+), 26 deletions(-)

diff --git a/include/io/channel.h b/include/io/channel.h
index c680ee7480..74177aeeea 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -115,6 +115,10 @@ struct QIOChannelClass {
                           int **fds,
                           size_t *nfds,
                           Error **errp);
+    ssize_t (*io_read_peek)(QIOChannel *ioc,
+                            void *buf,
+                            size_t nbytes,
+                            Error **errp);
       int (*io_close)(QIOChannel *ioc,
                       Error **errp);
       GSource * (*io_create_watch)(QIOChannel *ioc,
@@ -475,6 +479,27 @@ int qio_channel_write_all(QIOChannel *ioc,
                             size_t buflen,
                             Error **errp);
+/**
+ * qio_channel_read_peek_all:
+ * @ioc: the channel object
+ * @buf: the memory region to read in data
+ * @nbytes: the number of bytes to read
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Read given @nbytes data from peek of channel into
+ * memory region @buf.
+ *
+ * The function will be blocked until read size is
+ * equal to requested size.
+ *
+ * Returns: 1 if all bytes were read, 0 if end-of-file
+ *          occurs without data, or -1 on error
+ */
+int qio_channel_read_peek_all(QIOChannel *ioc,
+                              void* buf,
+                              size_t nbytes,
+                              Error **errp);
+
   /**
    * qio_channel_set_blocking:
    * @ioc: the channel object
diff --git a/io/channel-socket.c b/io/channel-socket.c
index b76dca9cc1..b99f5dfda6 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -705,6 +705,32 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc,
   }
   #endif /* WIN32 */
+static ssize_t qio_channel_socket_read_peek(QIOChannel *ioc,
+                                            void *buf,
+                                            size_t nbytes,
+                                            Error **errp)
+{
+    QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
+    ssize_t bytes = 0;
+
+retry:
+    bytes = recv(sioc->fd, buf, nbytes, MSG_PEEK);
+
+    if (bytes < 0) {
+        if (errno == EINTR) {
+            goto retry;
+        }
+        if (errno == EAGAIN) {
+            return QIO_CHANNEL_ERR_BLOCK;
+        }
+
+        error_setg_errno(errp, errno,
+                         "Unable to read from peek of socket");
+        return -1;
+    }
+
+    return bytes;
+}
   #ifdef QEMU_MSG_ZEROCOPY
   static int qio_channel_socket_flush(QIOChannel *ioc,
@@ -902,6 +928,7 @@ static void qio_channel_socket_class_init(ObjectClass 
*klass,
       ioc_klass->io_writev = qio_channel_socket_writev;
       ioc_klass->io_readv = qio_channel_socket_readv;
+    ioc_klass->io_read_peek = qio_channel_socket_read_peek;
       ioc_klass->io_set_blocking = qio_channel_socket_set_blocking;
       ioc_klass->io_close = qio_channel_socket_close;
       ioc_klass->io_shutdown = qio_channel_socket_shutdown;
diff --git a/io/channel.c b/io/channel.c
index 0640941ac5..a2d9b96f3f 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -346,6 +346,45 @@ int qio_channel_write_all(QIOChannel *ioc,
       return qio_channel_writev_all(ioc, &iov, 1, errp);
   }
+int qio_channel_read_peek_all(QIOChannel *ioc,
+                              void* buf,
+                              size_t nbytes,
+                              Error **errp)
+{
+   QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
+   ssize_t bytes = 0;
+
+   if (!klass->io_read_peek) {
+       error_setg(errp, "Channel does not support read peek");
+       return -1;
+   }
There's no io_read_peek for  QIOChannelTLS, so we'll hit this
error...


diff --git a/migration/migration.c b/migration/migration.c
index 739bb683f3..f4b6f278a9 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -733,31 +733,40 @@ void migration_ioc_process_incoming(QIOChannel *ioc, 
Error **errp)
   {
       MigrationIncomingState *mis = migration_incoming_get_current();
       Error *local_err = NULL;
-    bool start_migration;
       QEMUFile *f;
+    bool default_channel = true;
+    uint32_t channel_magic = 0;
+    int ret = 0;
-    if (!mis->from_src_file) {
-        /* The first connection (multifd may have multiple) */
+    if (migrate_use_multifd() && !migration_in_postcopy()) {
+        ret = qio_channel_read_peek_all(ioc, (void *)&channel_magic,
+                                        sizeof(channel_magic), &local_err);
+
+        if (ret != 1) {
+            error_propagate(errp, local_err);
+            return;
+        }
....and thus this will fail for TLS channels AFAICT.
Yes, thanks for quick review Daniel. You pointed this earlier too, sorry missed 
it, will put another check !migrate_use_tls() in V2.
But we need this problem fixed with TLS too, so just excluding it
isn't right. IMHO we need to modify the migration code so we can
read the magic earlier, instead of peeking.


With regards,
Daniel

Hi Daniel, I was trying tls migrations. What i see is that tls session creation 
does handshake. So if we read ahead in ioc_process_incoming for default 
channel. Because client sends magic only after multiFD channels are setup, 
which too requires tls handshake. So basically on destination side we can not 
recieve default channel magic until multiFD tls handshake is done, so it kind 
of creates a dead lock. I can not think of anyway to re-arrange it. Also if in 
tls we do this handshake is this bug even possible for tls case? If not will 
just !migrate_use_tls() check be fine. I do not have much understanding of tls, 
if you think this bug is possible even with tls handshakes, I will check for 
another way.

Thanks

Manish Mishra




reply via email to

[Prev in Thread] Current Thread [Next in Thread]