qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 3/4] io/channel-socket: implement non-blocking connect


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [PATCH 3/4] io/channel-socket: implement non-blocking connect
Date: Wed, 22 Jul 2020 18:56:20 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0

22.07.2020 18:43, Daniel P. Berrangé wrote:
On Wed, Jul 22, 2020 at 06:40:10PM +0300, Vladimir Sementsov-Ogievskiy wrote:
22.07.2020 18:21, Daniel P. Berrangé wrote:
On Wed, Jul 22, 2020 at 06:04:53PM +0300, Vladimir Sementsov-Ogievskiy wrote:
22.07.2020 16:47, Vladimir Sementsov-Ogievskiy wrote:
22.07.2020 15:53, Daniel P. Berrangé wrote:
On Wed, Jul 22, 2020 at 03:43:54PM +0300, Vladimir Sementsov-Ogievskiy wrote:
22.07.2020 14:21, Daniel P. Berrangé wrote:
On Wed, Jul 22, 2020 at 02:00:25PM +0300, Vladimir Sementsov-Ogievskiy wrote:
20.07.2020 21:29, Daniel P. Berrangé wrote:
On Mon, Jul 20, 2020 at 09:07:14PM +0300, Vladimir Sementsov-Ogievskiy wrote:
Utilize new socket API to make a non-blocking connect for inet sockets.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
      include/io/channel-socket.h | 14 +++++++
      io/channel-socket.c         | 74 +++++++++++++++++++++++++++++++++++++
      2 files changed, 88 insertions(+)

diff --git a/include/io/channel-socket.h b/include/io/channel-socket.h
index 777ff5954e..82e868bc02 100644
--- a/include/io/channel-socket.h
+++ b/include/io/channel-socket.h
@@ -94,6 +94,20 @@ int qio_channel_socket_connect_sync(QIOChannelSocket *ioc,
                                          SocketAddress *addr,
                                          Error **errp);
+/**
+ * qio_channel_socket_connect_non_blocking_sync:
+ * @ioc: the socket channel object
+ * @addr: the address to connect to
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Attempt to connect to the address @addr using non-blocking mode of
+ * the socket. Function is synchronous, but being called from
+ * coroutine context will yield during connect operation.
+ */
+int qio_channel_socket_connect_non_blocking_sync(QIOChannelSocket *ioc,
+                                                 SocketAddress *addr,
+                                                 Error **errp);
+
      /**
       * qio_channel_socket_connect_async:
       * @ioc: the socket channel object
diff --git a/io/channel-socket.c b/io/channel-socket.c
index e1b4667087..076de7578a 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -22,6 +22,7 @@
      #include "qapi/error.h"
      #include "qapi/qapi-visit-sockets.h"
      #include "qemu/module.h"
+#include "qemu/sockets.h"
      #include "io/channel-socket.h"
      #include "io/channel-watch.h"
      #include "trace.h"
@@ -29,6 +30,8 @@
      #define SOCKET_MAX_FDS 16
+static int qio_channel_socket_close(QIOChannel *ioc, Error **errp);
+
      SocketAddress *
      qio_channel_socket_get_local_address(QIOChannelSocket *ioc,
                                           Error **errp)
@@ -157,6 +160,77 @@ int qio_channel_socket_connect_sync(QIOChannelSocket *ioc,
          return 0;
      }
+static int qio_channel_inet_connect_non_blocking_sync(QIOChannelSocket *ioc,
+        InetSocketAddress *addr, Error **errp)
+{
+    Error *local_err = NULL;
+    struct addrinfo *infos, *info;
+    int sock = -1;
+
+    infos = inet_parse_connect_saddr(addr, errp);
+    if (!infos) {
+        return -1;
+    }

This call is blocking since it calls getaddrinfo whose design
offers no ability todo non-blocking DNS lookups. Given this
call, ...

Oh, that's bad, thanks for taking a look on that early stage!


+
+    for (info = infos; info != NULL; info = info->ai_next) {
+        bool in_progress;
+
+        error_free(local_err);
+        local_err = NULL;
+
+        sock = inet_connect_addr(addr, info, false, &in_progress, &local_err);
+        if (sock < 0) {
+            continue;
+        }
+
+        if (qio_channel_socket_set_fd(ioc, sock, &local_err) < 0) {
+            close(sock);
+            continue;
+        }
+
+        if (in_progress) {
+            if (qemu_in_coroutine()) {
+                qio_channel_yield(QIO_CHANNEL(ioc), G_IO_OUT);
+            } else {
+                qio_channel_wait(QIO_CHANNEL(ioc), G_IO_OUT);
+            }

...this is offering false assurances of being non-blocking.

If we don't want the current thread to be blocked then we
need to be using the existing qio_channel_socket_connect_async
method or similar. It uses a throw away background thread to
run the connection attempt, and then reports completion back
later, thus avoiding the getaddrinfo design flaw for the callers.

I explicitly didn't want to add an method like the impl in this
patch, because getaddrinfo dooms it and we already had bugs in
the pre-QIOChannel code where QEMU thought it was non-blocking
but wasn't due to getaddrinfo lookups.


IIUC, the main appeal of this method is that the non-blocking
nature is hidden from the caller who can continue to treat it
as a synchronous call and have the coroutine magic happen in
behind the scenes.

IOW, What's needed is a simple way to run the operation in a
thread, and sleep for completion while having the coroutine
yield.

I think this could likely be achieved with QIOTask with an
alternate impl of the qio_task_wait_thread() method that is
friendly to coroutines instead of being based on pthread
condition variable waits.

The most simple thing is just run qio_channel_socket_connect_sync in
a thread with help of thread_pool_submit_co() which is coroutine-friendly.
And this don't need any changes in io/channel.

Actually, I've started with such design, but decided that better use
non-blocking connect to not deal with cancelling the connecting thread
on shutdown.

I think, I'll resend based on thread_pool_submit_co().

===

Hmm, there is async getaddrinfo_a function.. What do you think of it?

It isn't portable, glibc only.

But seems simpler to use a thread than move to async interfaces everywhere.



Hmm.. Still, on shutdown, how to cancel this connect and getaddrinfo ? I'm not 
sure
how much time may getaddrinfo take, but connect can take about a minute. It's 
not really
good to wait for it on shutdown.

The intention was that if you don't want to carry on waiting for the
async operation to complete you just give and pretend it no longer
exists. Eventually it will fail or complete and the thread will exit.
The only important thing there is making sure that the callback you
are passing to the _async() method can cope with the cleanup when the
work eventually completes, even if you've given up.


At least it's not possible with thread_pool_submit_co as I wanted, because 
underlying
thread pool waits for all its threads to complete on exit.




I'm trying to use qio_channel_socket_connect_async().. But callback
is not called.

How to make it be executed? In tests/test-io-channel-socket.c it's
done by g_main_loop_new .. g_main_loop_run. But I need to yield.
socket_start_outgoing_migration uses qio_channel_socket_connect_async
as well, but is not doing any magic with g_main_loop. But it works. How?

The _async() impls uses  qio_task_run_in_thread to spawn the background
thread. When the thread finishes, it uses g_idle_add to invoke the
callback so that it runs in the context of the main thread, not the
background thread. So something needs to be running the main loop
in QEMU.

I came to same idea. But still, I don't see where g_main_loop is
run inside qemu_main_loop(). Only iothread_run() does it. But
what if we don't have iothreads?

There's no requirement to use g_main_loop, what matters is actually
that something runs the default GMainContext.

O, I see now, g_main_context_dispatch() in glib_pollfds_poll() does it, yes?

qemu_main_loop
satisfies this.

If you have a different GMainLoop that you want to use, then you
can pass its GMainContext into the _async() functions, and the
result will get dispatched from whatever thread runs that
GMainContext/GMainLoop. So you could use this to get the callback
to be invoked in your iothread context if that's desirable. If
a NULL GMainContext is passed to _async(), then the callback is
dispatched from qemu_main_loop() thread.


Thanks for your help!

--
Best regards,
Vladimir



reply via email to

[Prev in Thread] Current Thread [Next in Thread]