qemu-devel

Re: [PATCH 1/1] migration: Terminate multifd threads on yank


From: Lukas Straub
Subject: Re: [PATCH 1/1] migration: Terminate multifd threads on yank
Date: Tue, 3 Aug 2021 08:25:46 +0000

On Tue, 3 Aug 2021 04:18:42 -0300
Leonardo Bras Soares Passos <leobras@redhat.com> wrote:

> Hello Lukas,
> 
> On Tue, Aug 3, 2021 at 3:42 AM Lukas Straub <lukasstraub2@web.de> wrote:
> > Hi,
> > There is an easier explanation: I forgot the send side of multifd
> > altogether (I thought it was covered by migration_channel_connect()).
> > So yank won't actually shutdown() the multifd sockets on the send side.  
> 
> If I could get that correctly, it seems to abort migration (and
> therefore close all fds) if the ft that ends up qio_channel_shutdown()
> get to sendmsg(), which can take a while.

How long is "can take a while"? Until some TCP connection times out?
That would mean that it is hanging somewhere else.

I mean, in precopy migration the multifd send threads should be fully
utilized and always sending something until the migration finishes. In
that case it is likely that all the threads become stuck in
qio_channel_write_all() if the connection breaks silently (i.e. packets
are discarded or the destination is powered off, with no connection
reset), since no TCP ACKs arrive from the destination side
-> kernel TCP buffer becomes full -> qio_channel_write_all() blocks.
Thus, shutdown() on the sockets should be enough to get the threads
unstuck and make them notice that the connection broke.
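
To illustrate the mechanism outside of QEMU, here is a minimal sketch in
Python (not QEMU code). It assumes Linux-like socket semantics and uses a
loopback peer that accepts the connection but never reads, as a stand-in
for a destination that silently stops responding. The function name
demo_shutdown_unblocks_sender and the 0.5 s wait are made up for this
example.

```python
import socket
import threading
import time


def demo_shutdown_unblocks_sender(wait_s=0.5):
    """Return (still_stuck, error) for a sender facing a non-reading peer."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    sender = socket.create_connection(listener.getsockname())
    peer, _ = listener.accept()  # accepted, but never read from

    result = {}

    def send_forever():
        chunk = b"\0" * 65536
        try:
            while True:
                # Blocks once the kernel socket buffers are full.
                sender.sendall(chunk)
        except OSError as exc:
            result["error"] = exc

    thread = threading.Thread(target=send_forever, daemon=True)
    thread.start()
    time.sleep(wait_s)  # give the sender time to fill the buffers

    # shutdown(), not close(): the fd stays open, but the blocked
    # write is woken up and fails.
    sender.shutdown(socket.SHUT_RDWR)
    thread.join(timeout=5)
    still_stuck = thread.is_alive()

    for s in (sender, peer, listener):
        s.close()
    return still_stuck, result.get("error")


if __name__ == "__main__":
    print(demo_shutdown_unblocks_sender())
```

The sender thread is expected to leave its write loop with an OSError
shortly after the shutdown() call, which is the behaviour yank relies on.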

If something else hangs, the question is where...

> But it really does not close the fds before that.

Note: shutdown() is not close().
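
A small sketch of the difference, again in Python rather than QEMU code
and assuming Linux-like semantics on a local socket pair (the helper name
shutdown_vs_close is hypothetical): after shutdown() the descriptor is
still valid and I/O on it fails, while only close() releases the
descriptor.

```python
import socket


def shutdown_vs_close():
    """Show that shutdown() ends the connection but keeps the fd open."""
    a, b = socket.socketpair()

    a.shutdown(socket.SHUT_RDWR)
    fd_after_shutdown = a.fileno()  # still a valid descriptor

    peer_data = b.recv(1)  # the peer sees end-of-stream (empty bytes)

    try:
        a.send(b"x")
        send_error = None
    except OSError as exc:  # writes on the shut-down socket fail
        send_error = exc

    a.close()
    fd_after_close = a.fileno()  # Python reports -1 once closed
    b.close()
    return fd_after_shutdown, peer_data, send_error, fd_after_close


if __name__ == "__main__":
    print(shutdown_vs_close())
```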

> >
> > In the bugreport you wrote  
> > > (As a test, I called qio_channel_shutdown() in every multifd iochannel 
> > > and yank worked just fine, but I could not retry migration, because it 
> > > was still 'ongoing')  
> > That sounds like a bug in the error handling for multifd. But quickly
> > looking at the code, it should properly fail the migration.  
> 
> In the end, just asking each thread to just exit ended up getting me a
> smoother migration abort.
> >
> > BTW: You can shutdown outgoing sockets from outside of qemu with the
> > 'ss' utility, like this: 'sudo ss -K dst <destination ip> dport = 
> > <destination port>'  
> 
> Very nice tool, thanks for sharing!
> 
> >
> > Regards,
> > Lukas Straub  
> 
> Best regards,
> Leonardo Bras
> 


