Manish,
On Thu, Nov 03, 2022 at 11:47:51PM +0530, manish.mishra wrote:
Yes, but if we try to read early on the main channel with TLS enabled, it is
an issue. Sorry, I may not have put the above comment clearly. I will try to
lay out the scenario step by step.
1. Main channel is created and the TLS handshake is done for the main channel.
2. Destination side tries to read magic early on the main channel in
migration_ioc_process_incoming, but it has not yet been sent by the source.
3. Source has written magic to the main channel file buffer but it is not yet
flushed; it is flushed for the first time in ram_save_setup. I mean data is
sent on the channel only if the qemu file buffer is full or explicitly flushed.
4. Source side blocks on multifd_send_sync_main in ram_save_setup before
flushing qemu file. But multifd_send_sync_main is blocked for sem_sync until
handshake is done for multiFD channels.
5. Destination side is still waiting for reading magic on main channel, so
unless we return from migration_ioc_process_incoming we can not accept new
channel, so handshake of multiFD channel is blocked.
6. So basically source is blocked on multiFD channels handshake before sending
data on main channel, but destination is blocked waiting for data before it can
acknowledge multiFD channels and do handshake, so it kind of creates a deadlock
situation.
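The circular wait in steps 1-6 can be written down as a small wait-for graph.
This is only an illustrative sketch, not QEMU code: the node labels are
made-up names for the steps above, and the find_cycle helper is hypothetical.
The second graph drops one edge purely to show that the cycle depends on it;
it is not meant as a proposed fix.

```python
def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle as a list, or None."""
    seen = []
    node = start
    while node is not None:
        if node in seen:
            # We came back to a node already on the path: circular wait.
            return seen[seen.index(node):]
        seen.append(node)
        node = waits_for.get(node)  # None means this node is not blocked
    return None


# Each key is blocked until its value has happened (labels are illustrative).
WAITS_FOR = {
    # Step 2/5: destination sits in migration_ioc_process_incoming reading magic.
    "dst_read_magic": "src_flush_main",
    # Step 3/4: the first flush comes after multifd_send_sync_main in ram_save_setup.
    "src_flush_main": "src_multifd_sync",
    # Step 4: multifd_send_sync_main waits on sem_sync until the handshake is done.
    "src_multifd_sync": "multifd_handshake",
    # Step 5: the handshake needs the destination to accept the new channel.
    "multifd_handshake": "dst_accept_channel",
    # Step 5: the destination cannot accept until the magic read returns.
    "dst_accept_channel": "dst_read_magic",
}

# Hypothetical variant: the flush no longer waits on the multifd sync.
NO_FLUSH_DEPENDENCY = {k: v for k, v in WAITS_FOR.items() if k != "src_flush_main"}

if __name__ == "__main__":
    print(find_cycle(WAITS_FOR, "dst_read_magic"))
    print(find_cycle(NO_FLUSH_DEPENDENCY, "dst_read_magic"))
```

Every node in the first graph is reachable from itself, which is the deadlock
described in step 6.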
Why is this issue only happening with TLS? It sounds like it'll happen as
long as multifd is enabled.