From: Daniel P. Berrangé
Subject: Re: [Qemu-devel] [PATCH v2 4/5] migration: implement bi-directional RDMA QIOChannel
Date: Fri, 27 Apr 2018 10:16:03 +0100
User-agent: Mutt/1.9.2 (2017-12-15)

On Fri, Apr 27, 2018 at 03:56:38PM +0800, 858585 jemmy wrote:
> On Fri, Apr 27, 2018 at 1:36 AM, Dr. David Alan Gilbert
> <address@hidden> wrote:
> > * Lidong Chen (address@hidden) wrote:
> >> This patch implements a bi-directional RDMA QIOChannel. Because different
> >> threads may access RDMAQIOChannel concurrently, this patch uses RCU to
> >> protect it.
> >>
> >> Signed-off-by: Lidong Chen <address@hidden>
> >
> > I'm a bit confused by this.
> >
> > I can see it's adding RCU to protect the rdma structures against
> > deletion from multiple threads; that I'm OK with in principle; is that
> > the only locking we need? (I guess the two directions are actually
> > separate RDMAContexts, so maybe.)
> 
> qio_channel_rdma_close may be invoked by the migration thread and the
> return path thread
> concurrently, so I use a mutex to protect it.

Hmm, that is not good - concurrent threads calling close must not be
allowed to happen even with non-RDMA I/O channels.

For example, with the QIOChannelSocket, one thread can call close
which sets the fd = -1, another thread can race with this and either
end up calling close again on the same FD or calling close on -1.
Either way the second thread will get an error from close() when
it should have skipped the close() and returned success. Perhaps
migration gets lucky and this doesn't result in it being marked
as failed, but it is still not good.

So only one thread should be calling close().
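
To make the race concrete, here is a minimal sketch of the problematic
pattern (the type and function names are illustrative stand-ins, not
QEMU's actual QIOChannelSocket code):

#include <unistd.h>

typedef struct ExampleChannel {
    int fd;
} ExampleChannel;

/* Simplified stand-in for a socket channel's close path. */
static int example_channel_close(ExampleChannel *ioc)
{
    if (ioc->fd != -1) {
        /*
         * Race window: a second thread can pass the fd != -1 check
         * above before this thread stores -1 below, and then either
         * close() the same fd twice or call close(-1). Either way
         * the second thread gets an error it should never have seen.
         */
        if (close(ioc->fd) < 0) {
            return -1;
        }
        ioc->fd = -1;
    }
    return 0;
}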

> If one thread invokes qio_channel_rdma_writev and another thread invokes
> qio_channel_rdma_readv,
> the two threads will use separate RDMAContexts, so no lock is needed.
> 
> If two threads invoke qio_channel_rdma_writev concurrently, a lock
> would be needed.
> But I find that on the source qemu only the migration thread invokes
> qio_channel_rdma_writev, and the return path
> thread only invokes qio_channel_rdma_readv.

QIOChannel impls are only intended to cope with a single thread doing
I/O in each direction. If you have two threads needing to read, or
two threads needing to write, the layer above should provide locking
to ensure correct ordering of I/O operations.
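
For instance, the upper layer could serialise its writers with a mutex
before touching the channel, along the lines of what migration already
does with mis->rp_mutex. A minimal sketch, assuming a hypothetical
channel_writev() entry point (plain pthreads here just to keep the
sketch self-contained):

#include <pthread.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical stand-in for the real channel write entry point. */
extern ssize_t channel_writev(void *ioc, const struct iovec *iov,
                              size_t niov);

static pthread_mutex_t write_lock = PTHREAD_MUTEX_INITIALIZER;

/* Upper-layer wrapper: only one writer reaches the channel at a time,
 * so the QIOChannel impl never sees concurrent same-direction I/O. */
static ssize_t locked_writev(void *ioc, const struct iovec *iov,
                             size_t niov)
{
    pthread_mutex_lock(&write_lock);
    ssize_t ret = channel_writev(ioc, iov, niov);
    pthread_mutex_unlock(&write_lock);
    return ret;
}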

> On the destination qemu, qio_channel_rdma_readv is only invoked by the
> main thread before postcopy, or by the
> listen thread after postcopy.
> 
> The destination qemu has already protected writes to the source qemu
> by taking
> qemu_mutex_lock(&mis->rp_mutex).
> 
> But should we use qemu_mutex_lock to protect qio_channel_rdma_writev
> and qio_channel_rdma_readv anyway,
> to guard against some future change invoking them
> concurrently?

> 
> >
> > But is there nothing else to make the QIOChannel bidirectional?
> >
> > Also, a lot seems dependent on listen_id, can you explain how that's
> > being used.
> 
> The destination qemu is the server side, so its listen_id is not zero;
> the source qemu is the client side,
> so its listen_id is zero.
> I use listen_id to determine whether qemu is the destination or the source.
> 
> For the destination qemu, when writing data to the source, it needs to
> use the return_path rdma, like this:
>     if (rdma->listen_id) {
>         rdma = rdma->return_path;
>     }
> 
> For the source qemu, when reading data from the destination, it also
> needs to use the return_path rdma:
>     if (!rdma->listen_id) {
>         rdma = rdma->return_path;
>     }

This feels unnecessarily complex to me. Why not just change the
QIOChannelRDMA struct so that it has two RDMA context pointers, e.g.

struct QIOChannelRDMA {
    QIOChannel parent;
    RDMAContext *rdmain;   /* context for the incoming (read) direction */
    RDMAContext *rdmaout;  /* context for the outgoing (write) direction */
    QEMUFile *file;
    bool blocking; /* XXX we don't actually honour this yet */
};
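
With that split, each I/O path simply picks its own context, and the
listen_id / return_path checks disappear on both sides. A sketch of the
idea (helper names are illustrative, not the actual patch):

/* Reads always use the incoming context... */
static RDMAContext *channel_rdma_for_read(QIOChannelRDMA *rioc)
{
    return rioc->rdmain;
}

/* ...and writes always use the outgoing context; neither side needs
 * to inspect listen_id to work out which direction it is. */
static RDMAContext *channel_rdma_for_write(QIOChannelRDMA *rioc)
{
    return rioc->rdmaout;
}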



Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


