[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v8 5/5] multifd: Implement zero copy write in multifd migrati
From: |
Leonardo Bras Soares Passos |
Subject: |
Re: [PATCH v8 5/5] multifd: Implement zero copy write in multifd migration (multifd-zero-copy) |
Date: |
Tue, 22 Feb 2022 01:09:30 -0300 |
On Mon, Feb 21, 2022 at 4:41 PM Leonardo Bras Soares Passos
<leobras@redhat.com> wrote:
>
> Hello Juan, thanks for the feedback!
>
> On Fri, Feb 18, 2022 at 1:57 PM Juan Quintela <quintela@redhat.com> wrote:
> >
> > Leonardo Bras <leobras@redhat.com> wrote:
> > > Implement zero copy send on nocomp_send_write(), by making use of
> > > QIOChannel
> > > writev + flags & flush interface.
> > >
> > > Change multifd_send_sync_main() so flush_zero_copy() can be called
> > > after each iteration in order to make sure all dirty pages are sent before
> > > a new iteration is started. It will also flush at the beginning and at the
> > > end of migration.
> > >
> > > Also make it return -1 if flush_zero_copy() fails, in order to cancel
> > > the migration process, and avoid resuming the guest in the target host
> > > without receiving all current RAM.
> > >
> > > This will work fine on RAM migration because the RAM pages are not
> > > usually freed,
> > > and there is no problem on changing the pages content between
> > > writev_zero_copy() and
> > > the actual sending of the buffer, because this change will dirty the page
> > > and
> > > cause it to be re-sent on a next iteration anyway.
> > >
> > > A lot of locked memory may be needed in order to use multid migration
> > ^^^^^^
> > multifd.
> >
> > I can fix it on the commit.
>
> No worries, fixed for v9.
>
> >
> >
> > > @@ -1479,7 +1479,16 @@ static bool
> > > migrate_params_check(MigrationParameters *params, Error **errp)
> > > error_prepend(errp, "Invalid mapping given for
> > > block-bitmap-mapping: ");
> > > return false;
> > > }
> > > -
> > > +#ifdef CONFIG_LINUX
> > > + if (params->zero_copy_send &&
> > > + (!migrate_use_multifd() ||
> > > + params->multifd_compression != MULTIFD_COMPRESSION_NONE ||
> > > + (params->tls_creds && *params->tls_creds))) {
> > > + error_setg(errp,
> > > + "Zero copy only available for non-compressed non-TLS
> > > multifd migration");
> > > + return false;
> > > + }
> > > +#endif
> > > return true;
> > > }
> >
> > Test is long, but it is exactly what we need. Good.
>
> Thanks!
>
>
> >
> >
> > >
> > > diff --git a/migration/multifd.c b/migration/multifd.c
> > > index 43998ad117..2d68b9cf4f 100644
> > > --- a/migration/multifd.c
> > > +++ b/migration/multifd.c
> > > @@ -568,19 +568,28 @@ void multifd_save_cleanup(void)
> > > multifd_send_state = NULL;
> > > }
> > >
> > > -void multifd_send_sync_main(QEMUFile *f)
> > > +int multifd_send_sync_main(QEMUFile *f)
> > > {
> > > int i;
> > > + bool flush_zero_copy;
> > >
> > > if (!migrate_use_multifd()) {
> > > - return;
> > > + return 0;
> > > }
> > > if (multifd_send_state->pages->num) {
> > > if (multifd_send_pages(f) < 0) {
> > > error_report("%s: multifd_send_pages fail", __func__);
> > > - return;
> > > + return 0;
> > > }
> > > }
> > > +
> > > + /*
> > > + * When using zero-copy, it's necessary to flush after each
> > > iteration to
> > > + * make sure pages from earlier iterations don't end up replacing
> > > newer
> > > + * pages.
> > > + */
> > > + flush_zero_copy = migrate_use_zero_copy_send();
> > > +
> > > for (i = 0; i < migrate_multifd_channels(); i++) {
> > > MultiFDSendParams *p = &multifd_send_state->params[i];
> > >
> > > @@ -591,7 +600,7 @@ void multifd_send_sync_main(QEMUFile *f)
> > > if (p->quit) {
> > > error_report("%s: channel %d has already quit", __func__, i);
> > > qemu_mutex_unlock(&p->mutex);
> > > - return;
> > > + return 0;
> > > }
> > >
> > > p->packet_num = multifd_send_state->packet_num++;
> > > @@ -602,6 +611,17 @@ void multifd_send_sync_main(QEMUFile *f)
> > > ram_counters.transferred += p->packet_len;
> > > qemu_mutex_unlock(&p->mutex);
> > > qemu_sem_post(&p->sem);
> > > +
> > > + if (flush_zero_copy) {
> > > + int ret;
> > > + Error *err = NULL;
> > > +
> > > + ret = qio_channel_flush(p->c, &err);
> > > + if (ret < 0) {
> > > + error_report_err(err);
> > > + return -1;
> > > + }
> > > + }
> > > }
> > > for (i = 0; i < migrate_multifd_channels(); i++) {
> > > MultiFDSendParams *p = &multifd_send_state->params[i];
> > > @@ -610,6 +630,8 @@ void multifd_send_sync_main(QEMUFile *f)
> > > qemu_sem_wait(&p->sem_sync);
> > > }
> > > trace_multifd_send_sync_main(multifd_send_state->packet_num);
> > > +
> > > + return 0;
> > > }
> >
> > We are leaving pages is flight for potentially a lot of time. I *think*
> > that we can sync shorter than that.
> >
> > > static void *multifd_send_thread(void *opaque)
> > > @@ -668,8 +690,8 @@ static void *multifd_send_thread(void *opaque)
> > > p->iov[0].iov_len = p->packet_len;
> > > p->iov[0].iov_base = p->packet;
> > >
> > > - ret = qio_channel_writev_all(p->c, p->iov, p->iovs_num,
> > > - &local_err);
> > > + ret = qio_channel_writev_full_all(p->c, p->iov, p->iovs_num,
> > > NULL,
> > > + 0, p->write_flags,
> > > &local_err);
> > > if (ret != 0) {
> > > break;
> > > }
> >
> > I still think that it would be better to have a sync here in each
> > thread. I don't know the size, but once every couple megabytes of RAM
> > or so.
>
> This seems a good idea, since the first iteration may take a while,
> and we may take a lot of time to fail if something goes wrong with
> zerocopy at the start of iteration 1.
>
> On the other hand, flushing takes some time, and doing it a lot may
> take away some of the performance improvements.
>
> Maybe we could move the flushing to a thread of it's own, if it
> becomes a problem.
>
>
> >
> > I did a change on:
> >
> > commit d48c3a044537689866fe44e65d24c7d39a68868a
> > Author: Juan Quintela <quintela@redhat.com>
> > Date: Fri Nov 19 15:35:58 2021 +0100
> >
> > multifd: Use a single writev on the send side
> >
> > Until now, we wrote the packet header with write(), and the rest of the
> > pages with writev(). Just increase the size of the iovec and do a
> > single writev().
> >
> > Signed-off-by: Juan Quintela <quintela@redhat.com>
> > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >
> > And now we need to "perserve" this header until we do the sync,
> > otherwise we are overwritting it with other things.
>
> Yeah, that is a problem I faced on non-multifd migration, and a reason
> why I choose to implement directly in multifd.
>
> >
> > What testing have you done after this commit?
>
> Not much, but this will probably become an issue with bigger guests.
>
> >
> > Notice that it is not _complicated_ to fix it, I will try to come with
> > some idea on monday, but basically is having an array of buffers for
> > each thread, and store them until we call a sync().
>
> That will probably work, we can use MultiFDRecvParams->packet as this
> array, right?
> We can start with a somehow small buffer size and grow if it ever gets full.
> (Full means a sync + g_try_malloc0 + g_free)
Or maybe use the array size as a size limiter before flushing, so it
will always flush after BUFFSIZE headers are sent.
>
> By my calculations, it should take 1,8kB each element on a 4k PAGESIZE.
>
> What do you think?
>
> Best regards,
> Leo
[PATCH v8 3/5] migration: Add zero-copy-send parameter for QMP/HMP for Linux, Leonardo Bras, 2022/02/01