[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 06/23] migration/multifd: Separate SYNC request with norma
From: |
Fabiano Rosas |
Subject: |
Re: [PATCH v2 06/23] migration/multifd: Separate SYNC request with normal jobs |
Date: |
Fri, 02 Feb 2024 16:21:11 -0300 |
peterx@redhat.com writes:
> From: Peter Xu <peterx@redhat.com>
>
> Multifd provide a threaded model for processing jobs. On sender side,
> there can be two kinds of job: (1) a list of pages to send, or (2) a sync
> request.
>
> The sync request is a very special kind of job. It never contains a page
> array, but only a multifd packet telling the dest side to synchronize with
> sent pages.
>
> Before this patch, both requests use the pending_job field, no matter what
> the request is, it will boost pending_job, while multifd sender thread will
> decrement it after it finishes one job.
>
> However this should be racy, because SYNC is special in that it needs to
> set p->flags with MULTIFD_FLAG_SYNC, showing that this is a sync request.
> Consider a sequence of operations where:
>
> - migration thread enqueue a job to send some pages, pending_job++ (0->1)
>
> - [...before the selected multifd sender thread wakes up...]
>
> - migration thread enqueue another job to sync, pending_job++ (1->2),
> setup p->flags=MULTIFD_FLAG_SYNC
>
> - multifd sender thread wakes up, found pending_job==2
> - send the 1st packet with MULTIFD_FLAG_SYNC and list of pages
> - send the 2nd packet with flags==0 and no pages
>
> This is not expected, because MULTIFD_FLAG_SYNC should hopefully be done
> after all the pages are received. Meanwhile, the 2nd packet will be
> completely useless, which contains zero information.
>
> I didn't verify above, but I think this issue is still benign in that at
> least on the recv side we always receive pages before handling
> MULTIFD_FLAG_SYNC. However that's not always guaranteed and just tricky.
>
> One other reason I want to separate it is using p->flags to communicate
> between the two threads is also not clearly defined, it's very hard to read
> and understand why accessing p->flags is always safe; see the current impl
> of multifd_send_thread() where we tried to cache only p->flags. It doesn't
> need to be that complicated.
>
> This patch introduces pending_sync, a separate flag just to show that the
> requester needs a sync. Alongside, we remove the tricky caching of
> p->flags now because after this patch p->flags should only be used by
> multifd sender thread now, which will be crystal clear. So it is always
> thread safe to access p->flags.
>
> With that, we can also safely convert the pending_job into a boolean,
> because we don't support >1 pending jobs anyway.
>
> Always use atomic ops to access both flags to make sure no cache effect.
> When at it, drop the initial setting of "pending_job = 0" because it's
> always allocated using g_new0().
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
- [PATCH v2 02/23] migration/multifd: multifd_send_kick_main(), (continued)
[PATCH v2 06/23] migration/multifd: Separate SYNC request with normal jobs, peterx, 2024/02/02
- Re: [PATCH v2 06/23] migration/multifd: Separate SYNC request with normal jobs,
Fabiano Rosas <=
[PATCH v2 07/23] migration/multifd: Simplify locking in sender thread, peterx, 2024/02/02
[PATCH v2 08/23] migration/multifd: Drop pages->num check in sender thread, peterx, 2024/02/02
[PATCH v2 09/23] migration/multifd: Rename p->num_packets and clean it up, peterx, 2024/02/02
[PATCH v2 10/23] migration/multifd: Move total_normal_pages accounting, peterx, 2024/02/02
[PATCH v2 11/23] migration/multifd: Move trace_multifd_send|recv(), peterx, 2024/02/02
[PATCH v2 13/23] migration/multifd: Move header prepare/fill into send_prepare(), peterx, 2024/02/02
[PATCH v2 12/23] migration/multifd: multifd_send_prepare_header(), peterx, 2024/02/02
[PATCH v2 14/23] migration/multifd: Forbid spurious wakeups, peterx, 2024/02/02