[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-block] [Qemu-devel] [PATCH v3] blockjob: drain all job nodes i
Re: [Qemu-block] [Qemu-devel] [PATCH v3] blockjob: drain all job nodes in block_job_drain
Fri, 2 Aug 2019 08:45:28 +0000
01.08.2019 22:44, Max Reitz wrote:
> On 30.07.19 21:11, John Snow wrote:
>> On 7/24/19 5:40 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> Instead of draining additional nodes in each job code, let's do it in
>>> common block_job_drain, draining just all job's children.
>>> BlockJobDriver.drain becomes unused, so, drop it at all.
>>> It's also a first step to finally get rid of blockjob->blk.
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
>>> v3: just resend, as I've some auto returned mails and not sure that
>>> v2 reached recipients.
>>> v2: apply Max's suggestions:
>>> - drop BlockJobDriver.drain
>>> - do firtly loop of bdrv_drained_begin and then separate loop
>>> of bdrv_drained_end.
>>> Hmm, a question here: should I call bdrv_drained_end in reverse
>>> order? Or it's OK as is?
>> I think it should be OK. These nodes don't necessarily have a well
>> defined relationship between each other, do they?
> It’s OK. If they do have a relationship, the drain code sorts it out by
> itself anyway.
>> Seems much nicer to me. What becomes of the ref/unref pairs?
>> I guess not needed anymore?, since job cleanup necessarily happens in
>> the main loop context now and we don't have a backup_complete function
>> anymore ...?
>> In the cases where auto_finalize=true, do we have any guarantee that the
>> completion callbacks cannot be scheduled while we are here?
> Let me try to figure this out...
> The only caller of block_job_drain() is job_drain(), whose only caller
> is job_finish_sync() in an AIO_WAIT_WHILE() loop. That loop can only
> work in the main loop, so I suppose it does run in the main loop (and
> consequentially, block_job_drain() runs in the main loop, too).
> That AIO_WAIT_WHILE() loop drains the job (so all nodes) and waits until
> the job has completed. To me that looks like it is designed to have the
> completion callback run at some point...?
> I suppose anything like that is scheduled by job_enter() in job_drain(),
> namely the aio_co_enter() in there.
> (A) If the job runs in the main AioContext, aio_co_enter() will enter
> the coroutine if we do not run in a coroutine right now (safe), or it
> will schedule it for a later point if we do run in a coroutine. That
> latter part may be unsafe, because I guess some coroutine work in
> bdrv_drained_begin() or bdrv_drained_end() may wake it up, which can
> then mess everything up.
> (B) If the job runs in another context, aio_co_enter() will schedule the
> job immediately, which I guess means that it can run at any point? So
> it could complete at any point, including block_job_drain(). Ah, no,
> actually. AIO_WAIT_WHILE() will have the job’s context acquired while
> evaluating the condition (that is, when calling block_job_drain()). So
> this is safe.
> So, well. Unless Vladimir can give you a guarantee why e.g.
> block_job_remove_all_bdrv() won’t run during e.g. bdrv_drained_begin(),
> I suppose you’re right and the node list needs to be copied at the
> beginning of this function and all nodes should be ref’d.
Thanks a lot! OK, I'll resend