qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 5/7] block/mirror: utilize job_exit shim


From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH 5/7] block/mirror: utilize job_exit shim
Date: Sat, 25 Aug 2018 17:14:16 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

On 2018-08-25 17:02, Max Reitz wrote:
> On 2018-08-23 00:05, John Snow wrote:
>>
>>
>> On 08/22/2018 08:15 AM, Max Reitz wrote:
>>> On 2018-08-17 21:04, John Snow wrote:
>>>> Change the manual deferment to mirror_exit into the implicit
>>>> callback to job_exit and the mirror_exit callback.
>>>>
>>>> This does change the order of some bdrv_unref calls and job_completed,
>>>> but thanks to the new context in which we call .job_exit, this is safe
>>>> to defer the possible flushing of any nodes to the job_finalize_single
>>>> cleanup stage.
>>>
>>> Ah, right, I forgot this.  Hm, what exactly do you mean?  This function
>>> is executed in the main loop, so it can make 'src' go away.  I don't see
>>> any difference to before.
>>>
>>
>> This changes the order in which we unreference these objects; if you
>> look at this patch the job_completed call I delete is in the middle of
>> what becomes the .exit() callback, which means there is a subtle change
>> in the ordering of how references are put down.
>>
>> Take a look at the weird ordering of mirror_exit as it exists right now;
>> we call job_completed first and *then* put down the last references. If
>> you re-order this upstream right now, you'll deadlock QEMU because this
>> means job_completed is responsible for putting down the last reference
>> to some of these block/bds objects.
>>
>> However, job_completed takes an additional AIO context lock and calls
>> job_finalize_single under *two* locks, which will hang QEMU if we
>> attempt to flush any of these nodes when we put down the last reference.
> 
> If you say so...  I have to admit I don't really understand.  The
> comment doesn't explain why it's so important to keep src around until
> job_completed(), so I don't know.  I thought AioContexts are recursive
> so it doesn't matter whether you take them recursively or not.
> 
> Anyway.  So the difference now is that job_defer_to_main_loop() took the
> lock around the whole exit function, whereas the new exit shim only
> takes it around the .exit() method, but calls job_complete() without a
> lock -- and then job_finalize_single() gets its lock again, so the job
> methods are again called with locks.  That sounds OK to me.
> 
>> Performing the reordering here is *safe* because by removing the call to
>> job_completed and utilizing the exit shim, the .exit() callback executes
>> only under one lock, and when the finalize code runs later it is also
>> executed under only one lock, making this re-ordering safe.
>>
>> Clear as mud?
> 
> Well, I trust you that the drain issue was the reason that src had to
> stay around until after job_completed().  It seems a bit
> counter-intuitive, because the comment explaining that src needs to stay
> around until job_completed() doesn't say much -- but it does imply that
> without that bdrv_ref(), the BDS might be destroyed before
> job_completed().  Which is different from simply having only one
> reference left and then being deleted in job_completed().
> 
> Looking at 3f09bfbc7be, I'm inclined to believe the original reason may
> be that src->job points to the job and that we shouldn't delete it as
> long as it does (bdrv_delete() asserts that bs->job is NULL).  Oh no, a
> tangent appears.
> 
> ...I would assume that when bdrv_replace_node() is called, BlockJob.blk
> is updated to point to the new BDS.  But nobody seems to update the
> BDS.job field.  Investigation is in order.

Aha!  Mirror is run from the mirror filter (of course).  So the node
where BDS.job is set is actually not replaced.

Well, it is, but then we specifically switch the job BB back to point to
the mirror filter.  As long as it does not go away until then, all is
safe.  (And that bdrv_ref() guard doesn't change with this patch.)

Max

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]