
Re: [Qemu-devel] [Qemu-block] segfault in parallel blockjobs (iotest 30)


From: Anton Nefedov
Subject: Re: [Qemu-devel] [Qemu-block] segfault in parallel blockjobs (iotest 30)
Date: Wed, 8 Nov 2017 18:50:05 +0300
User-agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0



On 8/11/2017 5:45 PM, Alberto Garcia wrote:
> On Tue 07 Nov 2017 05:19:41 PM CET, Anton Nefedov wrote:
>> BlockBackend gets deleted by another job's stream_complete(), deferred
>> to the main loop, so the fact that the job is put to sleep by
>> bdrv_drain_all_begin() doesn't really stop it from execution.
>
> I was debugging this a bit, and the block_job_defer_to_main_loop() call
> happens _after_ all jobs have been paused, so I think that when the BDS
> is drained then stream_run() finishes the last iteration without
> checking if it's paused.
>
> Without your patch (i.e. with a smaller STREAM_BUFFER_SIZE) then I
> assume that the function would have to continue looping and
> block_job_sleep_ns() would make the job coroutine yield, effectively
> pausing the job and preventing the crash.
>
> I can fix the crash by adding block_job_pause_point(&s->common) at the
> end of stream_run() (where the 'out' label is).
>
> I'm thinking that perhaps we should add the pause point directly to
> block_job_defer_to_main_loop(), to prevent any block job from running
> the exit function when it's paused.


Is it possible that the exit function has already been deferred by the
time the jobs are paused (even if that is less likely to happen)?

If so, should we also flush the bottom halves somehow, in addition to
putting the jobs to sleep? All of that would then probably have to
happen before bdrv_reopen_queue().
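
To make the ordering concern concrete, here is a toy model in plain C.
It is only a sketch under loud assumptions: ToyJob, ToyMainLoop and all
toy_* functions are made-up names, not QEMU APIs, and the "main loop" is
just an array of pending exit callbacks. It shows that a pause check at
defer time does not cover a callback that was queued before the pause,
unless the queue is flushed afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in QEMU. */
#define TOY_MAX_PENDING 8

typedef struct ToyJob {
    bool paused;    /* set by the toy "drain begin" step */
    bool exit_ran;  /* set once the deferred exit function has run */
} ToyJob;

typedef struct ToyMainLoop {
    ToyJob *pending[TOY_MAX_PENDING];  /* deferred exit callbacks, FIFO */
    size_t count;
} ToyMainLoop;

/* Job side: queue the exit function unless the job is paused.
 * A real pause point would make the job wait here; the toy version
 * simply refuses and returns false. */
bool toy_defer_exit(ToyMainLoop *loop, ToyJob *job)
{
    if (job->paused) {
        return false;
    }
    assert(loop->count < TOY_MAX_PENDING);
    loop->pending[loop->count++] = job;
    return true;
}

/* Main loop side: run every queued exit function, return how many ran. */
size_t toy_flush_deferred(ToyMainLoop *loop)
{
    size_t ran = 0;
    for (size_t i = 0; i < loop->count; i++) {
        loop->pending[i]->exit_ran = true;
        ran++;
    }
    loop->count = 0;
    return ran;
}

void toy_pause(ToyJob *job)
{
    job->paused = true;
}

/* Scenario from the question above: the exit is deferred first and the
 * pause arrives second. Returns how many exit callbacks are still
 * pending once the job counts as "paused". */
size_t toy_pending_after_pause(bool flush_after_pause)
{
    ToyMainLoop loop = { .count = 0 };
    ToyJob job = { false, false };

    toy_defer_exit(&loop, &job);  /* queued before the pause */
    toy_pause(&job);
    if (flush_after_pause) {
        toy_flush_deferred(&loop);
    }
    return loop.count;
}
```

In this model toy_pending_after_pause(false) leaves one callback queued
even though the job is paused, while toy_pending_after_pause(true)
leaves none. Whether the real bottom halves can be flushed that simply
is exactly the open question.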

/Anton

> Somehow I had the impression that we discussed this already in the past
> (?) because I remember thinking about this very scenario.
>
> Berto



