Re: [Qemu-devel] [PATCH V3 10/11] vcpu: introduce lockmap


From: liu ping fan
Subject: Re: [Qemu-devel] [PATCH V3 10/11] vcpu: introduce lockmap
Date: Thu, 13 Sep 2012 14:55:12 +0800

On Tue, Sep 11, 2012 at 10:54 PM, Marcelo Tosatti <address@hidden> wrote:
> On Tue, Sep 11, 2012 at 03:41:15PM +0300, Avi Kivity wrote:
>> On 09/11/2012 03:24 PM, Avi Kivity wrote:
>> > On 09/11/2012 12:57 PM, Jan Kiszka wrote:
>> >> On 2012-09-11 11:44, liu ping fan wrote:
>> >>> On Tue, Sep 11, 2012 at 4:35 PM, Avi Kivity <address@hidden> wrote:
>> >>>> On 09/11/2012 10:51 AM, Liu Ping Fan wrote:
>> >>>>> From: Liu Ping Fan <address@hidden>
>> >>>>>
>> >>>>> The function call chain can recursively acquire
>> >>>>> qemu_mutex_lock_iothread. We introduce a lockmap to record the
>> >>>>> lock depth.
>> >>>>
>> >>>> What is the root cause?  io handlers initiating I/O?
>> >>>>
>> >>> cpu_physical_memory_rw() can be called nested, and at the point of
>> >>> the call it may be protected by no lock, a device lock, or the big
>> >>> lock. I think that even without the big lock, the io-dispatcher
>> >>> faces the same issue. As for the main loop, I have not considered it
>> >>> carefully, but at least dma-helpers will call
>> >>> cpu_physical_memory_rw() while holding the big lock.
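To illustrate the nesting (a hypothetical device model, not code from
the patch): an MMIO write that starts a DMA transfer re-enters the
memory dispatch path from inside a handler that may already hold a lock:

    /* Hypothetical MMIO handler: writing the (made-up) REG_DMA_START
     * register triggers cpu_physical_memory_rw(), which can dispatch to
     * yet another MMIO region and re-enter the same locking path. */
    static void mydev_mmio_write(void *opaque, target_phys_addr_t addr,
                                 uint64_t val, unsigned size)
    {
        MyDevState *s = opaque;              /* hypothetical device state */

        if (addr == REG_DMA_START) {
            cpu_physical_memory_rw(s->dma_addr, s->dma_buf, s->dma_len,
                                   1 /* is_write */);
        }
    }
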
>> >>
>> >> That is our core problem: inconsistent invocation of existing services
>> >> w.r.t. locking. For portio, I was lucky that there is no nesting and I was
>> >> able to drop the big lock around all (x86) call sites. But MMIO is way
>> >> more tricky due to DMA nesting.
>> >
>> > Maybe we need to switch to a continuation style.  Instead of expecting
>> > cpu_physical_memory_rw() to complete synchronously, it becomes an
>> > asynchronous call and you provide it with a completion.  That means
>> > devices which use it are forced to drop the lock in between.  Block and
>> > network clients will be easy to convert since they already use APIs that
>> > drop the lock (except for accessing the descriptors).
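Roughly, such an asynchronous variant might look like the sketch below
(hypothetical signature and device code; no such API exists in the tree):

    /* Hypothetical async variant of cpu_physical_memory_rw(): the access
     * completes later and the caller is notified via a callback, so the
     * device can drop its lock in between. */
    typedef void CPMRWCompletionFunc(void *opaque, int ret);

    void cpu_physical_memory_rw_async(target_phys_addr_t addr, uint8_t *buf,
                                      int len, int is_write,
                                      CPMRWCompletionFunc *cb, void *opaque);

    /* Hypothetical device usage: */
    static void mydev_dma_done(void *opaque, int ret)
    {
        MyDevState *s = opaque;
        /* re-take the device lock and revalidate state here */
    }

    static void mydev_start_dma(MyDevState *s)
    {
        cpu_physical_memory_rw_async(s->dma_addr, s->dma_buf, s->dma_len, 1,
                                     mydev_dma_done, s);
    }
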
>> >
>> >> We could try to introduce a different version of cpu_physical_memory_rw,
>> >> cpu_physical_memory_rw_unlocked. But the problem remains that an MMIO
>> >> request can trigger the very same access in a nested fashion, and we
>> >> will have to detect this to avoid locking up QEMU (locking up the guest
>> >> might be OK).
>> >
>> > An async version of c_p_m_rw() will just cause a completion to bounce
>> > around, consuming cpu but not deadlocking anything.  If we can keep a
>> > count of the bounces, we might be able to stall it indefinitely or at
>> > least ratelimit it.
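A sketch of that bounce counting (all names hypothetical):

    /* Hypothetical: bound how often one logical access may re-enter the
     * dispatcher; past the limit, defer it to a timer instead of
     * spinning. */
    #define MAX_BOUNCES 16

    static void dispatch_bounce(CPMRWRequest *req)   /* hypothetical type */
    {
        if (++req->bounce_count > MAX_BOUNCES) {
            /* ratelimit: retry in ~1ms instead of bouncing immediately */
            qemu_mod_timer(req->retry_timer,
                           qemu_get_clock_ns(vm_clock) + SCALE_MS);
            return;
        }
        /* otherwise continue processing the access */
    }
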
>> >
>>
>> Another option is to require all users of c_p_m_rw() and related to use
>> a coroutine or thread.  That makes the programming easier (but still
>> requires revalidation after the lock is dropped).
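A sketch of the coroutine style, using QEMU's existing coroutine API
(the request struct and the revalidation step are hypothetical):

    /* Hypothetical: run the access in a coroutine so the big lock can be
     * dropped and re-taken around cpu_physical_memory_rw(). */
    static void coroutine_fn dma_co(void *opaque)
    {
        DMARequest *req = opaque;

        qemu_mutex_unlock_iothread();
        cpu_physical_memory_rw(req->addr, req->buf, req->len, req->is_write);
        qemu_mutex_lock_iothread();
        /* revalidate device state: it may have changed while unlocked */
    }

    static void start_dma(DMARequest *req)
    {
        Coroutine *co = qemu_coroutine_create(dma_co);
        qemu_coroutine_enter(co, req);
    }
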
>
> Unless I am misunderstanding this thread, the iothread flow section of
> http://lists.gnu.org/archive/html/qemu-devel/2012-06/msg04315.html
> contains possible solutions for this.
>
> Basically use trylock() and, if it fails, choose a bailout option
> (more than one possibility is listed there).
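The trylock-and-bail pattern, sketched (direct access to the global
mutex and the retry bottom half are assumptions for illustration):

    /* Hypothetical: on re-entry, trylock fails instead of deadlocking;
     * bail out and retry the access later from a bottom half. */
    static void mmio_access(MMIORequest *req)        /* hypothetical type */
    {
        if (qemu_mutex_trylock(&qemu_global_mutex) != 0) {
            qemu_bh_schedule(req->retry_bh);         /* deferred retry */
            return;
        }
        handle_access(req);                          /* hypothetical handler */
        qemu_mutex_unlock(&qemu_global_mutex);
    }
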
>
I think that in that thread, the original goal of the trylock() was to
handle a lock held by another thread, whereas here we want to resolve
nested locking. But yes, there is a similar point: solving it
asynchronously works if we do not allow nested locks.

Regards,
pingfan