qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 1/3] block: add missed aio_context_acquire aroun


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH 1/3] block: add missed aio_context_acquire around bdrv_set_aio_context
Date: Fri, 6 Nov 2015 14:55:32 +0000

On Fri, Nov 6, 2015 at 2:51 PM, Denis V. Lunev <address@hidden> wrote:
> On 11/06/2015 05:44 PM, Stefan Hajnoczi wrote:
>>
>> On Fri, Nov 6, 2015 at 2:09 PM, Denis V. Lunev <address@hidden> wrote:
>>>
>>> On 11/06/2015 04:35 PM, Stefan Hajnoczi wrote:
>>>>
>>>> On Wed, Nov 04, 2015 at 08:27:22PM +0300, Denis V. Lunev wrote:
>>>>>
>>>>> It is required for bdrv_drain.
>>>>
>>>> What bug does this patch fix?
>>>>
>>>> Existing blk_set_aio_context() callers acquire the AioContext or are
>>>> sure it's already acquired by their caller, so I don't see where the bug
>>>> is.
>>>>
>>>> No function in block/block-backend.c acquires AioContext internally.
>>>> The same reasoning I mentioned in another email thread about consistent
>>>> locking strategy applies here.  Either all blk_*() functions should take
>>>> the AioContext internally (which I disagree with) or none of them should
>>>> take it.
>>>
>>>
>>> #5  0x00005555559b31bf in bdrv_drain (bs=0x5555564274a0) at
>>> block/io.c:258
>>> #6  0x0000555555963321 in bdrv_set_aio_context (bs=0x5555564274a0,
>>>      new_context=0x55555640c100) at block.c:3764
>>> #7  0x00005555559a8b67 in blk_set_aio_context (blk=0x555556425d20,
>>>      new_context=0x55555640c100) at block/block-backend.c:1068
>>> #8  0x00005555556a4c52 in virtio_scsi_hotplug
>>> (hotplug_dev=0x5555579f22b0,
>>>      dev=0x555557bc8470, errp=0x7fffffffd9e0)
>>>      at /home/den/src/qemu/hw/scsi/virtio-scsi.c:773
>>> #9  0x00005555557eb3e2 in hotplug_handler_plug
>>> (plug_handler=0x5555579f22b0,
>>>      plugged_dev=0x555557bc8470, errp=0x7fffffffd9e0) at
>>> hw/core/hotplug.c:22
>>> #10 0x00005555557e7794 in device_set_realized (obj=0x555557bc8470,
>>> value=true,
>>>      errp=0x7fffffffdb90) at hw/core/qdev.c:1066
>>> #11 0x000055555595580b in property_set_bool (obj=0x555557bc8470,
>>>      v=0x555557b51bc0, opaque=0x555557bc8740, name=0x555555a43841
>>> "realized",
>>>      errp=0x7fffffffdb90) at qom/object.c:1708
>>> #12 0x0000555555953e2a in object_property_set (obj=0x555557bc8470,
>>>      v=0x555557b51bc0, name=0x555555a43841 "realized",
>>> errp=0x7fffffffdb90)
>>>      at qom/object.c:965
>>> #13 0x00005555559566c7 in object_property_set_qobject
>>> (obj=0x555557bc8470,
>>>      value=0x555557bc8370, name=0x555555a43841 "realized",
>>> errp=0x7fffffffdb90)
>>> ---Type <return> to continue, or q <return> to quit---
>>>      at qom/qom-qobject.c:24
>>> #14 0x00005555559540cd in object_property_set_bool (obj=0x555557bc8470,
>>>      value=true, name=0x555555a43841 "realized", errp=0x7fffffffdb90)
>>>      at qom/object.c:1034
>>> #15 0x0000555555760e47 in qdev_device_add (opts=0x5555563f90f0,
>>>      errp=0x7fffffffdc18) at qdev-monitor.c:589
>>> #16 0x0000555555772501 in device_init_func (opaque=0x0,
>>> opts=0x5555563f90f0,
>>>      errp=0x0) at vl.c:2305
>>> #17 0x0000555555a198cb in qemu_opts_foreach (
>>>      list=0x555555e96a60 <qemu_device_opts>,
>>>      func=0x5555557724c3 <device_init_func>, opaque=0x0, errp=0x0)
>>>      at util/qemu-option.c:1114
>>> #18 0x0000555555777b20 in main (argc=83, argv=0x7fffffffe058,
>>>      envp=0x7fffffffe2f8) at vl.c:4523
>>>
>>> You want the lock to be taken by the function. It would be
>>> quite natural to add assert(aio_context_is_owner(ctx)) in that
>>> case.
>>
>> This stack trace is fine since
>> hw/scsi/virtio-scsi.c:virtio_scsi_hotplug():
>>    aio_context_acquire(s->ctx);
>>    blk_set_aio_context(sd->conf.blk, s->ctx);
>>    aio_context_release(s->ctx);
>>
>> What bug does this patch fix?  I still don't see the problem.
>
>
> Program received signal SIGABRT, Aborted.
> 0x00007ffff3e8c267 in __GI_raise (address@hidden)
>     at ../sysdeps/unix/sysv/linux/raise.c:55
> 55    ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> (gdb) bt
> #0  0x00007ffff3e8c267 in __GI_raise (address@hidden)
>     at ../sysdeps/unix/sysv/linux/raise.c:55
> #1  0x00007ffff3e8deca in __GI_abort () at abort.c:89
> #2  0x00007ffff3e8503d in __assert_fail_base (
>     fmt=0x7ffff3fe7028 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
>     address@hidden "aio_context_is_owner(ctx)",
>     address@hidden "aio-posix.c", address@hidden,
>     address@hidden <__PRETTY_FUNCTION__.21740>
> "aio_poll") at assert.c:92
> #3  0x00007ffff3e850f2 in __GI___assert_fail (
>     assertion=0x555555aab968 "aio_context_is_owner(ctx)",
>     file=0x555555aab94a "aio-posix.c", line=244,
>     function=0x555555aab9b0 <__PRETTY_FUNCTION__.21740> "aio_poll")
>     at assert.c:101

IIRC the main loop does not acquire/release its AioContext.  It relies
on qemu_global_mutex, which means that the AioContext rfifolock is not
used for the main loop AioContext.

The assertion is failing but the code is safe.

If you want to change this you need to figure out the
AioContext<->GSource integration used by the main loop and how to add
rfifolock to that.

Stefan



reply via email to

[Prev in Thread] Current Thread [Next in Thread]