From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
Date: Fri, 29 May 2015 16:12:56 +0100
User-agent: Mutt/1.5.23 (2014-03-12)
* Wen Congyang (address@hidden) wrote:
> On 05/29/2015 04:42 PM, Dr. David Alan Gilbert wrote:
> > * zhanghailiang (address@hidden) wrote:
> >> On 2015/5/29 9:29, Wen Congyang wrote:
> >>> On 05/29/2015 12:24 AM, Dr. David Alan Gilbert wrote:
> >>>> * zhanghailiang (address@hidden) wrote:
<snip>
> >>>> The colo-proxy rcu problem I hit shows as rcu-stalls in both primary and
> >>>> secondary after the qemu quits; the backtrace of the qemu stack is:
> >>>
> >>> How can it be reproduced? With the monitor command quit, or by killing
> >>> qemu?
> >>>
> >>>>
> >>>> [<ffffffff810d8c0c>] wait_rcu_gp+0x5c/0x80
> >>>> [<ffffffff810ddb05>] synchronize_rcu+0x45/0xd0
> >>>> [<ffffffffa0a251e5>] colo_node_release+0x35/0x50 [nfnetlink_colo]
> >>>> [<ffffffffa0a25795>] colonl_close_event+0xe5/0x160 [nfnetlink_colo]
> >>>> [<ffffffff81090c96>] notifier_call_chain+0x66/0x90
> >>>> [<ffffffff8109154c>] atomic_notifier_call_chain+0x6c/0x110
> >>>> [<ffffffff815eee07>] netlink_release+0x5b7/0x7f0
> >>>> [<ffffffff815878bf>] sock_release+0x1f/0x90
> >>>> [<ffffffff81587942>] sock_close+0x12/0x20
> >>>> [<ffffffff812193c3>] __fput+0xd3/0x210
> >>>> [<ffffffff8121954e>] ____fput+0xe/0x10
> >>>> [<ffffffff8108d9f7>] task_work_run+0xb7/0xf0
> >>>> [<ffffffff81002d4d>] do_notify_resume+0x8d/0xa0
> >>>> [<ffffffff81722b66>] int_signal+0x12/0x17
> >>>> [<ffffffffffffffff>] 0xffffffffffffffff
> >>>
> >>> Thanks for your test. The backtrace is very useful, and we will fix it soon.
> >>>
> >>
> >> Yes, it is a bug: the callback function colonl_close_event() is called while
> >> the RCU read lock is held:
> >> netlink_release
> >>   ->atomic_notifier_call_chain
> >>     ->rcu_read_lock();
> >>     ->notifier_call_chain
> >>       ->ret = nb->notifier_call(nb, val, v);
> >> It is wrong to call synchronize_rcu there, because it can sleep.
> >> Besides, another function that might sleep, kthread_stop, is called in
> >> destroy_notify_cb.
> >>
> >>>>
> >>>> that's with both the 423a8e268acbe3e644a16c15bc79603cfe9eb084 from yesterday
> >>>> and older e58e5152b74945871b00a88164901c0d46e6365e tags on colo-proxy.
> >>>> I'm not sure of the right fix; perhaps it might be possible to replace the
> >>>> synchronize_rcu in colo_node_release by a call_rcu that does the kfree later?
> >>>
> >>> I agree with it.
> >>
> >> That is a good solution; I will fix both of the above problems.
> >
> > Thanks,
>
> We have fixed this problem and tested it. The patch is pushed to github; please
> try it.
Yes, that works. Thank you very much for the quick fix.
Dave
>
> Thanks
> Wen Congyang
>
> >
> > Dave
> >
> >>
> >> Thanks,
> >> zhanghailiang
> >>
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Dave
> >>>>
> >>>>>
> >>>
> >>>
> >>> .
> >>>
> >>
> >>
> > --
> > Dr. David Alan Gilbert / address@hidden / Manchester, UK
> >
>
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK
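
The fix suggested in the thread (replace the synchronize_rcu in colo_node_release with a call_rcu that does the kfree later) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the patch that was pushed to github: `struct rcu_head`, `call_rcu()` and `run_grace_period()` below are simplified stand-ins for the kernel mechanisms, and the `struct colo_node` layout and the `colo_node_free_rcu()` callback name are hypothetical, since the thread does not show the real definitions. The point it shows is that the release path only queues a callback and returns, so it never sleeps while the notifier chain holds the RCU read lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct rcu_head (simplified). */
struct rcu_head {
    struct rcu_head *next;
    void (*func)(struct rcu_head *head);
};

static struct rcu_head *pending; /* callbacks waiting for a grace period */
static int freed_nodes;          /* counts nodes freed by the callback */

/* Stand-in for call_rcu(): queue the callback and return at once. */
static void call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *))
{
    assert(func != NULL);
    head->func = func;
    head->next = pending;
    pending = head;
}

/* Model the end of a grace period: run every queued callback. */
static void run_grace_period(void)
{
    struct rcu_head *head = pending;

    pending = NULL;
    while (head) {
        struct rcu_head *next = head->next; /* read before func frees it */
        head->func(head);
        head = next;
    }
}

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical node layout; the real struct is not shown in the thread. */
struct colo_node {
    int index;
    struct rcu_head rcu;
};

/* Deferred free: runs only after the (modelled) grace period. */
static void colo_node_free_rcu(struct rcu_head *head)
{
    struct colo_node *node = container_of(head, struct colo_node, rcu);

    free(node); /* kfree(node) in kernel code */
    freed_nodes++;
}

/* Non-blocking release: no synchronize_rcu(), just queue the free. */
static void colo_node_release(struct colo_node *node)
{
    call_rcu(&node->rcu, colo_node_free_rcu);
}
```

In this model a caller allocates a node, calls colo_node_release(), and the memory is freed only when run_grace_period() later runs the queued callback. The kthread_stop call in destroy_notify_cb mentioned in the thread is a separate sleeping call and is not covered by this sketch.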