
Re: [Qemu-devel] test-filter-mirror hangs


From: Markus Armbruster
Subject: Re: [Qemu-devel] test-filter-mirror hangs
Date: Fri, 25 Jan 2019 08:14:30 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

Jason Wang <address@hidden> writes:

> On 2019/1/24 5:51 PM, Peter Xu wrote:
>> On Thu, Jan 24, 2019 at 09:11:15AM +0000, Dr. David Alan Gilbert wrote:
>>> * Jason Wang (address@hidden) wrote:
>>>> On 2019/1/24 3:53 AM, Dr. David Alan Gilbert wrote:
>>>>> * Jason Wang (address@hidden) wrote:
>>>>>> On 2019/1/22 2:56 AM, Peter Maydell wrote:
>>>>>>> On Thu, 17 Jan 2019 at 09:46, Jason Wang <address@hidden> wrote:
>>>>>>>> On 2019/1/15 12:33 AM, Zhang Chen wrote:
>>>>>>>>> On Sat, Jan 12, 2019 at 12:15 AM Dr. David Alan Gilbert
>>>>>>>>> <address@hidden> wrote:
>>>>>>>>>
>>>>>>>>>        * Peter Maydell (address@hidden) wrote:
>>>>>>>>>        > Recently I've noticed that test-filter-mirror has been hanging
>>>>>>>>>        > intermittently, typically when run on some other TCG architecture.
>>>>>>>>>        > In the instance I've just looked at, this was with s390x guest on
>>>>>>>>>        > x86-64 host, though I've also seen it on other host archs and
>>>>>>>>>        > perhaps with other guests.
>>>>>>>>>
>>>>>>>>>        Watch out to see if you really do see it for other guests;
>>>>>>>>>        it carefully avoids using virtio-net to avoid vhost; but on s390x it
>>>>>>>>>        uses virtio-net-ccw - could that hit the vhost it was trying to avoid?
>>>>>>>>>
>>>>>>>>>        > Below is a backtrace, though it seems to be pretty unhelpful.
>>>>>>>>>        > Anybody got any theories ? Does the mirror test rely on dirty
>>>>>>>>>        > memory bitmaps like the migration test (which also hangs
>>>>>>>>>        > occasionally with TCG due to some bug I'm sure we've investigated
>>>>>>>>>        > in the past) ?
>>>>>>>>>
>>>>>>>>>        I don't think it relies on the CPU at all.
>>>>>>>>>     I have no idea about this currently, but Jason and I designed the
>>>>>>>>> test case.
>>>>>>>>> Add Jason: Have any comments about this ?
>>>>>>>> I can't reproduce this locally with s390x-softmmu. It looks to me the
>>>>>>>> test should be independent to any kinds of emulation. It should pass
>>>>>>>> when mainloop work.
>>>>>>> I've just seen a hang with ppc64 guest on s390x host, so it is
>>>>>>> indeed not specific to s390x guest (and so not specific to
>>>>>>> virtio-net either, since the ppc64 guest setup uses e1000).
>>>>>>>
>>>>>>> thanks
>>>>>>> -- PMM
>>>>>> Finally reproduced locally after hundreds (sometimes thousands) times of
>>>>>> running.
>>>>>>
>>>>>> Bisection points to OOB monitor[1].
>>>>>>
>>>>>> It looks to me after OOB is used unconditionally we lose a barrier to make
>>>>>> sure socket is connected before sending packets in test-filter-mirror.c. Is
>>>>>> there any other similar and simple thing that we could do to kick the
>>>>>> mainloop?
>>>>> Do you mean the:
>>>>>
>>>>>       /* send a qmp command to guarantee that 'connected' is setting to true. */
>>>>>       qmp_discard_response(qts, "{ 'execute' : 'query-status'}");
>>>>
>>>> Yes.
>>>>
>>>>
>>>>> why was that ever sufficient to know the socket was ready?
>>>>
>>>> It was suggested by Fam, I don't remember the details. Can we make sure all
>>>> pending events has been processed (UNIX socket was set to connected) after
>>>> query-status is returned with an non OOB monitor?
>>> I'm not sure - it doesn't sound like a 'query-status' should ensure
>>> anything else.
>>> How about something like a 'query-chardev' - can that tell you what you
>>> need and loop until it's ready?
>> Yeah it sounds hacky to use "query status" to make sure a specific
>> chardev is connected even before the OOB...
>
>
> Probably, but anyway it works before OOB.

I don't doubt it worked.  Relying on inappropriate assumptions always
works just fine right until the assumptions become invalid :)
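
For illustration, the "loop until it's ready" idea suggested above could be
sketched roughly as below. This is only a minimal sketch, not the actual fix:
wait_until_ready(), ready_probe_fn, struct fake_chardev and fake_probe() are
hypothetical names made up for this example, and the fake probe merely stands
in for a real 'query-chardev' round trip. Which field of that reply would
actually indicate readiness is left open here and would need checking against
the QAPI schema.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical probe callback: returns true once the resource is ready.
 * In the real test this would issue a QMP query and inspect the reply
 * (an assumption; no QEMU API is used in this sketch).
 */
typedef bool (*ready_probe_fn)(void *opaque);

/*
 * Poll explicitly for the condition we care about, instead of relying on
 * an unrelated command as an implicit barrier.  Gives up after max_tries
 * attempts so a broken setup fails instead of hanging forever.
 * Returns true if the probe reported ready, false on timeout.
 */
static bool wait_until_ready(ready_probe_fn probe, void *opaque,
                             unsigned max_tries)
{
    assert(probe != NULL);

    for (unsigned i = 0; i < max_tries; i++) {
        if (probe(opaque)) {
            return true;
        }
        /* A real test would sleep briefly between attempts here. */
    }
    return false;
}

/* Stand-in for a chardev that becomes connected after a few polls. */
struct fake_chardev {
    unsigned calls_until_connected;
};

static bool fake_probe(void *opaque)
{
    struct fake_chardev *c = opaque;

    if (c->calls_until_connected == 0) {
        return true;
    }
    c->calls_until_connected--;
    return false;
}
```

The point of the bounded loop is that the test waits on the state it actually
depends on, and a setup that never becomes ready shows up as a failure rather
than a hang.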

[...]


