qemu-devel

Re: [PATCH] replay: synchronize on every virtual timer callback


From: Alex Bennée
Subject: Re: [PATCH] replay: synchronize on every virtual timer callback
Date: Mon, 18 May 2020 16:56:36 +0100
User-agent: mu4e 1.4.6; emacs 28.0.50

Philippe Mathieu-Daudé <address@hidden> writes:

> + Alex
>
> On 5/6/20 10:17 AM, Pavel Dovgalyuk wrote:
>> Sometimes virtual timer callbacks depend on order
>> of virtual timer processing and warping of virtual clock.
>> Therefore every callback should be logged to make replay deterministic.
>> This patch creates a checkpoint before every virtual timer callback.
>> With these checkpoints virtual timers processing and clock warping
>> events order is completely deterministic.
>> Signed-off-by: Pavel Dovgalyuk <address@hidden>
>> ---
>>   util/qemu-timer.c |    5 +++++
>>   1 file changed, 5 insertions(+)
>> diff --git a/util/qemu-timer.c b/util/qemu-timer.c
>> index d548d3c1ad..47833f338f 100644
>> --- a/util/qemu-timer.c
>> +++ b/util/qemu-timer.c
>> @@ -588,6 +588,11 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
>>           qemu_mutex_lock(&timer_list->active_timers_lock);
>>             progress = true;
>> +        /*
>> +         * Callback may insert new checkpoints, therefore add new checkpoint
>> +         * for the virtual timers.
>> +         */
>> +        need_replay_checkpoint = timer_list->clock->type == QEMU_CLOCK_VIRTUAL;
>>       }
>>       qemu_mutex_unlock(&timer_list->active_timers_lock);

So the problem I have with this, as with all the record/replay stuff I
want to review, is that it's very hard to see things in action. I added
a *very* basic record/replay test to the aarch64 softmmu tests but it
won't exercise any of this code because no timers get fired. I'm
assuming the sort of test that is really needed is one that not only
causes QEMU_CLOCK_VIRTUAL timers to fire and trigger logged HW events
but also ensures that things don't get confused in the process.

If I read up the file I just get more questions than answers. For
example why do we release the qemu_timers lock before processing the
replay event? Is it that the replay event could cause another timer to
be consumed? That seems suspect to me given we should only be expiring
timers in the run loop.
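
To make the question concrete, here is a minimal standalone sketch of
the general pattern of dropping a list lock before calling out, so that
the callee can take the same lock again (for example to re-arm a
timer). It uses plain pthreads, not QEMU code, and every Demo*/demo_*
name is made up for illustration. Whether this is the actual reason in
qemu-timer.c is exactly what I can't tell from reading the file.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* All Demo* names are hypothetical; this is not QEMU's timer code. */
typedef struct DemoTimer DemoTimer;
typedef struct DemoTimerList DemoTimerList;

struct DemoTimer {
    long expire;                /* absolute expiry time */
    void (*cb)(void *opaque);   /* called with the list lock dropped */
    void *opaque;
    DemoTimer *next;
};

struct DemoTimerList {
    pthread_mutex_t lock;       /* non-recursive, protects 'active' */
    DemoTimer *active;          /* sorted by ascending expiry */
};

/* Insert (or re-arm) a timer; takes the list lock itself. */
static void demo_timer_mod(DemoTimerList *l, DemoTimer *t, long expire)
{
    pthread_mutex_lock(&l->lock);
    t->expire = expire;
    DemoTimer **p = &l->active;
    while (*p && (*p)->expire <= expire) {
        p = &(*p)->next;
    }
    t->next = *p;
    *p = t;
    pthread_mutex_unlock(&l->lock);
}

/* Run every timer that has expired at 'now'. */
static bool demo_run_timers(DemoTimerList *l, long now)
{
    bool progress = false;

    pthread_mutex_lock(&l->lock);
    while (l->active && l->active->expire <= now) {
        DemoTimer *t = l->active;
        l->active = t->next;    /* unlink before calling out */
        t->next = NULL;

        /* Drop the lock: the callback may call demo_timer_mod(). */
        pthread_mutex_unlock(&l->lock);
        t->cb(t->opaque);
        pthread_mutex_lock(&l->lock);

        progress = true;
    }
    pthread_mutex_unlock(&l->lock);
    return progress;
}

/* A callback that re-arms its own timer until it has fired 3 times. */
typedef struct {
    DemoTimerList *list;
    DemoTimer timer;
    int fired;
} DemoState;

static void demo_rearm_cb(void *opaque)
{
    DemoState *s = opaque;
    if (++s->fired < 3) {
        demo_timer_mod(s->list, &s->timer, s->timer.expire + 10);
    }
}

/* Arm one timer at t=10, run the list at t=100, report the fire count. */
static int demo_scenario(void)
{
    DemoTimerList list = { .active = NULL };
    DemoState s = { .list = &list, .fired = 0 };

    pthread_mutex_init(&list.lock, NULL);
    s.timer.cb = demo_rearm_cb;
    s.timer.opaque = &s;
    demo_timer_mod(&list, &s.timer, 10);
    demo_run_timers(&list, 100);
    pthread_mutex_destroy(&list.lock);
    return s.fired;
}
```

If demo_run_timers() kept the lock held across t->cb(), the re-arm in
demo_rearm_cb() would try to lock a mutex its own thread already holds.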

Could the code be re-factored to use QEMU_LOCK_GUARD? It's hard to
know, and I really wouldn't want to try that re-factoring without some
sort of confidence that we were properly exercising the semantics of
record/replay and were alive to potential regressions.
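
For reference, a scope guard of that kind can be sketched in standalone
C with the GCC/Clang cleanup attribute, which as far as I know is the
mechanism QEMU_LOCK_GUARD is ultimately built on (via glib's
g_autoptr). The DEMO_LOCK_GUARD macro and the other demo_* names below
are hypothetical and are not QEMU's implementation.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical helpers; requires GCC or Clang for the cleanup attribute. */
static pthread_mutex_t *demo_lock(pthread_mutex_t *m)
{
    pthread_mutex_lock(m);
    return m;
}

/* Runs automatically when the guard variable goes out of scope. */
static void demo_unlock_cleanup(pthread_mutex_t **m)
{
    if (*m) {
        pthread_mutex_unlock(*m);
    }
}

/* Take the lock now, release it on every exit path from the scope. */
#define DEMO_LOCK_GUARD(m)                                           \
    pthread_mutex_t *demo_guard_                                     \
        __attribute__((unused, cleanup(demo_unlock_cleanup))) =      \
        demo_lock(m)

typedef struct {
    pthread_mutex_t lock;
    int pending;
} DemoQueue;

/* Consume one pending item; no explicit unlock on either return. */
static bool demo_take_one(DemoQueue *q)
{
    DEMO_LOCK_GUARD(&q->lock);

    if (q->pending == 0) {
        return false;   /* lock released by the guard */
    }
    q->pending--;
    return true;        /* lock released by the guard */
}
```

A guard like this holds the lock until the end of the enclosing scope,
so a loop that has to drop the lock part-way through each iteration
does not map onto it directly. That is part of why I would want test
coverage before attempting the conversion.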

Please realise I do like the concept of record/replay and I'd love to
get more features merged (like for example the reverse debug patches).
However, by its very nature it gets its fingers deeply intertwined with
the main run loop and we really need to better exercise the code in our
tests.

FWIW you can have an:

Acked-by: Alex Bennée <address@hidden>

which means it doesn't look obviously broken to me and it doesn't seem
to break the non-record/replay cases because that's all I can really
test.


-- 
Alex Bennée


