qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v1 01/21] accel/tcg: Change interrupt/exception handling to r


From: Robert Foley
Subject: Re: [PATCH v1 01/21] accel/tcg: Change interrupt/exception handling to remove implied BQL
Date: Fri, 7 Aug 2020 18:18:08 -0400

We found cases where a few of the *_cpu_exec_interrupt per arch functions
call CPU class's cc->do_interrupt function pointer (for example
arm_cpu_exec_interrupt does this).

This is an issue because *_cpu_exec_interrupt will hold the BQL across the
call to cc->do_interrupt, and *_do_interrupt will also hold the BQL.

Most of the arches do not have this issue because they call the *_do_interrupt
function for that arch directly, and in those cases we will change to call
the function *_do_interrupt_locked.

We see a few possible solutions to this:

1) We could go back to the option of conditionalizing the BQL inside
   these few *_do_interrupt functions, only acquiring the BQL if it is not
   already held.  I counted 3 different arches that directly use the
->do_interrupt
   function from their *_cpu_exec_interrupt.

2) Another perhaps cleaner option is to add a new cpu class function
->do_interrupt_locked.
   This lets callers like *_cpu_exec_interrupt call to ->do_interrupt_locked
   with lock held and solves the issue without resorting to conditional locking.

   Another benefit we could gain from this approach is to simplify our solution
   overall by adding a common do_interrupt function.

   void cpu_common_do_interrupt(CPUState *cs)
   {
        CPUClass *cc = CPU_GET_CLASS(cpu);
        qemu_mutex_lock_iothread();
        cc->do_interrupt_locked(cpu);
        qemu_mutex_unlock_iothread();
    }
   cc->do_interrupt would be set to cpu_common_do_interrupt by default
in cpu_class_init.
   In other words, the base cpu class would handle holding the BQL for us,
   and we would not need to implement a new *_do_interrupt function
for each arch.

We are thinking that 2) would be a good option.

What are the opinions on these possible solutions?  Or are there other
solutions that we should consider here?

Thanks & Regards,
-Rob


On Thu, 6 Aug 2020 at 16:04, Robert Foley <robert.foley@linaro.org> wrote:
>
> The comment around documenting the cpu_mutex fields and critical sections
> got us thinking and revisiting our locking assumptions in 
> cpu_handle_interrupt.
>
> Initially we were thinking that removing the BQL from cpu_handle_interrupt
> meant that we needed to replace it with the cpu mutex to protect the per cpu
> data that is accessed like interrupt_request.  We are reconsidering this and
> now thinking that the cpu mutex might not be needed here.
>
> BQL is clearly needed to protect the critical section around the call to
> ->cpu_exec_interrupt.  What else is the BQL protecting in cpu_handle_interrupt
> that we need to consider?  Are we missing anything here?
>
> It's also worth mentioning that we did give this approach a try.
> We tried out changes to cpu_handle_interrupt(),
> dropping the BQL from all but around ->cpu_exec_interrupt, and not using the
> cpu_mutex at all.  It seemed to be functional, passing all the tests that
> we tried (so far). :)
>
> Thanks,
> -Rob
>
> On Thu, 6 Aug 2020 at 12:11, Robert Foley <robert.foley@linaro.org> wrote:
> >
> > On Thu, 6 Aug 2020 at 05:22, Paolo Bonzini <pbonzini@redhat.com> wrote:
> > >
> > > On 05/08/20 21:18, Richard Henderson wrote:
> > > > On 8/5/20 11:12 AM, Robert Foley wrote:
> > > >> This change removes the implied BQL from the cpu_handle_interrupt,
> > > >> and cpu_handle_exception paths. This BQL acquire is being pushed
> > > >> down into the per arch implementation.
> > > >>
> > > >> Signed-off-by: Robert Foley <robert.foley@linaro.org>
> > > >> ---
> > > >>  accel/tcg/cpu-exec.c | 19 +++++++++++--------
> > > >>  1 file changed, 11 insertions(+), 8 deletions(-)
> > > >>
> > > >> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> > > >> index 80d0e649b2..8e2bfd97a1 100644
> > > >> --- a/accel/tcg/cpu-exec.c
> > > >> +++ b/accel/tcg/cpu-exec.c
> > > >> @@ -517,9 +517,7 @@ static inline bool cpu_handle_exception(CPUState 
> > > >> *cpu, int *ret)
> > > >>  #else
> > > >>          if (replay_exception()) {
> > > >>              CPUClass *cc = CPU_GET_CLASS(cpu);
> > > >> -            qemu_mutex_lock_iothread();
> > > >>              cc->do_interrupt(cpu);
> > > >> -            qemu_mutex_unlock_iothread();
> > > >>              cpu->exception_index = -1;
> > > >>
> > > >
> > > > This patch is not bisectable.  The removal of the lock here needs to 
> > > > happen at
> > > > the end, or something.
> > >
> > > Indeed the series should be structured like this:
> > >
> > > 1) rename all *_do_interrupt functions to *_do_interrupt_locked
> > >
> > > 2) add back *_do_interrupt that takes the BQL and calls
> > > *_do_interrupt_locked, point ->do_interrupt to it, remove the BQL from
> > > cpu-exec.c
> > >
> > > 3) modify the cpu_mutex and BQL critical sections around
> > > ->cpu_exec_interrupt, so that the BQL critical section covers just the
> > > call to ->cpu_exec_interrupt.  Document which fields are now covered by
> > > cpu_mutex.
> > >
> > > 4/5) same as 1/2 for ->cpu_exec_interrupt
> > >
> > > Patches 1/2 would be pretty large, but they're trivial to review just by
> > > grepping for "->do_interrupt\s*=", and likewise for 4/5.
> > >
> >
> > Thanks for the details !
> >
> > It seems like we will have 3 separate patches for this series, 1/2, 3, and 
> > 4/5.
> >
> > We will go in this direction.
> >
> > Thanks,
> > -Rob
> >
> > > Thanks,
> > >
> > > Paolo
> > >



reply via email to

[Prev in Thread] Current Thread [Next in Thread]