[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Plugin Memory Callback Debugging
From: |
Aaron Lindsay |
Subject: |
Re: Plugin Memory Callback Debugging |
Date: |
Tue, 29 Nov 2022 15:37:51 -0500 |
On Nov 22 10:57, Aaron Lindsay wrote:
> On Nov 21 18:22, Richard Henderson wrote:
> > On 11/21/22 13:51, Alex Bennée wrote:
> > >
> > > Aaron Lindsay <aaron@os.amperecomputing.com> writes:
> > >
> > > > On Nov 15 22:36, Alex Bennée wrote:
> > > > > Aaron Lindsay <aaron@os.amperecomputing.com> writes:
> > > > > > I believe the code *should* always reset `cpu->plugin_mem_cbs` to
> > > > > > NULL at the
> > > > > > end of an instruction/TB's execution, so its not exactly clear to
> > > > > > me how this
> > > > > > is occurring. However, I suspect it may be relevant that we are
> > > > > > calling
> > > > > > `free_dyn_cb_arr()` because my plugin called `qemu_plugin_reset()`.
> > > > >
> > > > > Hmm I'm going to have to remind myself about how this bit works.
> > > >
> > > > When is it expected that cpu->plugin_mem_cbs is reset to NULL if it is
> > > > set for an instruction? Is it guaranteed it is reset by the end of the
> > > > tb?
> > >
> > > It should be by the end of the instruction. See
> > > inject_mem_disable_helper() which inserts TCG code to disable the
> > > helpers. We also have plugin_gen_disable_mem_helpers() which should
> > > catch every exit out of a block (exit_tb, goto_tb, goto_ptr). That is
> > > why qemu_plugin_disable_mem_helpers() is only really concerned about
> > > when we longjmp out of the loop.
> > >
> > > > If I were to put an assertion in cpu_tb_exec() just after the call
> > > > to tcg_qemu_tb_exec(), should cpu->plugin_mem_cbs always be NULL
> > > > there?
> > >
> > > Yes I think so.
> >
> > Indeed.
>
> Well, the good news is that if this is an assumption we're relying on, it is
> now trivial to reproduce the problem!
>
> Compile some simple program (doesn't really matter, the issue gets triggered
> early):
>
> $ echo "int main() { return 0; }" > simple.c && gcc simple.c -o simple
>
> Make this change to cpu_tb_exec():
>
> > diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> > index 356fe348de..50a010327d 100644
> > --- a/accel/tcg/cpu-exec.c
> > +++ b/accel/tcg/cpu-exec.c
> > @@ -436,6 +436,9 @@ cpu_tb_exec(CPUState *cpu, TranslationBlock *itb, int
> > *tb_exit)
> >
> > qemu_thread_jit_execute();
> > ret = tcg_qemu_tb_exec(env, tb_ptr);
> > + if (cpu->plugin_mem_cbs != NULL) {
> > + g_assert_not_reached();
> > + }
> > cpu->can_do_io = 1;
> > /*
> > * TODO: Delay swapping back to the read-write region of the TB
>
> And run:
>
> $ ./build/qemu-aarch64 -plugin contrib/plugins/libexeclog.so -d plugin
> ./simple
>
> You should fairly quickly see something like:
>
> > [snip]
> > 0, 0x5502814d04, 0xb4000082, ""
> > 0, 0x5502814d08, 0xf9400440, "", load, 0x5502844ed0
> > 0, 0x5502814d0c, 0xf1001c1f, ""
> > **
> > ERROR:../accel/tcg/cpu-exec.c:440:cpu_tb_exec: code should not be reached
> > Bail out! ERROR:../accel/tcg/cpu-exec.c:440:cpu_tb_exec: code should not be
> > reached
>
> When digging through my other failure in `rr` I saw the cpu->plugin_mem_cbs
> pointer changing from one non-null value to another (which also seems to
> indicate it is not being cleared between instructions).
>
> Does this hint that there are cases where reset cpu->plugin_mem_cbs to NULL is
> getting optimized away, but not the code to set it in the first place?
Is there anyone who could help take a look at this from the code gen
perspective?
-Aaron
Re: Plugin Memory Callback Debugging, Emilio Cota, 2022/11/16