qemu-arm

Re: [Qemu-arm] [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC


From: Alex Bennée
Subject: Re: [Qemu-arm] [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC
Date: Thu, 02 Feb 2017 11:03:24 +0000
User-agent: mu4e 0.9.19; emacs 25.1.91.6

Peter Maydell <address@hidden> writes:

> On 27 January 2017 at 10:39, Alex Bennée <address@hidden> wrote:
>> Some helpers may trigger an immediate exit of the cpu_loop. If this
>> happens the PC needs to be rectified to ensure the restart will begin
>> on the next instruction.
>>
>> Signed-off-by: Alex Bennée <address@hidden>
>> ---
>>  target/arm/cpu.h           | 3 ++-
>>  target/arm/translate-a64.c | 4 ++++
>>  target/arm/translate.c     | 4 ++++
>>  3 files changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
>> index f56a96c675..1b0670ae11 100644
>> --- a/target/arm/cpu.h
>> +++ b/target/arm/cpu.h
>> @@ -1411,7 +1411,8 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
>>  #define ARM_CP_NZCV            (ARM_CP_SPECIAL | (3 << 8))
>>  #define ARM_CP_CURRENTEL       (ARM_CP_SPECIAL | (4 << 8))
>>  #define ARM_CP_DC_ZVA          (ARM_CP_SPECIAL | (5 << 8))
>> -#define ARM_LAST_SPECIAL       ARM_CP_DC_ZVA
>> +#define ARM_CP_EXIT_PC         (ARM_CP_SPECIAL | (6 << 8))
>> +#define ARM_LAST_SPECIAL       ARM_CP_EXIT_PC
>
> There's a comment above this list of defines that documents
> what all the flags mean; can you add an entry to it for the
> new flag?

Sure - given your comment below, maybe ARM_CP_SYNC_REGS is a better name?
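
As a reminder of why ARM_LAST_SPECIAL moves together with the new define, here is a minimal sketch of a validity check over the special-type encoding. The TOY_* names, the value of TOY_CP_SPECIAL and the flag mask are assumptions made for illustration and are not copied from cpu.h; only the (n << 8) numbering comes from the hunk above.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, for illustration only; the real ones live in cpu.h. */
#define TOY_CP_SPECIAL    1
#define TOY_CP_FLAG_MASK  0xff

/* Same numbering scheme as the quoted hunk: special type N is N << 8. */
#define TOY_CP_DC_ZVA     (TOY_CP_SPECIAL | (5 << 8))
#define TOY_CP_EXIT_PC    (TOY_CP_SPECIAL | (6 << 8))

/*
 * Hypothetical check: a type is accepted if it is made of flag bits only,
 * or if it is a special type numbered no higher than last_special.
 */
static bool toy_cptype_valid(int cptype, int last_special)
{
    if ((cptype & ~TOY_CP_FLAG_MASK) == 0) {
        return true;
    }
    return (cptype & TOY_CP_SPECIAL) &&
           (cptype & ~TOY_CP_FLAG_MASK) <= last_special;
}

void toy_cptype_demo(void)
{
    /* With the terminator bumped, the new type is accepted. */
    assert(toy_cptype_valid(TOY_CP_EXIT_PC, TOY_CP_EXIT_PC));
    /* With the old terminator, the same type would be rejected. */
    assert(!toy_cptype_valid(TOY_CP_EXIT_PC, TOY_CP_DC_ZVA));
}
```

Under these assumptions, any range check against the last special value silently rejects a new type unless the terminator is updated in the same patch.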

>
>>  /* Used only as a terminator for ARMCPRegInfo lists */
>>  #define ARM_CP_SENTINEL 0xffff
>>  /* Mask of only the flag bits in a type field */
>> diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
>> index 05162f335e..a3f37d8bec 100644
>> --- a/target/arm/translate-a64.c
>> +++ b/target/arm/translate-a64.c
>> @@ -1561,6 +1561,10 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
>>          tcg_rt = cpu_reg(s, rt);
>>          gen_helper_dc_zva(cpu_env, tcg_rt);
>>          return;
>> +    case ARM_CP_EXIT_PC:
>> +        /* The helper may exit the cpu_loop so ensure PC is correct */
>> +        gen_a64_set_pc_im(s->pc);
>> +        break;
>
> This will work, but it's a little odd because it breaks the
> existing invariant that cp helpers never throw exceptions
> (except in the access function).

We don't throw an exception, but we do exit the CPU loop, which has work
waiting for it.

>
> Does single-stepping (of the emulated architectural
> debug step, and gdbstub singlestep) work across one of
> these instructions?

I'll have to test, but I don't see why not. The instruction is fully
executed; we just ensure we have exited the run loop to process the flush
before we get to the next instruction.
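
To make the PC point concrete, here is a toy model of a helper that leaves the run loop with longjmp(). Everything in it (ToyCPU, toy_helper_exit_loop, toy_run_tb, the fixed 4-byte instruction size) is hypothetical and only mimics the shape of the problem; it is not QEMU code.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct ToyCPU {
    uint64_t pc;          /* PC the run loop will restart from */
    bool work_pending;    /* stands in for the queued flush */
    jmp_buf loop_env;     /* top of the toy run loop */
} ToyCPU;

/* Hypothetical helper: queues work and jumps straight back to the loop. */
static void toy_helper_exit_loop(ToyCPU *cpu)
{
    cpu->work_pending = true;
    longjmp(cpu->loop_env, 1);
}

/*
 * Run one toy block that starts at tb_start and contains the system
 * instruction at insn_addr. Returns the PC seen by the loop afterwards.
 */
static uint64_t toy_run_tb(ToyCPU *cpu, uint64_t tb_start,
                           uint64_t insn_addr, bool sync_pc)
{
    cpu->pc = tb_start;   /* PC is only known to be exact at block entry */
    cpu->work_pending = false;
    if (setjmp(cpu->loop_env) == 0) {
        if (sync_pc) {
            /* What the new case does: store the next PC before the helper. */
            cpu->pc = insn_addr + 4;
        }
        toy_helper_exit_loop(cpu);   /* does not return */
    }
    cpu->work_pending = false;       /* the loop processes the queued work */
    return cpu->pc;
}

void toy_exit_pc_demo(void)
{
    ToyCPU cpu;
    /* PC written first: execution resumes after the instruction. */
    assert(toy_run_tb(&cpu, 0x1000, 0x1008, true) == 0x100c);
    /* PC not written: the loop would restart from the stale block start. */
    assert(toy_run_tb(&cpu, 0x1000, 0x1008, false) == 0x1000);
}
```

In this model the instruction's effect (queuing the work) has already happened by the time the loop is re-entered, so the only state that needs fixing up is the restart address.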

> Should we also set dc->is_jmp to force ending the TB here?

Probably - there is no reason to keep translating as the next
instruction will be in its own block.
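
A rough sketch of what ending the TB at such an instruction looks like in a translation loop. The types and names (ToyDisasContext, TOY_DISAS_NEXT, TOY_DISAS_UPDATE, toy_translate_tb) are invented for this example, and instructions are assumed to be 4 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { TOY_DISAS_NEXT, TOY_DISAS_UPDATE };

typedef struct {
    uint64_t pc;     /* address of the next instruction to translate */
    int is_jmp;      /* anything other than TOY_DISAS_NEXT ends the block */
} ToyDisasContext;

/*
 * Translate up to n_insns instructions starting at start_pc.
 * insn_exits[i] marks an instruction whose helper may leave the run loop.
 * Returns the number of instructions placed in the block and stores the
 * address the following block would start at in *next_pc.
 */
static int toy_translate_tb(const bool *insn_exits, int n_insns,
                            uint64_t start_pc, uint64_t *next_pc)
{
    ToyDisasContext dc = { .pc = start_pc, .is_jmp = TOY_DISAS_NEXT };
    int count = 0;

    while (count < n_insns && dc.is_jmp == TOY_DISAS_NEXT) {
        bool exits = insn_exits[count];
        dc.pc += 4;
        count++;
        if (exits) {
            /* Stop here: the next instruction starts its own block. */
            dc.is_jmp = TOY_DISAS_UPDATE;
        }
    }
    *next_pc = dc.pc;
    return count;
}

void toy_translate_demo(void)
{
    const bool insns[4] = { false, true, false, false };
    uint64_t next_pc = 0;

    /* The block ends right after the second instruction. */
    assert(toy_translate_tb(insns, 4, 0x1000, &next_pc) == 2);
    assert(next_pc == 0x1008);
}
```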

> This is probably a question answered in the rest of the series,
> but why do we need the helper to be able to longjump out to the
> top level? Can't we just have the helper do its work and then
> end the TB with tcg_gen_exit_tb(0) so we return to the top level
> loop in the normal way?

Well, I guess this is a philosophical question. The cputlb API is
offering the guarantee that when an *_all_cpus_synced() flush is done
everything will be complete with respect to all vCPUs. This relies on
the source vCPU executing an exclusive piece of safe work, which ensures
all other vCPUs have halted and therefore will have run their safe work
before returning to execution.

If ARM wanted to, it could call the *_all_cpus() variant, schedule its
own exclusive safe work (a null function, as cputlb will have scheduled
the flush) and exit the TB in the usual way. In fact, this is the
mechanism ARM could use if it wanted to defer the sync point to a later
DMB instruction.
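
As a single-threaded toy model of the two flavours described above: a plain broadcast only queues the flush, while the synced form also has the source run exclusive work, by which point every other vCPU has drained its queue. All names here (ToyVCPU, toy_flush_all_cpus, toy_run_exclusive, toy_flush_all_cpus_synced) are made up for the sketch and do not match the cputlb API.

```c
#include <assert.h>

#define TOY_NR_CPUS 4

typedef struct {
    int pending_flushes;   /* queued safe work not yet run */
    int flushes_done;      /* safe work already run */
} ToyVCPU;

/* Broadcast variant: queue a flush on every vCPU and return at once. */
static void toy_flush_all_cpus(ToyVCPU *cpus)
{
    for (int i = 0; i < TOY_NR_CPUS; i++) {
        cpus[i].pending_flushes++;
    }
}

/* A vCPU runs its queued work when it next stops executing guest code. */
static void toy_run_pending(ToyVCPU *cpu)
{
    cpu->flushes_done += cpu->pending_flushes;
    cpu->pending_flushes = 0;
}

/*
 * Exclusive work on the source vCPU: modelled as every other vCPU
 * stopping and draining its queue before the source drains its own.
 */
static void toy_run_exclusive(ToyVCPU *cpus, int source)
{
    for (int i = 0; i < TOY_NR_CPUS; i++) {
        if (i != source) {
            toy_run_pending(&cpus[i]);
        }
    }
    toy_run_pending(&cpus[source]);
}

/* Synced variant: broadcast, then force the sync point immediately. */
static void toy_flush_all_cpus_synced(ToyVCPU *cpus, int source)
{
    toy_flush_all_cpus(cpus);
    toy_run_exclusive(cpus, source);
}

void toy_safe_work_demo(void)
{
    ToyVCPU cpus[TOY_NR_CPUS] = { { 0, 0 } };

    toy_flush_all_cpus(cpus);
    assert(cpus[1].pending_flushes == 1);   /* not yet complete */

    toy_flush_all_cpus_synced(cpus, 0);
    for (int i = 0; i < TOY_NR_CPUS; i++) {
        assert(cpus[i].pending_flushes == 0);
        assert(cpus[i].flushes_done == 2);
    }
}
```

The deferred variant would amount to calling toy_flush_all_cpus() at the TLB instruction and toy_run_exclusive() at the later barrier.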

I haven't implemented it yet as the flush stuff only comes up high in
the perf runs with my aggressive TLB flush microbenchmarks.

However, I'm wary of having a _synced() variant which will only work
correctly if the guest also does a bunch of other steps.

>
>>      default:
>>          break;
>>      }
>> diff --git a/target/arm/translate.c b/target/arm/translate.c
>> index 444a24c2b6..7bd18cd25d 100644
>> --- a/target/arm/translate.c
>> +++ b/target/arm/translate.c
>> @@ -7508,6 +7508,10 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
>>              gen_set_pc_im(s, s->pc);
>>              s->is_jmp = DISAS_WFI;
>>              return 0;
>> +        case ARM_CP_EXIT_PC:
>> +            /* The helper may exit the cpu_loop so ensure PC is correct */
>> +            gen_set_pc_im(s, s->pc);
>> +            break;
>
> Do we also need to gen_set_condexec() ?

Do we? This isn't an exception so we don't need to resolve the condition
flags as long as there is enough information preserved so the next TB
can resolve if it needs to.

I'm afraid my knowledge of the condition code is "it is deferred until
it is needed as it is an expensive calculation". I'll dig into the
implementation.

>
>>          default:
>>              break;
>>          }
>> --
>> 2.11.0
>
> thanks
> -- PMM


--
Alex Bennée


