[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix do_kvm_cpu_synchronize_sta
From: |
Bhushan Bharat-R65777 |
Subject: |
Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix do_kvm_cpu_synchronize_state data integrity issue |
Date: |
Fri, 4 Jan 2013 08:57:29 +0000 |
> -----Original Message-----
> From: Alexander Graf [mailto:address@hidden
> Sent: Friday, January 04, 2013 2:11 PM
> To: Bhushan Bharat-R65777
> Cc: Marcelo Tosatti; address@hidden; Christian Borntraeger; Anthony
> Liguori; address@hidden qemu-devel
> Subject: Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix
> do_kvm_cpu_synchronize_state data integrity issue
>
>
> On 04.01.2013, at 05:22, Bhushan Bharat-R65777 wrote:
>
> >
> >
> >> -----Original Message-----
> >> From: Marcelo Tosatti [mailto:address@hidden
> >> Sent: Friday, January 04, 2013 7:08 AM
> >> To: Alexander Graf
> >> Cc: address@hidden; Christian Borntraeger; Anthony
> >> Liguori; qemu- address@hidden qemu-devel; Bhushan Bharat-R65777
> >> Subject: Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix
> >> do_kvm_cpu_synchronize_state data integrity issue
> >>
> >> On Thu, Jan 03, 2013 at 08:09:22PM +0100, Alexander Graf wrote:
> >>>
> >>> On 03.01.2013, at 19:48, Jason J. Herne wrote:
> >>>
> >>>> On 01/03/2013 08:56 AM, Alexander Graf wrote:
> >>>>>> static void do_kvm_cpu_synchronize_state(void *_args)
> >>>>>>> {
> >>>>>>> struct kvm_cpu_syncstate_args *args = _args;
> >>>>>>> + CPUArchState *env = args->env;
> >>>>>>> + int register_level = args->register_level;
> >>>>>>>
> >>>>> This probably becomes more readable if we explicitly revert back
> >>>>> to
> >> unsynced state first:
> >>>>>
> >>>>> /* Write back local modifications at our current level */ if
> >>>>> (register_level > env->kvm_vcpu_dirty) {
> >>>>> kvm_arch_put_registers(...);
> >>>>> env->kvm_vcpu_dirty = 0;
> >>>>> }
> >>>>>
> >>>>> and then do the sync we are requested to do:
> >>>>>
> >>>>> if (!env->kvm_vcpu_dirty) {
> >>>>> ...
> >>>>> }
> >>>>
> >>>> I agree, but only if we add a second conditional to the if 1st
> >>>> statement as
> >> such:
> >>>>
> >>>> if (args->env->kvm_vcpu_dirty && register_level >
> >>>> env->kvm_vcpu_dirty)
> >>>>
> >>>> This is to cover the case where the caller is asking for register level
> >>>> "1"
> >> and we're already dirty at level "2". In this case, nothing should
> >> happen and we'll need the "args->env->kvm_vcpu_dirty" to ensure that is the
> case.
> >>>
> >>> As before, I'd prefer to make this explicit:
> >>>
> >>>>
> >>>> static void do_kvm_cpu_synchronize_state(void *_args) {
> >>>> struct kvm_cpu_syncstate_args *args = _args;
> >>>> CPUArchState *env = args->env;
> >>>> int register_level = args->register_level;
> >>>
> >>> if (register_level > env->kvm_vcpu_dirty) {
> >>> /* We are more dirty than we need to - all is well */
> >>> return;
> >>> }
> >>>
> >>>>
> >>>> /* Write back local modifications at our current level */
> >>>> if (args->env->kvm_vcpu_dirty && register_level > env->kvm_vcpu_dirty)
> >>>> {
> >>>> kvm_arch_put_registers(env, env->kvm_vcpu_dirty);
> >>>> env->kvm_vcpu_dirty = 0;
> >>>> }
> >>>>
> >>>> if (!args->env->kvm_vcpu_dirty) {
> >>>> kvm_arch_get_registers(env, register_level);
> >>>> env->kvm_vcpu_dirty = register_level;
> >>>> }
> >>>> }
> >>>>
> >>>> Do you agree? Thanks for your time. :)
> >>>
> >>> Please also check out the discussions I've had with Bharat about his
> >>> watchdog
> >> patches. There we need a mechanism to synchronize registers only when
> >> we actually need to, in order to avoid potential race conditions with
> >> a kernel timer.
> >>>
> >>> That particular case doesn't work well with levels. We can have
> >>> multiple
> >> different potential race producers in the kernel that we need to
> >> avoid individually, so we can't always synchronize all of them when
> >> only one of them needs to be synchronized.
> >>>
> >>> The big question is what we should be doing about this. We basically
> >>> have 3
> >> options:
> >>>
> >>> * implement levels, treat racy registers as "manually
> >>> synchronized", as
> >> Bharat's latest patch set does
> >>> * implement levels, add a bitmap for additional special
> >>> synchronization bits
> >>> * replace levels by bitmap
> >>>
> >>> I'm quite frankly not sure which one of the 3 would be the best way
> >>> forward.
> >>>
> >>>
> >>> Alex
> >>
> >> Implement read/write accessors for individual registers, similarly to
> >> how its done in the kernel. See kvm_cache_regs.h.
> >
> > Read/write for individual registers can be heavy for cases where multiple
> register needed to be updated. Is not it?
>
> It depends. We can still have classes of registers to synchronize that we
> could
> even synchronize using MANY_REG. For writes, things become even faster with
> individual accessors since we can easily coalesce all register writes until we
> hit the vcpu again into a MANY_REG array and write them in one go.
>
> >
> > For cases where multiple register needed to be synchronized then I would
> > like
> to elaborate the options as:
> >
> > Option 1:
> > int timer_func(CPUArchState env)
> > {
> > cpu_synchronize_state(env);
> > //update env->timer_registers
> > env->updated_regs_type |= TIMER_REGS_TYPE_BIT;
>
> To scale this one, it's probably also more clever to swap the logic:
>
> env->kvm_sync_extra |= SYNC_TIMER_REGS;
> cpu_synchronize_state(env); /* gets timer regs */
> /* update env->timer_registers */
> /* timer regs will be synchronized later because kvm_sync_extra has
> SYNC_TIMER_REGS set */
>
> Your case right now is special in that we don't need to get any register
> state,
> but only write it. However, if we do
>
> cpu_synchronize_state(env); /* will not get timer regs */
> env->kvm_sync_extra |= SYNC_TIMER_REGS;
>
> then the next
>
> cpu_synchronize_state(env);
Currently the cpu_sychronize_state() will fetch registers only if "
!env->kvm_vcpu_dirty " . and on first cpu_synchromize_state it is set dirty. I
want same logic to continue with under this option.
Once the registered, the rest of code can keep on changing registers and
setting respective bitmap in env->update_reg_type. Then finally on
kvm_arch_put_register() will check all bits in env->update_reg_type for
SET_SREGS ioctl.
-Bharat
>
> would overwrite the timer values again. So we would also need to indicate that
> the bits are dirty:
>
> cpu_synchronize_state(env); /* will not get timer regs */
> env->kvm_sync_extra |= SYNC_TIMER_REGS;
> env->kvm_sync_dirty |= SYNC_TIMER_REGS;
>
> which cpu_synchronize_state() could use to make sure it doesn't read back any
> timer regs again. It would obviously also have to set it in its normal
> operation
> when it executed first ;).
>
> > }
> >
> > Change the kvm_arch_put_registers() to add followings:
> >
> > int kvm_arch_put_registers(CPUArchState *env, int level) {
> >
> > If (env->updated_regs_type) {
> > struct kvm_sregs sregs;
> > If (env->updated_regs_type & TIMER_REGS_TYPE_BIT) {
> > // update sregs->timer_registers;
> > // may be we can have a bitmap to tell kernel for what
> actually updated
> > }
> > If (env->updated_regs_type & XX_CPU_REGS_TYPE) {
> > // update respective registers in sregs
> > }
> >
> > ioctl(env, KVM_GET_SREGS, &sregs);
> > }
> > }
> >
> > This mechanism will get all registers state but this is required only once
> > per
> entry in QEMU.
>
> I don't understand this bit.
>
> >
> >
> > Option 2:
> > Define read_regs_type() and write_regs_type() for cases where it requires
> multiple registers updates:
> >
> > int read_regs_type((CPUArchState env, void *regs_ptr, int reg_type) {
> > -> reg_type are bitmap like: TIMER_REGS, DEBUG_REGS, PERFMON_REGS etc.
> >
> > Switch(reg_type) {
> > Case TIMER_REGS:
> > Struct *timer_regs = regs_ptr;
> > Ioctl(KVM_GET_SREGS, timer_regs, CPU_reg_type);
> > Break;
> >
> > Default:
> > Printf(error);
> > }
> >
> > Similarly will be the write function:
> > int write_regs_type((CPUArchState env, int reg_type) {
> > -> reg_type are bitmap like: TIMER_REGS, DEBUG_REGS, PERFMON_REGS etc.
> >
> > Switch(reg_type) {
> > Case TIMER_REGS:
> > Struct timer_regs;
> > Timer_regs.reg1 = env->timer_regs->reg1;
> > Timer_regs.reg2 = env->timer_regs->reg2;
> > Ioctl(env, KVM_SET_SREGS, &timer_regs, CPU_reg_type);
> > Break;
> >
> > Default:
> > Printf(error);
> > }
> >
> > Int timer_func(CPUxxState env)
> > {
> > Struct timer_regs;
> > read_regs_type((env, &timer_regs,TIMER_REGS);
> > //update env->timer_registers
> > Write_regs_type(env, TIMER_REGS)
> > }
> >
> > This will get one type of register_types and can cause multiple IOCTL per
> entry to QEMU.
>
> This is still too explicit. How about
>
> static inline void ppc_set_tsr(CPUState *env, target_ulong val) {
> env->kvm_sync_extra |= SYNC_TIMER_REGS;
> cpu_synchronize_registers(env);
> env->spr[SPR_TSR] = val;
> }
>
> static inline void ppc_set_tcr(CPUState *env, target_ulong val) {
> env->kvm_sync_extra |= SYNC_TIMER_REGS;
> cpu_synchronize_registers(env);
> env->spr[SPR_TCR] = val;
> }
>
> int timer_func(CPUState *env) {
> ppc_set_tsr(env, 0);
> ppc_set_tcr(env, 0);
> }
>
> The main benefit from this method is that we can do a simple search/replace to
> get rid of essentially all synchronization pitfalls where we might miss out on
> some code path that then doesn't synchronize its register set.
>
> > -------
> >
> > Keep on using GET/SET_ONE_REG when only one registers needed to be updated.
>
> We can always use ONE_REG for any of the variants.
>
> The main question I have is whether it makes sense to move from
>
> /* state subset only touched by the VCPU itself during runtime */
> #define KVM_PUT_RUNTIME_STATE 1
> /* state subset modified during VCPU reset */
> #define KVM_PUT_RESET_STATE 2
> /* full state set, modified during initialization or on vmload */
> #define KVM_PUT_FULL_STATE 3
>
> to a bitmap. So we would always only synchronize KVM_PUT_RUNTIME_STATE, but
> keep
> a bitmap as explained above around for any state that is more elaborate. For
> easy conversion, we could even add bits for SYNC_RESET and SYNC_FULL =
> (SYNC_RESET | SYNC_FULL_REAL).
>
> That should get rid of all the level based logic, giving us a lot more
> flexibility on what we want to synchronize.
>
>
> Alex
>
- Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix do_kvm_cpu_synchronize_state data integrity issue, Alexander Graf, 2013/01/03
- Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix do_kvm_cpu_synchronize_state data integrity issue, Jason J. Herne, 2013/01/03
- Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix do_kvm_cpu_synchronize_state data integrity issue, Alexander Graf, 2013/01/03
- Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix do_kvm_cpu_synchronize_state data integrity issue, Bhushan Bharat-R65777, 2013/01/04
- Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix do_kvm_cpu_synchronize_state data integrity issue, Alexander Graf, 2013/01/04
- Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix do_kvm_cpu_synchronize_state data integrity issue, Bhushan Bharat-R65777, 2013/01/04
- Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix do_kvm_cpu_synchronize_state data integrity issue, Alexander Graf, 2013/01/04
- Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix do_kvm_cpu_synchronize_state data integrity issue, Bhushan Bharat-R65777, 2013/01/04
- Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix do_kvm_cpu_synchronize_state data integrity issue, Alexander Graf, 2013/01/04
- Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix do_kvm_cpu_synchronize_state data integrity issue, Jason J. Herne, 2013/01/04
- Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix do_kvm_cpu_synchronize_state data integrity issue, Alexander Graf, 2013/01/04
- Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix do_kvm_cpu_synchronize_state data integrity issue, Jason J. Herne, 2013/01/04