qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix do_kvm_cpu_synchronize_sta


From: Bhushan Bharat-R65777
Subject: Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix do_kvm_cpu_synchronize_state data integrity issue
Date: Fri, 4 Jan 2013 10:32:19 +0000


> -----Original Message-----
> From: Alexander Graf [mailto:address@hidden
> Sent: Friday, January 04, 2013 3:59 PM
> To: Bhushan Bharat-R65777
> Cc: Marcelo Tosatti; address@hidden; Christian Borntraeger; Anthony
> Liguori; address@hidden qemu-devel
> Subject: Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix
> do_kvm_cpu_synchronize_state data integrity issue
> 
> 
> On 04.01.2013, at 11:23, Bhushan Bharat-R65777 wrote:
> 
> >
> >
> >> -----Original Message-----
> >> From: Alexander Graf [mailto:address@hidden
> >> Sent: Friday, January 04, 2013 2:11 PM
> >> To: Bhushan Bharat-R65777
> >> Cc: Marcelo Tosatti; address@hidden; Christian
> >> Borntraeger; Anthony Liguori; address@hidden qemu-devel
> >> Subject: Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix
> >> do_kvm_cpu_synchronize_state data integrity issue
> >>
> >>
> >> On 04.01.2013, at 05:22, Bhushan Bharat-R65777 wrote:
> >>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Marcelo Tosatti [mailto:address@hidden
> >>>> Sent: Friday, January 04, 2013 7:08 AM
> >>>> To: Alexander Graf
> >>>> Cc: address@hidden; Christian Borntraeger; Anthony
> >>>> Liguori; qemu- address@hidden qemu-devel; Bhushan Bharat-R65777
> >>>> Subject: Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix
> >>>> do_kvm_cpu_synchronize_state data integrity issue
> >>>>
> >>>> On Thu, Jan 03, 2013 at 08:09:22PM +0100, Alexander Graf wrote:
> >>>>>
> >>>>> On 03.01.2013, at 19:48, Jason J. Herne wrote:
> >>>>>
> >>>>>> On 01/03/2013 08:56 AM, Alexander Graf wrote:
> >>>>>>>> static void do_kvm_cpu_synchronize_state(void *_args)
> >>>>>>>>> {
> >>>>>>>>>   struct kvm_cpu_syncstate_args *args = _args;
> >>>>>>>>> +    CPUArchState *env = args->env;
> >>>>>>>>> +    int register_level = args->register_level;
> >>>>>>>>>
> >>>>>>> This probably becomes more readable if we explicitly revert back
> >>>>>>> to
> >>>> unsynced state first:
> >>>>>>>
> >>>>>>> /* Write back local modifications at our current level */ if
> >>>>>>> (register_level > env->kvm_vcpu_dirty) {
> >>>>>>>   kvm_arch_put_registers(...);
> >>>>>>>   env->kvm_vcpu_dirty = 0;
> >>>>>>> }
> >>>>>>>
> >>>>>>> and then do the sync we are requested to do:
> >>>>>>>
> >>>>>>> if (!env->kvm_vcpu_dirty) {
> >>>>>>>   ...
> >>>>>>> }
> >>>>>>
> >>>>>> I agree, but only if we add a second conditional to the if 1st
> >>>>>> statement  as
> >>>> such:
> >>>>>>
> >>>>>> if (args->env->kvm_vcpu_dirty && register_level >
> >>>>>> env->kvm_vcpu_dirty)
> >>>>>>
> >>>>>> This is to cover the case where the caller is asking for register level
> "1"
> >>>> and we're already dirty at level "2". In this case, nothing should
> >>>> happen and we'll need the "args->env->kvm_vcpu_dirty" to ensure
> >>>> that is the
> >> case.
> >>>>>
> >>>>> As before, I'd prefer to make this explicit:
> >>>>>
> >>>>>>
> >>>>>> static void do_kvm_cpu_synchronize_state(void *_args) {  struct
> >>>>>> kvm_cpu_syncstate_args *args = _args;  CPUArchState *env =
> >>>>>> args->env;  int register_level = args->register_level;
> >>>>>
> >>>>> if (register_level > env->kvm_vcpu_dirty) {
> >>>>>   /* We are more dirty than we need to - all is well */
> >>>>>   return;
> >>>>> }
> >>>>>
> >>>>>>
> >>>>>>  /* Write back local modifications at our current level */  if
> >>>>>> (args->env->kvm_vcpu_dirty && register_level > env->kvm_vcpu_dirty) {
> >>>>>>      kvm_arch_put_registers(env, env->kvm_vcpu_dirty);
> >>>>>>      env->kvm_vcpu_dirty = 0;
> >>>>>>  }
> >>>>>>
> >>>>>>  if (!args->env->kvm_vcpu_dirty) {
> >>>>>>      kvm_arch_get_registers(env, register_level);
> >>>>>>      env->kvm_vcpu_dirty = register_level;  } }
> >>>>>>
> >>>>>> Do you agree?  Thanks for your time. :)
> >>>>>
> >>>>> Please also check out the discussions I've had with Bharat about
> >>>>> his watchdog
> >>>> patches. There we need a mechanism to synchronize registers only
> >>>> when we actually need to, in order to avoid potential race
> >>>> conditions with a kernel timer.
> >>>>>
> >>>>> That particular case doesn't work well with levels. We can have
> >>>>> multiple
> >>>> different potential race producers in the kernel that we need to
> >>>> avoid individually, so we can't always synchronize all of them when
> >>>> only one of them needs to be synchronized.
> >>>>>
> >>>>> The big question is what we should be doing about this. We
> >>>>> basically have 3
> >>>> options:
> >>>>>
> >>>>> * implement levels, treat racy registers as "manually
> >>>>> synchronized", as
> >>>> Bharat's latest patch set does
> >>>>> * implement levels, add a bitmap for additional special
> >>>>> synchronization bits
> >>>>> * replace levels by bitmap
> >>>>>
> >>>>> I'm quite frankly not sure which one of the 3 would be the best way
> forward.
> >>>>>
> >>>>>
> >>>>> Alex
> >>>>
> >>>> Implement read/write accessors for individual registers, similarly
> >>>> to how its done in the kernel. See kvm_cache_regs.h.
> >>>
> >>> Read/write for individual registers can be heavy for cases where
> >>> multiple
> >> register needed to be updated. Is not it?
> >>
> >> It depends. We can still have classes of registers to synchronize
> >> that we could even synchronize using MANY_REG. For writes, things
> >> become even faster with individual accessors since we can easily
> >> coalesce all register writes until we hit the vcpu again into a MANY_REG
> array and write them in one go.
> >>
> >>>
> >>> For cases where multiple register needed to be synchronized then I
> >>> would like
> >> to elaborate the options as:
> >>>
> >>> Option 1:
> >>> int timer_func(CPUArchState env)
> >>> {
> >>>   cpu_synchronize_state(env);
> >>>   //update env->timer_registers
> >>>   env->updated_regs_type |= TIMER_REGS_TYPE_BIT;
> >>
> >> To scale this one, it's probably also more clever to swap the logic:
> >>
> >>    env->kvm_sync_extra |= SYNC_TIMER_REGS;
> >>    cpu_synchronize_state(env); /* gets timer regs */
> >>    /* update env->timer_registers */
> >>    /* timer regs will be synchronized later because kvm_sync_extra
> >> has SYNC_TIMER_REGS set */
> >>
> >> Your case right now is special in that we don't need to get any
> >> register state, but only write it. However, if we do
> >>
> >>    cpu_synchronize_state(env); /* will not get timer regs */
> >>    env->kvm_sync_extra |= SYNC_TIMER_REGS;
> >>
> >> then the next
> >>
> >>    cpu_synchronize_state(env);
> >>
> >> would overwrite the timer values again. So we would also need to
> >> indicate that the bits are dirty:
> >>
> >>    cpu_synchronize_state(env); /* will not get timer regs */
> >>    env->kvm_sync_extra |= SYNC_TIMER_REGS;
> >>    env->kvm_sync_dirty |= SYNC_TIMER_REGS;
> >>
> >> which cpu_synchronize_state() could use to make sure it doesn't read
> >> back any timer regs again.
> >
> > yes
> >
> >> It would obviously also have to set it in its normal operation when
> >> it executed first ;).
> >
> > We also need get all registers like for debugging and we should have flag 
> > like
> ALL_REGS etc.
> 
> Well, we could just take the current level of "general purpose registers" as 
> one
> flag.
> 
> >
> > But I general I agree with the idea.
> >
> >>
> >>> }
> >>>
> >>> Change the kvm_arch_put_registers() to add followings:
> >>>
> >>> int kvm_arch_put_registers(CPUArchState *env, int level) {
> >>>
> >>>   If (env->updated_regs_type) {
> >>>           struct kvm_sregs sregs;
> >>>           If (env->updated_regs_type & TIMER_REGS_TYPE_BIT) {
> >>>                   // update sregs->timer_registers;
> >>>                   // may be we can have a bitmap to tell kernel for what
> >> actually updated
> >>>           }
> >>>           If (env->updated_regs_type & XX_CPU_REGS_TYPE) {
> >>>                   // update respective registers in sregs
> >>>           }
> >>>
> >>>           ioctl(env, KVM_GET_SREGS, &sregs);
> >>>   }
> >>> }
> >>>
> >>> This mechanism will get all registers state but this is required
> >>> only once per
> >> entry in QEMU.
> >>
> >> I don't understand this bit
> >>
> >>>
> >>>
> >>> Option 2:
> >>> Define read_regs_type() and write_regs_type() for cases where it
> >>> requires
> >> multiple registers updates:
> >>>
> >>> int read_regs_type((CPUArchState env, void *regs_ptr, int reg_type) {
> >>>    -> reg_type are bitmap like: TIMER_REGS, DEBUG_REGS, PERFMON_REGS etc.
> >>>
> >>>   Switch(reg_type) {
> >>>   Case TIMER_REGS:
> >>>           Struct *timer_regs = regs_ptr;
> >>>           Ioctl(KVM_GET_SREGS, timer_regs, CPU_reg_type);
> >>>           Break;
> >>>
> >>>   Default:
> >>>           Printf(error);
> >>> }
> >>>
> >>> Similarly will be the write function:
> >>> int write_regs_type((CPUArchState env, int reg_type) {
> >>>    -> reg_type are bitmap like: TIMER_REGS, DEBUG_REGS, PERFMON_REGS etc.
> >>>
> >>>   Switch(reg_type) {
> >>>   Case TIMER_REGS:
> >>>           Struct timer_regs;
> >>>           Timer_regs.reg1 = env->timer_regs->reg1;
> >>>           Timer_regs.reg2 = env->timer_regs->reg2;
> >>>           Ioctl(env, KVM_SET_SREGS, &timer_regs, CPU_reg_type);
> >>>           Break;
> >>>
> >>>   Default:
> >>>           Printf(error);
> >>> }
> >>>
> >>> Int timer_func(CPUxxState env)
> >>> {
> >>>   Struct timer_regs;
> >>>   read_regs_type((env, &timer_regs,TIMER_REGS);
> >>>   //update env->timer_registers
> >>>   Write_regs_type(env, TIMER_REGS)
> >>> }
> >>>
> >>> This will get one type of register_types and can cause multiple
> >>> IOCTL per
> >> entry to QEMU.
> >>
> >> This is still too explicit. How about
> >>
> >> static inline void ppc_set_tsr(CPUState *env, target_ulong val) {
> >>    env->kvm_sync_extra |= SYNC_TIMER_REGS;
> >>    cpu_synchronize_registers(env);
> >>    env->spr[SPR_TSR] = val;
> >
> > You also want env->kvm_sync_dirty also, right?
> 
> Not quite, since SYNC_TIMER_REGS spans more than only TSR. So we need to make
> sure we fetch the non-TSR register values before we can declare TIMER_REGS as
> dirty.

Right , I actually want communicate mark dirty the TIMER_REGS only :)

> 
> >
> >> }
> >>
> >> static inline void ppc_set_tcr(CPUState *env, target_ulong val) {
> >>    env->kvm_sync_extra |= SYNC_TIMER_REGS;
> >>    cpu_synchronize_registers(env);
> >>    env->spr[SPR_TCR] = val;
> >
> > Same as above
> >
> >> }
> >>
> >> int timer_func(CPUState *env) {
> >>    ppc_set_tsr(env, 0);
> >
> > SYNC_TIMERS_REGS are get only once on first call.
> 
> Sure, but that's implicit. A user of these functions doesn't have to care 
> which
> one comes first.

Agree

-Bharat

> 
> 
> Alex
> 





reply via email to

[Prev in Thread] Current Thread [Next in Thread]