Re: [Qemu-devel] [PATCH v2 0/8] nvdimm: hotplug support
From: Igor Mammedov
Subject: Re: [Qemu-devel] [PATCH v2 0/8] nvdimm: hotplug support
Date: Tue, 11 Oct 2016 14:32:23 +0200
On Mon, 10 Oct 2016 21:57:55 +0800
Xiao Guangrong <address@hidden> wrote:
> On 10/10/2016 08:59 PM, Igor Mammedov wrote:
> > On Sat, 8 Oct 2016 16:34:06 +0800
> > Xiao Guangrong <address@hidden> wrote:
> >
> >> On 10/03/2016 09:48 PM, Igor Mammedov wrote:
> >>> On Fri, 12 Aug 2016 14:54:02 +0800
> >>> Xiao Guangrong <address@hidden> wrote:
> >>>
> >>> The general design issue in this series is regenerating
> >>> the _FIT data every time inside the _FIT read loop.
> >>>
> >>> The issue here is that if the FIT data doesn't fit in one page,
> >>> RFIT would be called several times, resulting in calling
> >>> nvdimm_dsm_func_read_fit() N times, and in between
> >>> these N exits the generated FIT data could change
> >>> (for example if another NVDIMM has been hotplugged in between),
> >>> resulting in a corrupted concatenated FIT buffer returned by
> >>> the _FIT method.
> >>> So one needs either to block follow-up hotplug or to make sure
> >>> that the FIT data is regenerated/updated only once per _FIT
> >>> invocation and stays immutable until _FIT has completed
> >>> (maybe an RCU-protected buffer).
> >>>
> >
> >
> >>> Regenerating/updating inside QEMU could also be shared
> >>> with the NFIT table creation path, so that both cold- and hotplug
> >>> would reuse the same data, which is updated only when
> >>> an NVDIMM is added/removed.
> > Try to take this point into account so that the FIT data
> > isn't regenerated on every RFIT invocation.
>
> Okay.
>
> The reason the FIT was always regenerated was to avoid saving
> the FIT during live migration. Well, it seems we can hardly
> avoid that now. :)
>
> >
> >
> >> Explicitly syncing up QEMU and OSPM is hard, as OSPM runs
> >> in a VM, which is untrusted and cannot be expected to always
> >> trigger the correct sequence.
> >>
> >> So how about introducing a generation number which is updated
> >> on each hotplug event, plus a new DSM function reserved for
> >> OSPM to read this number. The ACPI code would then look like this:
> >>
> >> restart:
> >>     old_number = DSM(Read_NUM);
> >>     FIT = DSM(FIT);
> >>     new_number = DSM(Read_NUM);
> >>
> >>     if (old_number != new_number)
> >>         goto restart;
> >>
> >>     return FIT;
> > It's a bit complicated.
> >
>
> Hmm, how about this algorithm:
>
> struct fit_data {
>     u64 generation_number;
>     GArray *fit;
> };
>
> struct fit_data *fit;
> u64 current_number;
>
> On the update side (hotplug happens):
> {
>     new_fit = malloc();
>
>     old_fit = rcu_dereference(fit);
>
>     new_fit->generation_number = old_fit->generation_number + 1;
>     new_fit->fit = ...generate-fit...;
>
>     rcu_assign(fit, new_fit);
>     sync_rcu();
>     free(old_fit);
> }
>
> On the read side (DSM issued by OSPM):
> {
>     rcu_read_lock();
>
>     local_fit = rcu_dereference(fit);
>
>     /* if this is the first RFIT call of a _FIT run. */
>     if (fit_read_position == 0)
>         current_number = local_fit->generation_number;
>     else if (current_number != local_fit->generation_number)
>         return FAIL_FIT_CHANGED;
>
>     ... fill the returned info from local_fit->fit ...;
>
>     rcu_read_unlock();
> }
>
> Of course, @fit and @current_number should be persistent during live
> migration.
You can drop RCU and @current_number,
and @fit can be exactly recreated on the target side from the
    -device nvdimm ...
set of options, which must include all devices currently present
on the source side (including hotplugged ones).
It's sufficient to invalidate and restart any DMA transfer in flight.
i.e.
pc_dimm_memory_plug()
    -> regenerate_fit()
    -> set local flag @fit-dirty

on start of QEMU.fit_read()
    -> if offset == 0: clear @fit-dirty

(make sure fit_read() won't conflict with regenerate_fit();
either a plain mutex or RCU would do the job)
...
OSPM.rfit()
    ...
    if (@fit-dirty)
        abort/restart DMA from the start
In the migration case, the target would have @fit-dirty set thanks to
pc_dimm_memory_plug() -> regenerate_fit(), and any DMA in flight
would be restarted.
That's a little bit inefficient, as source_fit exactly matches
target_fit and DMA could continue, but it would save us from
having FIT-specific migration state to transfer and maintain.