From: Xiao Guangrong
Subject: Re: [Qemu-devel] [PATCH v2 06/11] nvdimm acpi: initialize the resource used by NVDIMM ACPI
Date: Wed, 17 Feb 2016 10:04:18 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1



On 02/16/2016 07:00 PM, Igor Mammedov wrote:
On Tue, 16 Feb 2016 02:35:41 +0800
Xiao Guangrong <address@hidden> wrote:

On 02/16/2016 01:24 AM, Igor Mammedov wrote:
On Mon, 15 Feb 2016 23:53:13 +0800
Xiao Guangrong <address@hidden> wrote:

On 02/15/2016 09:32 PM, Igor Mammedov wrote:
On Mon, 15 Feb 2016 13:45:59 +0200
"Michael S. Tsirkin" <address@hidden> wrote:

On Mon, Feb 15, 2016 at 11:47:42AM +0100, Igor Mammedov wrote:
On Mon, 15 Feb 2016 18:13:38 +0800
Xiao Guangrong <address@hidden> wrote:

On 02/15/2016 05:18 PM, Michael S. Tsirkin wrote:
On Mon, Feb 15, 2016 at 10:11:05AM +0100, Igor Mammedov wrote:
On Sun, 14 Feb 2016 13:57:27 +0800
Xiao Guangrong <address@hidden> wrote:

On 02/08/2016 07:03 PM, Igor Mammedov wrote:
On Wed, 13 Jan 2016 02:50:05 +0800
Xiao Guangrong <address@hidden> wrote:

A 32-bit IO port starting at 0x0a18 in the guest is reserved for NVDIMM
ACPI emulation. The table, NVDIMM_DSM_MEM_FILE, will be patched into the
NVDIMM ACPI binary code.

OSPM uses this port to tell QEMU the final address of the DSM memory
and to notify QEMU to emulate the DSM method.
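
(To make the 0x0a18 port concrete, here is a rough sketch of how such a
4-byte IO region could be registered in QEMU. This is not the actual patch;
the handler names, the opaque pointer and the exact meaning of the written
value are placeholders for illustration.)

#include "qemu/osdep.h"
#include "exec/memory.h"

#define NVDIMM_ACPI_IO_BASE 0x0a18
#define NVDIMM_ACPI_IO_LEN  4

static uint64_t nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size)
{
    /* the guest only writes to this port in this scheme */
    return 0;
}

static void nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val,
                             unsigned size)
{
    /* 'val' would carry the guest physical address of the DSM buffer that
     * OSPM allocated; QEMU reads the request from that buffer, emulates
     * the DSM and writes the result back. */
}

static const MemoryRegionOps nvdimm_dsm_ops = {
    .read = nvdimm_dsm_read,
    .write = nvdimm_dsm_write,
    .endianness = DEVICE_LITTLE_ENDIAN,
    .valid = {
        .min_access_size = 4,
        .max_access_size = 4,
    },
};

static void nvdimm_acpi_io_init(MemoryRegion *io, Object *owner)
{
    static MemoryRegion dsm_io;

    memory_region_init_io(&dsm_io, owner, &nvdimm_dsm_ops, NULL,
                          "nvdimm-acpi-io", NVDIMM_ACPI_IO_LEN);
    memory_region_add_subregion(io, NVDIMM_ACPI_IO_BASE, &dsm_io);
}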
Would you need to pass control to QEMU if each NVDIMM had its whole
label area MemoryRegion mapped right after its storage MemoryRegion?


No, the label data is not mapped into the guest's address space; it can
only be accessed indirectly via the DSM method.
Yep, per spec the label data should be accessed via _DSM, but the question
wasn't about that,

Ah, sorry, I missed your question.

Why would one map only a 4KB window and serialize the label data
through it if it could be mapped as a whole? That way the _DSM method
would be much less complicated and there would be no need to add/support
a protocol for its serialization.


Is it ever accessed on the data path? If not, I prefer the current approach:

The label data is only accessed via two DSM commands, Get Namespace Label
Data and Set Namespace Label Data; nothing else needs to be emulated.

limit the window used, the serialization protocol seems rather simple.


Yes.

The label data is at least 128K, which is already a lot for the BIOS as it
allocates memory in the 0 ~ 4G range, which is a tight region. It also needs
the guest OS to support a larger max-xfer (the max size that can be
transferred at one time); the size in the current Linux NVDIMM driver is 4K.
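
(For illustration only, a minimal sketch of what a label read through a 4K
DSM page could look like; these structs are made up and are not the protocol
in this series. The point is that every request is bounded by the window and
by max-xfer, so a 128K label area takes on the order of 128K / 4K = 32 round
trips with the current Linux driver.)

#include <stdint.h>

/* guest -> QEMU, placed at the start of the 4K DSM page */
struct label_read_request {
    uint32_t handle;    /* which NVDIMM the request targets */
    uint32_t offset;    /* offset into that NVDIMM's label area */
    uint32_t length;    /* bytes requested, capped by max-xfer */
};

/* QEMU -> guest, written back into the same 4K DSM page */
struct label_read_response {
    uint32_t status;                        /* 0 on success */
    uint8_t  data[4096 - sizeof(uint32_t)]; /* payload, bounded by the window */
};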

However, using a larger DSM buffer can help us simplify NVDIMM hotplug for
the case where too many NVDIMM devices are present in the system and their
FIT info cannot fit into one page. Each PMEM-only device needs 0xb8 bytes
and we can append 256 memory devices at most, so 12 pages are needed to
contain this info. The prototype we implemented uses a self-defined protocol
to read pieces of the _FIT and concatenate them before returning to the
guest, please refer to:
https://github.com/xiaogr/qemu/commit/c46ce01c8433ac0870670304360b3c4aa414143a

As 12 pages is not a small region for the BIOS, and the _FIT size may be
extended in future development (e.g. if PBLK is introduced), I am not sure
whether we need this. Of course, another approach to simplify it is to limit
the number of NVDIMM devices to make sure their _FIT is < 4K.
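
(For reference, the arithmetic behind the 12-page figure:

0xb8 bytes/device * 256 devices = 0xb800 bytes = 47104 bytes
47104 / 4096 bytes per page = 11.5, rounded up to 12 pages.)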
My suggestion is not to have only one shared area for all NVDIMMs but
rather to map each label area right after its NVDIMM's data memory.
That way _DSM can be made non-serialized and the guest could handle
label data in parallel.

I think that alignment considerations would mean we are burning up
1G of phys address space for this. For PAE we only have 64G
of this address space, so this would be a problem.
It's true that it will burn away address space, however that
just means that PAE guests would not be able to handle as many
NVDIMMs as 64-bit guests. The same applies to DIMMs as well, with
alignment enforced. If one needs more DIMMs, he/she can switch
to a 64-bit guest to use them.

It's a trade-off of inefficient GPA consumption vs. efficient NVDIMM access.
Also, with a fully mapped label area for each NVDIMM, we don't have to
introduce and maintain any guest-visible serialization protocol
(a protocol for serializing _DSM via a 4K window), which becomes ABI.

It's true for label access, but not for the long term, as we will need to
support other _DSM commands such as vendor-specific commands and PBLK DSM
commands; NVDIMM MCE-related commands will also be introduced in the future,
so we will come back here at that time. :(
I believe block-mode NVDIMM would also need a per-NVDIMM mapping
for performance reasons (parallel access).
As for the rest, could those commands go via the MMIO that we usually
use for the control path?

So both input data and output data would go through a single MMIO region;
we would need to introduce a protocol to pass this data. Isn't that complex?

And is there any MMIO we can reuse (even more complex?), or should we
allocate this MMIO page (the old question: where to allocate it)?
Maybe you could reuse/extend the memhotplug IO interface, or alternatively,
as Michael suggested, add a vendor-specific PCI_Config. I'd suggest the PM
device for that (hw/acpi/[piix4.c|ich9.c]), which I like even better since
you won't need to care about which ports to allocate at all.

Well, if Michael does not object, I will do it in the next version. :)


