From: Xiang Zheng
Subject: Re: [Qemu-block] [Qemu-devel] [PATCH] pflash: Only read non-zero parts of backend image
Date: Thu, 9 May 2019 15:14:43 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:64.0) Gecko/20100101 Thunderbird/64.0


On 2019/5/8 21:20, Markus Armbruster wrote:
> Laszlo Ersek <address@hidden> writes:
> 
>> Hi Markus,
>>
>> On 05/07/19 20:01, Markus Armbruster wrote:
>>> The subject is slightly misleading.  Holes read as zero.  So do
>>> non-holes full of zeroes.  The patch avoids reading the former, but
>>> still reads the latter.
>>>
>>> Xiang Zheng <address@hidden> writes:
>>>
>>>> Currently we fill the memory space with two 64MB NOR images when
>>>> using persistent UEFI variables on the virt board. Actually we only
>>>> use a very small (non-zero) part of that memory, while the rest, a
>>>> significantly larger (zero) part, is wasted.
>>>
>>> Neglects to mention that the "virt board" is ARM.
>>>
>>>> So this patch checks the block status and only writes the non-zero part
>>>> into memory. This requires pflash devices to use sparse files for
>>>> backends.
>>>
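>>> If I read the patch right, the idea is a loop along these lines (my
>>> simplified sketch, not the actual patch code):
>>>
>>>     #include "qemu/osdep.h"
>>>     #include "block/block.h"
>>>     #include "sysemu/block-backend.h"
>>>
>>>     /*
>>>      * Sketch only: query the block layer range by range and read just
>>>      * the ranges that actually contain data.  Zero/hole ranges are
>>>      * skipped, since the destination buffer is already zero-filled.
>>>      */
>>>     static int read_nonzero_parts(BlockBackend *blk, uint8_t *dest,
>>>                                   int64_t size)
>>>     {
>>>         int64_t offset = 0;
>>>         int64_t pnum;
>>>         int ret;
>>>
>>>         while (offset < size) {
>>>             ret = bdrv_block_status(blk_bs(blk), offset, size - offset,
>>>                                     &pnum, NULL, NULL);
>>>             if (ret < 0) {
>>>                 return ret;
>>>             }
>>>             if (ret & BDRV_BLOCK_DATA) {
>>>                 ret = blk_pread(blk, offset, dest + offset, pnum);
>>>                 if (ret < 0) {
>>>                     return ret;
>>>                 }
>>>             }
>>>             offset += pnum;
>>>         }
>>>         return 0;
>>>     }
>>>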
>>> I started to draft an improved commit message, but then I realized this
>>> patch can't work.
>>>
>>> The pflash_cfi01 device allocates its device memory like this:
>>>
>>>     memory_region_init_rom_device(
>>>         &pfl->mem, OBJECT(dev),
>>>         &pflash_cfi01_ops,
>>>         pfl,
>>>         pfl->name, total_len, &local_err);
>>>
>>> pflash_cfi02 is similar.
>>>
>>> memory_region_init_rom_device() calls
>>> memory_region_init_rom_device_nomigrate() calls qemu_ram_alloc() calls
>>> qemu_ram_alloc_internal() calls g_malloc0().  Thus, all the device
>>> memory gets written to even with this patch.
>>
>> As far as I can see, qemu_ram_alloc_internal() calls g_malloc0() only to
>> allocate the new RAMBlock object called "new_block". The actual
>> guest RAM allocation occurs inside ram_block_add(), which is also called
>> by qemu_ram_alloc_internal().
> 
> You're right.  I should've read more attentively.
> 
>> One frame further out on the stack, qemu_ram_alloc() passes NULL to
>> qemu_ram_alloc_internal(), for the 4th ("host") parameter. Therefore, in
>> qemu_ram_alloc_internal(), we set "new_block->host" to NULL as well.
>>
>> Then in ram_block_add(), we take the (!new_block->host) branch, and call
>> phys_mem_alloc().
>>
>> Unfortunately, "phys_mem_alloc" is a function pointer, set with
>> phys_mem_set_alloc(). The phys_mem_set_alloc() function is called from
>> "target/s390x/kvm.c" (setting the function pointer to
>> legacy_s390_alloc()), so it doesn't apply in this case. Therefore we end
>> up calling the default qemu_anon_ram_alloc() function, through the
>> funcptr. (I think anyway.)
>>
>> And qemu_anon_ram_alloc() boils down to mmap() + MAP_ANONYMOUS, in
>> qemu_ram_mmap(). (Even on PPC64 hosts, because qemu_anon_ram_alloc()
>> passes (-1) for "fd".)
>>
>> I may have missed something, of course -- I obviously didn't test it,
>> just speculated from the source.
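>>
>> The underlying mechanism is easy to demonstrate outside of QEMU:
>> anonymous mmap() memory is only backed by physical pages once it is
>> actually written to. A minimal standalone demo (my sketch, not QEMU
>> code):
>>
>>     #include <stdio.h>
>>     #include <string.h>
>>     #include <sys/mman.h>
>>
>>     int main(void)
>>     {
>>         size_t len = 64 * 1024 * 1024;   /* 64 MiB, like one NOR image */
>>         unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
>>                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>>
>>         if (p == MAP_FAILED) {
>>             perror("mmap");
>>             return 1;
>>         }
>>         /* VmRSS in /proc/<pid>/status is still small at this point. */
>>         getchar();
>>         memset(p, 0xff, len);            /* fault in every page */
>>         /* Now VmRSS has grown by ~64 MiB. */
>>         getchar();
>>         munmap(p, len);
>>         return 0;
>>     }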
> 
> Thanks for your sleuthing!
> 
>>> I'm afraid you neglected to test.
> 
> Accusation actually unsupported.  I apologize, and replace it by a
> question: have you observed the improvement you're trying to achieve,
> and if yes, how?
> 

Yes, we need to create sparse files as the backing images for the pflash
devices. Create the sparse files like this:

   dd of="QEMU_EFI-pflash.raw" if="/dev/zero" bs=1M seek=64 count=0
   dd of="QEMU_EFI-pflash.raw" if="QEMU_EFI.fd" conv=notrunc

   dd of="empty_VARS.fd" if="/dev/zero" bs=1M seek=64 count=0

Start a VM with the command line below:

    -drive file=/usr/share/edk2/aarch64/QEMU_EFI-pflash.raw,if=pflash,format=raw,unit=0,readonly=on \
    -drive file=/usr/share/edk2/aarch64/empty_VARS.fd,if=pflash,format=raw,unit=1 \

Then observe the memory usage of the qemu process (THP is on).

1) Without this patch:
# cat /proc/`pidof qemu-system-aarch64`/smaps | grep AnonHugePages: | grep -v ' 0 kB'
AnonHugePages:    706560 kB
AnonHugePages:      2048 kB
AnonHugePages:     65536 kB    // pflash memory device
AnonHugePages:     65536 kB    // pflash memory device
AnonHugePages:      2048 kB

# ps aux | grep qemu-system-aarch64
RSS: 879684 kB

2) After applying this patch:
# cat /proc/`pidof qemu-system-aarch64`/smaps | grep AnonHugePages: | grep -v ' 0 kB'
AnonHugePages:    700416 kB
AnonHugePages:      2048 kB
AnonHugePages:      2048 kB    // pflash memory device
AnonHugePages:      2048 kB    // pflash memory device
AnonHugePages:      2048 kB

# ps aux | grep qemu-system-aarch64
RSS: 744380 kB

Obviously, at least 100 MiB of memory is saved for each guest (879684 kB -
744380 kB, about 132 MiB less RSS).

-- 

Thanks,
Xiang




