From: Andrew Jones
Subject: Re: [Qemu-devel] virsh dump (qemu guest memory dump?): KASLR enabled linux guest support
Date: Mon, 14 Nov 2016 10:47:00 +0100
User-agent: Mutt/1.6.0.1 (2016-04-01)

On Mon, Nov 14, 2016 at 01:32:56PM +0800, Dave Young wrote:
> On 11/09/16 at 04:38pm, Laszlo Ersek wrote:
> > On 11/09/16 15:47, Daniel P. Berrange wrote:
> > > On Wed, Nov 09, 2016 at 01:20:51PM +0100, Andrew Jones wrote:
> > >> On Wed, Nov 09, 2016 at 11:58:19AM +0000, Daniel P. Berrange wrote:
> > >>> On Wed, Nov 09, 2016 at 12:48:09PM +0100, Andrew Jones wrote:
> > >>>> On Wed, Nov 09, 2016 at 11:37:35AM +0000, Daniel P. Berrange wrote:
> > >>>>> On Wed, Nov 09, 2016 at 12:26:17PM +0100, Laszlo Ersek wrote:
> > >>>>>> On 11/09/16 11:40, Andrew Jones wrote:
> > >>>>>>> On Wed, Nov 09, 2016 at 11:01:46AM +0800, Dave Young wrote:
> > >>>>>>>> Hi,
> > >>>>>>>>
> > >>>>>>>> The latest Linux kernel enables KASLR to randomize phys/virt
> > >>>>>>>> memory addresses. We have put some effort into kexec/kdump support
> > >>>>>>>> so that the crash utility still works when the crashed kernel has
> > >>>>>>>> KASLR enabled.
> > >>>>>>>>
> > >>>>>>>> But according to Dave Anderson, virsh dump does not work; his
> > >>>>>>>> message is quoted below:
> > >>>>>>>>
> > >>>>>>>> """
> > >>>>>>>> with virsh dump, there's no way of even knowing that KASLR
> > >>>>>>>> has randomized the kernel __START_KERNEL_map region, because there 
> > >>>>>>>> is no
> > >>>>>>>> virtual address information -- e.g., like "SYMBOL(_stext)" in the 
> > >>>>>>>> kdump
> > >>>>>>>> vmcoreinfo data to compare against the vmlinux file symbol value.
> > >>>>>>>> Unless virsh dump can export some basic virtual memory data, which
> > >>>>>>>> they say it can't, I don't see how KASLR can ever be supported.
> > >>>>>>>> """
> > >>>>>>>>
> > >>>>>>>> I assume virsh dump uses the QEMU guest memory dump facility, so
> > >>>>>>>> this should first be addressed in QEMU; hence this query to the
> > >>>>>>>> qemu-devel list. If that is not correct, please let me know.
> > >>>>>>>>
> > >>>>>>>> Could you QEMU dump people make it work? Otherwise we cannot
> > >>>>>>>> support virsh dump as long as KASLR is enabled. The latest Fedora
> > >>>>>>>> kernel has enabled it on x86_64.
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>> When the -kernel command line option is used, it may be possible
> > >>>>>>> to extract some information to supplement the memory dump that
> > >>>>>>> dump-guest-memory provides. However, that would be a specific use
> > >>>>>>> case. In general, QEMU knows nothing about the guest kernel: it
> > >>>>>>> doesn't know where the kernel is in the disk image, and it doesn't
> > >>>>>>> even know whether it's Linux.
> > >>>>>>>
> > >>>>>>> Is there anything a guest userspace application could probe from 
> > >>>>>>> e.g.
> > >>>>>>> /proc that would work? If so, then the guest agent could gain a new
> > >>>>>>> feature providing that.
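[The /proc-probe idea above can be sketched as follows. This is a minimal illustration: the sample addresses are made up, and `parse_kallsyms` is a hypothetical helper. A real agent command would read /proc/kallsyms with root privileges, since kernel.kptr_restrict hides addresses from unprivileged readers.]

```python
def parse_kallsyms(text, symbol):
    """Find `symbol` in /proc/kallsyms-style text; return its runtime address.

    Note: with kernel.kptr_restrict set, non-root readers see all-zero
    addresses, so this must run with sufficient privilege in the guest.
    """
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[2] == symbol:
            return int(parts[0], 16)
    return None

# Hypothetical sample: runtime address as randomized by KASLR.
sample = "ffffffff9e200000 T _stext\nffffffff9e2000f0 T secondary_startup_64"
runtime_stext = parse_kallsyms(sample, "_stext")

# The link-time value would come from the matching vmlinux (assumed here);
# the difference is the KASLR slide that crash/gdb need.
LINKTIME_STEXT = 0xffffffff81000000
kaslr_offset = runtime_stext - LINKTIME_STEXT
```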
> > >>>>>>
> > >>>>>> I fully agree. This is exactly what I suggested too, independently, 
> > >>>>>> in
> > >>>>>> the downstream thread, before arriving at this upstream thread. Let 
> > >>>>>> me
> > >>>>>> quote that email:
> > >>>>>>
> > >>>>>> On 11/09/16 12:09, Laszlo Ersek wrote:
> > >>>>>>> [...] the dump-guest-memory QEMU command supports an option called
> > >>>>>>> "paging". Here's its documentation, from the "qapi-schema.json" 
> > >>>>>>> source
> > >>>>>>> file:
> > >>>>>>>
> > >>>>>>>> # @paging: if true, do paging to get guest's memory mapping. This 
> > >>>>>>>> allows
> > >>>>>>>> #          using gdb to process the core file.
> > >>>>>>>> #
> > >>>>>>>> #          IMPORTANT: this option can make QEMU allocate several 
> > >>>>>>>> gigabytes
> > >>>>>>>> #                     of RAM. This can happen for a large guest, 
> > >>>>>>>> or a
> > >>>>>>>> #                     malicious guest pretending to be large.
> > >>>>>>>> #
> > >>>>>>>> #          Also, paging=true has the following limitations:
> > >>>>>>>> #
> > >>>>>>>> #             1. The guest may be in a catastrophic state or can 
> > >>>>>>>> have corrupted
> > >>>>>>>> #                memory, which cannot be trusted
> > >>>>>>>> #             2. The guest can be in real-mode even if paging is 
> > >>>>>>>> enabled. For
> > >>>>>>>> #                example, the guest uses ACPI to sleep, and ACPI 
> > >>>>>>>> sleep state
> > >>>>>>>> #                goes in real-mode
> > >>>>>>>> #             3. Currently only supported on i386 and x86_64.
> > >>>>>>>> #
> > >>>>>>>
> > >>>>>>> "virsh dump --memory-only" sets paging=false, for obvious reasons.
> > >>>>>>>
> > >>>>>>> [...] the dump-guest-memory command provides a raw snapshot of the
> > >>>>>>> virtual machine's memory (and of the registers of the VCPUs); it is
> > >>>>>>> not enlightened about the guest.
> > >>>>>>>
> > >>>>>>> If the additional information you are looking for can be retrieved
> > >>>>>>> within the running Linux guest, using an appropriately privileged
> > >>>>>>> userspace process, then I would recommend considering an extension 
> > >>>>>>> to
> > >>>>>>> the qemu guest agent. The management layer (libvirt, [...]) could
> > >>>>>>> first invoke the guest agent (a process with root privileges running
> > >>>>>>> in the guest) from the host side, through virtio-serial. The new 
> > >>>>>>> guest
> > >>>>>>> agent command would return the information necessary to deal with
> > >>>>>>> KASLR. Then the management layer would initiate the dump like 
> > >>>>>>> always.
> > >>>>>>> Finally, the extra information would be combined with (or placed
> > >>>>>>> beside) the dump file in some way.
> > >>>>>>>
> > >>>>>>> So, this proposal would affect the guest agent and the management
> > >>>>>>> layer (= libvirt).
> > >>>>>>
> > >>>>>> Given that we already dislike "paging=true", enlightening
> > >>>>>> dump-guest-memory with even more guest-specific insight is the wrong
> > >>>>>> approach, IMO. That kind of knowledge belongs to the guest agent.
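[The flow described above could look roughly like this on the management side. A sketch: the sidecar-file naming and the shape of the agent-returned data are assumptions, while `dump-guest-memory` with `paging`/`protocol` arguments is the existing QMP command.]

```python
import json

def build_dump_command(path):
    """Build the QMP dump-guest-memory request, with paging=false as
    virsh dump --memory-only uses."""
    return {"execute": "dump-guest-memory",
            "arguments": {"paging": False, "protocol": "file:" + path}}

def write_kaslr_sidecar(dump_path, kaslr_info):
    """Place the agent-returned KASLR data in a file beside the dump.
    The .kaslr.json naming is an assumption, not an existing convention."""
    sidecar = dump_path + ".kaslr.json"
    with open(sidecar, "w") as f:
        json.dump(kaslr_info, f)
    return sidecar
```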
> > >>>>>
> > >>>>> If you're trying to debug a hung/panicked guest, then using a guest
> > >>>>> agent to fetch info is a complete non-starter as it'll be dead.
> > 
> > Yes, I realized this a while after posting...
> > 
> > >>>> So don't wait. Management software can make this query immediately
> > >>>> after the guest agent goes live. The information needed won't change.
> > 
> > ... and then figured this would solve the problem.
> > 
> > >>> That doesn't help with trying to diagnose a crash during boot up, since
> > >>> the guest agent isn't running till fairly late. I'm also concerned that
> > >>> the QEMU guest agent is likely to be far from widely deployed in guests,
> > 
> > I have no hard data, but from the recent Fedora and RHEL-7 guest
> > installations I've done, it seems like qga is installed automatically.
> > (Not sure if that's because Anaconda realizes it's installing the OS in
> > a VM.) Once I made sure there was an appropriate virtio-serial config in
> > the domain XMLs, I could talk to the agents (mainly for fstrim's sake)
> > immediately.
> > 
> > >>> so reliance on the guest agent will mean the dump facility is no longer
> > >>> reliably available.
> > >>>
> > >>
> > >> It'd still be reliably available and usable during early boot, just like
> > >> it is now, for kernels that don't use KASLR. This proposal is only
> > >> attempting to *also* address KASLR kernels, for which there is currently
> > >> no support whatsoever. Call it a best-effort.
> > >>
> > >> Of course we can get support for [probably] early boot and
> > >> guest-agent-less guests using KASLR too if we introduce a paravirt
> > >> solution, requiring guest kernel and KVM changes. Is it worth it?
> > > 
> > > There's a standard for persistent storage that is intended to allow
> > > the kernel to dump out data at time of crash:
> > > 
> > >    https://lwn.net/Articles/434821/
> > > 
> > > and there's some recent patches to provide a QEMU backend. Could we
> > > leverage that facility to get the data we need from the guest kernel ?
> > > 
> > > Instead of only using pstore at time of crash, the kernel could see
> > > that it's running on KVM and write out the paging data to pstore. So
> > > when QEMU later generates a core dump, it can grab the corresponding
> > > data from the pstore backend?
> > > 
> > > This still requires an extra device to be configured, but at least we
> > > would not have to invent yet another paravirt device ourselves; we'd
> > > just use the existing framework.
> > 
> > Not disagreeing, I'd just like to point out that the kernel can also
> > crash before the extra device (the pstore driver) is configured
> > (especially if the driver is built as a module).
> 
> Boot-phase crashes are also a problem for kdump, but hopefully a
> boot-phase crash will be found and fixed early. The run-time problems
> are harder, so this would still be helpful.
> 
> I'm not a virt expert, but comparing the guest agent and pstore, I
> would vote for the guest agent; it is ready to be worked on now, no?
> For pstore I'm not sure how to make a pstore device available to all
> guests. I know a UEFI guest can use its NVRAM, but introducing some
> general pstore sounds hard.
>

Nothing is stopping us from doing both, eventually. Care should be taken
on the management side to make it general enough. It should be designed
such that it can use guest-agent now, but in no way is bound to guest-
agent. We can decide later if we want to replace guest-agent with some
paravirt solution.
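[The "general enough, not bound to guest-agent" design above could be sketched as a pluggable provider interface, so the guest-agent path is just one implementation. The names here, including the `guest-get-kaslr-info` command, are entirely hypothetical.]

```python
class KaslrInfoProvider:
    """Interface the management layer codes against; implementations can
    be swapped (guest agent now, a paravirt channel later) without
    touching the dump logic."""
    def get_kaslr_info(self):
        raise NotImplementedError

class GuestAgentProvider(KaslrInfoProvider):
    """Queries a (hypothetical) guest-agent command once the agent comes
    up and caches the answer, since it won't change for a running kernel."""
    def __init__(self, agent):
        self.agent = agent
        self._cache = None

    def get_kaslr_info(self):
        if self._cache is None:
            self._cache = self.agent.execute("guest-get-kaslr-info")
        return self._cache
```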

Nothing is blocking guest-agent patches now, that I know of.

Thanks,
drew


