From: Dave Young
Subject: Re: [Qemu-devel] virsh dump (qemu guest memory dump?): KASLR enabled linux guest support
Date: Mon, 14 Nov 2016 13:32:56 +0800
User-agent: Mutt/1.7.1 (2016-10-04)

On 11/09/16 at 04:38pm, Laszlo Ersek wrote:
> On 11/09/16 15:47, Daniel P. Berrange wrote:
> > On Wed, Nov 09, 2016 at 01:20:51PM +0100, Andrew Jones wrote:
> >> On Wed, Nov 09, 2016 at 11:58:19AM +0000, Daniel P. Berrange wrote:
> >>> On Wed, Nov 09, 2016 at 12:48:09PM +0100, Andrew Jones wrote:
> >>>> On Wed, Nov 09, 2016 at 11:37:35AM +0000, Daniel P. Berrange wrote:
> >>>>> On Wed, Nov 09, 2016 at 12:26:17PM +0100, Laszlo Ersek wrote:
> >>>>>> On 11/09/16 11:40, Andrew Jones wrote:
> >>>>>>> On Wed, Nov 09, 2016 at 11:01:46AM +0800, Dave Young wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> The latest Linux kernel enables KASLR to randomize phys/virt memory
> >>>>>>>> addresses; we have done some work on kexec/kdump so that the crash
> >>>>>>>> utility still works when the crashed kernel has KASLR enabled.
> >>>>>>>>
> >>>>>>>> But according to Dave Anderson virsh dump does not work, quoted messages
> >>>>>>>> from Dave below:
> >>>>>>>>
> >>>>>>>> """
> >>>>>>>> with virsh dump, there's no way of even knowing that KASLR has
> >>>>>>>> randomized the kernel __START_KERNEL_map region, because there is no
> >>>>>>>> virtual address information -- e.g., like "SYMBOL(_stext)" in the kdump
> >>>>>>>> vmcoreinfo data to compare against the vmlinux file symbol value.
> >>>>>>>> Unless virsh dump can export some basic virtual memory data, which
> >>>>>>>> they say it can't, I don't see how KASLR can ever be supported.
> >>>>>>>> """
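
(For illustration only: a minimal Python sketch of the relocation step that a
SYMBOL(_stext) entry in vmcoreinfo makes possible. The helper names and
addresses below are made up; the crash utility itself does this in C.)

# KASLR shift = runtime _stext (as reported in vmcoreinfo) minus the
# _stext value found in the vmlinux symbol table.
def kaslr_offset(stext_runtime: int, stext_vmlinux: int) -> int:
    return stext_runtime - stext_vmlinux

# Translate a vmlinux symbol address into the randomized runtime address
# before looking it up in the memory dump.
def relocate(vmlinux_vaddr: int, offset: int) -> int:
    return vmlinux_vaddr + offset

# Example: vmlinux says _stext = 0xffffffff81000000, the crashed kernel
# reported 0xffffffff9a200000, so every text symbol is shifted by the
# same 0x19200000 bytes.
off = kaslr_offset(0xffffffff9a200000, 0xffffffff81000000)
print(hex(relocate(0xffffffff81234560, off)))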
> >>>>>>>>
> >>>>>>>> I assume virsh dump uses the qemu guest memory dump facility, so this
> >>>>>>>> should first be addressed in qemu; thus I am posting this query to the
> >>>>>>>> qemu-devel list. If this is not correct please let me know.
> >>>>>>>>
> >>>>>>>> Could you qemu dump people make it work? Otherwise we cannot support
> >>>>>>>> virt dump as long as KASLR is enabled. The latest Fedora kernel has
> >>>>>>>> enabled it on x86_64.
> >>>>>>>>
> >>>>>>>
> >>>>>>> When the -kernel command line option is used, then it may be possible
> >>>>>>> to extract some information that could be used to supplement the memory
> >>>>>>> dump that dump-guest-memory provides. However, that would be a specific
> >>>>>>> use. In general, QEMU knows nothing about the guest kernel. It doesn't
> >>>>>>> know where it is in the disk image, and it doesn't even know if it's
> >>>>>>> Linux.
> >>>>>>>
> >>>>>>> Is there anything a guest userspace application could probe from e.g.
> >>>>>>> /proc that would work? If so, then the guest agent could gain a new
> >>>>>>> feature providing that.
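
(As a rough illustration of such a probe -- not an existing qga command -- a
root-privileged guest process could recover the runtime _stext address,
assuming kernel.kptr_restrict lets /proc/kallsyms show real values:)

# Hypothetical guest-side probe; a real qga command would be C code,
# this is just a Python sketch of the idea.
def runtime_symbol(name, path="/proc/kallsyms"):
    with open(path) as f:
        for line in f:
            addr, _type, sym, *_ = line.split()
            if sym == name:
                return int(addr, 16)
    raise KeyError(name)

if __name__ == "__main__":
    # With kptr_restrict set, or as non-root, this prints 0, so the
    # probe would have to run privileged.
    print("SYMBOL(_stext)=%x" % runtime_symbol("_stext"))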
> >>>>>>
> >>>>>> I fully agree. This is exactly what I suggested too, independently, in
> >>>>>> the downstream thread, before arriving at this upstream thread. Let me
> >>>>>> quote that email:
> >>>>>>
> >>>>>> On 11/09/16 12:09, Laszlo Ersek wrote:
> >>>>>>> [...] the dump-guest-memory QEMU command supports an option called
> >>>>>>> "paging". Here's its documentation, from the "qapi-schema.json" source
> >>>>>>> file:
> >>>>>>>
> >>>>>>>> # @paging: if true, do paging to get guest's memory mapping. This allows
> >>>>>>>> #          using gdb to process the core file.
> >>>>>>>> #
> >>>>>>>> #          IMPORTANT: this option can make QEMU allocate several gigabytes
> >>>>>>>> #                     of RAM. This can happen for a large guest, or a
> >>>>>>>> #                     malicious guest pretending to be large.
> >>>>>>>> #
> >>>>>>>> #          Also, paging=true has the following limitations:
> >>>>>>>> #
> >>>>>>>> #             1. The guest may be in a catastrophic state or can have corrupted
> >>>>>>>> #                memory, which cannot be trusted
> >>>>>>>> #             2. The guest can be in real-mode even if paging is enabled. For
> >>>>>>>> #                example, the guest uses ACPI to sleep, and ACPI sleep state
> >>>>>>>> #                goes in real-mode
> >>>>>>>> #             3. Currently only supported on i386 and x86_64.
> >>>>>>>> #
> >>>>>>>
> >>>>>>> "virsh dump --memory-only" sets paging=false, for obvious reasons.
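
(For reference, a small Python sketch of what that boils down to at the QMP
level; the socket path and output file below are illustrative:)

import json, socket

sock = socket.socket(socket.AF_UNIX)
sock.connect("/tmp/qmp.sock")        # e.g. -qmp unix:/tmp/qmp.sock,server,nowait
f = sock.makefile("rw")

def qmp(cmd, **args):
    f.write(json.dumps({"execute": cmd, "arguments": args}) + "\n")
    f.flush()
    while True:                      # skip async events (STOP/RESUME, ...)
        reply = json.loads(f.readline())
        if "return" in reply or "error" in reply:
            return reply

f.readline()                         # QMP greeting banner
qmp("qmp_capabilities")              # leave capabilities negotiation mode
print(qmp("dump-guest-memory",       # roughly what virsh dump --memory-only issues
          paging=False,
          protocol="file:/var/tmp/vmcore"))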
> >>>>>>>
> >>>>>>> [...] the dump-guest-memory command provides a raw snapshot of the
> >>>>>>> virtual machine's memory (and of the registers of the VCPUs); it is
> >>>>>>> not enlightened about the guest.
> >>>>>>>
> >>>>>>> If the additional information you are looking for can be retrieved
> >>>>>>> within the running Linux guest, using an appropriately privileged
> >>>>>>> userspace process, then I would recommend considering an extension to
> >>>>>>> the qemu guest agent. The management layer (libvirt, [...]) could
> >>>>>>> first invoke the guest agent (a process with root privileges running
> >>>>>>> in the guest) from the host side, through virtio-serial. The new guest
> >>>>>>> agent command would return the information necessary to deal with
> >>>>>>> KASLR. Then the management layer would initiate the dump like always.
> >>>>>>> Finally, the extra information would be combined with (or placed
> >>>>>>> beside) the dump file in some way.
> >>>>>>>
> >>>>>>> So, this proposal would affect the guest agent and the management
> >>>>>>> layer (= libvirt).
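
(A rough sketch of that flow with the Python libvirt bindings;
"guest-get-kaslr-info" is a hypothetical agent command that does not exist
today, while the libvirt calls themselves are real:)

import json
import libvirt
import libvirt_qemu

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("myguest")          # domain name is illustrative

# 1. As soon as the agent is up, ask the guest for the data needed to
#    cope with KASLR and keep it on the host side.
reply = libvirt_qemu.qemuAgentCommand(
    dom, json.dumps({"execute": "guest-get-kaslr-info"}), 5, 0)
kaslr_info = json.loads(reply)["return"]

# 2. Later, take the memory-only dump exactly as today.
dom.coreDumpWithFormat("/var/tmp/myguest.vmcore",
                       libvirt.VIR_DOMAIN_CORE_DUMP_FORMAT_RAW,
                       libvirt.VIR_DUMP_MEMORY_ONLY)

# 3. Place the extra information beside the dump so crash can use it.
with open("/var/tmp/myguest.vmcore.kaslr.json", "w") as out:
    json.dump(kaslr_info, out)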
> >>>>>>
> >>>>>> Given that we already dislike "paging=true", enlightening
> >>>>>> dump-guest-memory with even more guest-specific insight is the wrong
> >>>>>> approach, IMO. That kind of knowledge belongs to the guest agent.
> >>>>>
> >>>>> If you're trying to debug a hung/panicked guest, then using a guest
> >>>>> agent to fetch info is a complete non-starter as it'll be dead.
> 
> Yes, I realized this a while after posting...
> 
> >>>> So don't wait. Management software can make this query immediately
> >>>> after the guest agent goes live. The information needed won't change.
> 
> ... and then figured this would solve the problem.
> 
> >>> That doesn't help with trying to diagnose a crash during boot up, since
> >>> the guest agent isn't running till fairly late. I'm also concerned that
> >>> the QEMU guest agent is likely to be far from widely deployed in guests,
> 
> I have no hard data, but from the recent Fedora and RHEL-7 guest
> installations I've done, it seems like qga is installed automatically.
> (Not sure if that's because Anaconda realizes it's installing the OS in
> a VM.) Once I made sure there was an appropriate virtio-serial config in
> the domain XMLs, I could talk to the agents (mainly for fstrim's sake)
> immediately.
> 
> >>> so reliance on the guest agent will mean the dump facility is no longer
> >>> reliably available.
> >>>
> >>
> >> It'd still be reliably available and useable during early boot, just like
> >> it is now, for kernels that don't use KASLR. This proposal is only
> >> attempting to *also* address KASLR kernels, for which there is currently
> >> no support whatsoever. Call it a best-effort.
> >>
> >> Of course we can get support for [probably] early boot and
> >> guest-agent-less guests using KASLR too if we introduce a paravirt
> >> solution, requiring guest kernel and KVM changes. Is it worth it?
> > 
> > There's a standard for persistent storage that is intended to allow
> > the kernel to dump out data at time of crash:
> > 
> >    https://lwn.net/Articles/434821/
> > 
> > and there are some recent patches to provide a QEMU backend. Could we
> > leverage that facility to get the data we need from the guest kernel?
> > 
> > Instead of only using pstore at time of crash, the kernel could see
> > that it's running on KVM, and write out the paging data to pstore. So
> > when QEMU later generates a core dump, it can grab the corresponding
> > data from the pstore backend?
> > 
> > Still requires an extra device, to be configured, but at least we
> > would not have to invent yet another paravirt device ourselves, just
> > use the existing framework.
> 
> Not disagreeing, I'd just like to point out that the kernel can also
> crash before the extra device (the pstore driver) is configured
> (especially if the driver is built as a module).

A boot-phase crash is also a problem for kdump, but hopefully boot-phase
crashes will be found and fixed early. Run-time problems are harder, so
it will still be helpful for those.

I'm not a virt expert, but comparing the guest agent and pstore
approaches I would vote for the guest agent; it is ready to be worked on
now, no? For pstore I'm not sure how to provide a pstore device for all
guests. I know a UEFI guest can use its NVRAM, but introducing some
general pstore device sounds hard.

Thanks
Dave


