[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Bug 1905562] Re: Guest seems suspended after host freed memory for it u
From: |
Launchpad Bug Tracker |
Subject: |
[Bug 1905562] Re: Guest seems suspended after host freed memory for it using oom-killer |
Date: |
Sat, 10 Jul 2021 04:17:20 -0000 |
[Expired for QEMU because there has been no activity for 60 days.]
** Changed in: qemu
Status: Incomplete => Expired
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1905562
Title:
Guest seems suspended after host freed memory for it using oom-killer
Status in QEMU:
Expired
Bug description:
Host: qemu 5.1.0, linux 5.5.13
Guest: Windows 7 64-bit
This guest ran a memory intensive process, and triggered oom-killer on
host. Luckily, it killed chromium. My understanding is this should
mean qemu should have continued running unharmed. But, the spice
connection shows the host system clock is stuck at the exact time oom-
killer was triggered. The host is completely unresponsive.
I can telnet to the qemu monitor. "info status" shows "running".
But, multiple times running "info registers -a" and saving the output
to text files shows the registers are 100% unchanged, so it's not
really running.
On the host, top shows around 4% CPU usage by qemu. strace shows
about 1,000 times a second, these 6 lines repeat:
0.000698 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c10) = 0 <0.000010>
0.000034 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c60) = 0 <0.000009>
0.000031 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c20) = 0 <0.000007>
0.000028 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c70) = 0 <0.000007>
0.000030 ppoll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7,
events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN}, {fd=11, events
=POLLIN}, {fd=16, events=POLLIN}, {fd=32, events=POLLIN}, {fd=34,
events=POLLIN}, {fd=39, events=POLLIN}, {fd=40, events=POLLIN}, {fd=41,
events=POLLI N}, {fd=42, events=POLLIN}, {fd=43, events=POLLIN},
{fd=44, events=POLLIN}, {fd=45, events=POLLIN}], 16, {tv_sec=0, tv_nsec=0},
NULL, 8) = 0 (Timeout) <0.000009>
0.000043 ppoll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7,
events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN}, {fd=11, events
=POLLIN}, {fd=16, events=POLLIN}, {fd=32, events=POLLIN}, {fd=34,
events=POLLIN}, {fd=39, events=POLLIN}, {fd=40, events=POLLIN}, {fd=41,
events=POLLI N}, {fd=42, events=POLLIN}, {fd=43, events=POLLIN},
{fd=44, events=POLLIN}, {fd=45, events=POLLIN}], 16, {tv_sec=0,
tv_nsec=769662}, NULL, 8) = 0 (Tim eout) <0.000788>
In the monitor, "info irq" shows IRQ 0 is increasing about 1,000 times
a second. IRQ 0 seems to be for the system clock, and 1,000 times a
second seems to be the frequency a windows 7 guest might have the
clock at.
Those fd's are for: (9) [eventfd]; [signalfd], type=STREAM, 4 x the
spice socket file, and "TCP localhost:ftnmtp->localhost:36566
(ESTABLISHED)".
Because the guest's registers aren't changing, it seems to me like
monitor thinks the VM is running, but it's actually effectively in a
paused state. I think all the strace activity shown above must be
generated by the host. Perhaps it's repeatedly trying to contact the
guest to inject a new clock, and communicate with it on the various
eventfd's, spice socket, etc. So, I'm thinking the strace doesn't
give any information about the real reason why the VM is acting as if
it's paused.
I've checked "info block", and there's nothing showing that a device
is paused, or that there's any issues with them. (Can't remember what
term can be there, but a paused/blocked/etc block device I think
caused a VM to act like this for me in the past.)
Is there something I can provide to help fix the bug here?
Is there something I can do, to try to get the VM running again? (I
sadly have unsaved work in it.)
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1905562/+subscriptions
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- [Bug 1905562] Re: Guest seems suspended after host freed memory for it using oom-killer,
Launchpad Bug Tracker <=