Guest OS becomes totally unresponsive when running on VMware

From: Salvatore Mazzarino
Date: Sat, 30 Jan 2021 14:28:08 +0100

I'm running QEMU v4.2.0 inside a Docker container. The container runs on a VMware VM running Flatcar Linux with kernel version 5.4.92. At some point the QEMU process starts to show high CPU usage: the CPUs it uses become saturated and the guest OS run by QEMU becomes totally unresponsive. Note that this issue does not occur when QEMU runs on a bare-metal machine. It only happens with nested virtualization.

I would like to debug the QEMU process to find out where it gets stuck, so I compiled a version with tracing enabled. Following the tracing documentation, I was able to get traces from QEMU.

The problem is that I'm not sure exactly which functions to trace. If I trace everything, the output file is huge: 5 seconds of tracing produces over 2 GB. I would need some guidance on narrowing down the issue by picking the right QEMU trace events to get useful information. Which functions would you recommend tracing?
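
For reference, here is a minimal sketch of how a full trace could be summarized to see which events account for most of the volume, which might help decide what to leave out. It assumes the trace was written as text, one event per line, in the form `<pid>@<seconds>.<microseconds>:<event_name> <args>`; that line shape, the function name `count_events`, and the script name are assumptions for illustration, not something taken from the QEMU docs.

```python
import re
import sys
from collections import Counter

# Assumed line shape: "<pid>@<seconds>.<microseconds>:<event_name> <args...>"
LINE_RE = re.compile(r"^\d+@\d+\.\d+:(\w+)")


def count_events(lines):
    """Count how many times each event name appears in an iterable of lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # lines that do not look like trace records are skipped
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Stream the file line by line so a multi-GB trace is never held in memory.
    with open(sys.argv[1], "r", errors="replace") as trace_file:
        totals = count_events(trace_file)
    # Print the 20 most frequent events, busiest first.
    for name, hits in totals.most_common(20):
        print(f"{hits:12d}  {name}")
```

Run as `python3 count_events.py <trace file>` (hypothetical script name). The noisiest events at the top of the list would be the first candidates to disable on the next run.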

