From: Fam Zheng
Subject: Re: [Qemu-devel] the whole virtual machine hangs when IO does not come back!
Date: Tue, 12 Aug 2014 08:58:53 +0800
User-agent: Mutt/1.5.23 (2014-03-12)

On Mon, 08/11 15:21, Stefan Hajnoczi wrote:
> On Mon, Aug 11, 2014 at 04:33:21PM +0800, Bin Wu wrote:
> > Hi,
> > 
> > I tested the reliability of qemu in the IPSAN environment as follows:
> > (1) create one VM on a X86 server which is connected to an IPSAN, and the VM
> > has only one system volume which is on the IPSAN;
> > (2) disconnect the network between the server and the IPSAN. On the server,
> > I have a "multipath" software which can hold the IO for a long time
> > (configurable) when the network is disconnected;
> > (3) about 30 seconds later, the whole VM hangs there, nothing can be done to
> > the VM!
> > 
> > Then, I used "gstack" tool to collect the stacks of all qemu threads, it
> > looked like:
> > 
> > Thread 8 (Thread 0x7fd840bb5700 (LWP 6671)):
> > #0  0x00007fd84253a4f6 in poll () from /lib64/libc.so.6
> > #1  0x00007fd84410ceff in aio_poll ()
> > #2  0x00007fd84429bb05 in qemu_aio_wait ()
> > #3  0x00007fd844120f51 in bdrv_drain_all ()
> > #4  0x00007fd8441f1a4a in bmdma_cmd_writeb ()
> > #5  0x00007fd8441f216e in bmdma_write ()
> > #6  0x00007fd8443a93cf in memory_region_write_accessor ()
> > #7  0x00007fd8443a94a6 in access_with_adjusted_size ()
> > #8  0x00007fd8443a9901 in memory_region_iorange_write ()
> > #9  0x00007fd8443a19bd in ioport_writeb_thunk ()
> > #10 0x00007fd8443a13a8 in ioport_write ()
> > #11 0x00007fd8443a1f55 in cpu_outb ()
> > #12 0x00007fd8443a5b12 in kvm_handle_io ()
> > #13 0x00007fd8443a64a9 in kvm_cpu_exec ()
> > #14 0x00007fd844330962 in qemu_kvm_cpu_thread_fn ()
> > #15 0x00007fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
> > #16 0x00007fd8425439cd in clone () from /lib64/libc.so.6
> > #17 0x0000000000000000 in ?? ()
> 
> Use virtio-blk.  Read, write, and flush are asynchronous in virtio-blk.
> 
> Note that the QEMU monitor commands are typically synchronous so they
> will still block the VM.
> 

If some of the requests are dropped by the host and never return to QEMU, I
think bdrv_drain_all() will still cause the hang, because it waits until every
in-flight request has completed. Even with virtio-blk, device reset makes such
a call. Maybe we could add some -ETIMEDOUT mechanism to QEMU's block layer.
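
To illustrate the kind of -ETIMEDOUT behaviour I mean, here is a rough
standalone sketch. It is not QEMU code: struct pending_io, poll_once,
complete_one() and drain_with_timeout() are hypothetical names made up for
this mail, and bdrv_drain_all() has no such timeout parameter. It only shows a
drain loop that gives up after a deadline instead of polling forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the block layer's in-flight request tracking. */
struct pending_io {
    int in_flight;                            /* requests not yet completed */
    void (*poll_once)(struct pending_io *io); /* process completions, may be NULL */
};

/* Monotonic clock in milliseconds, so wall-clock jumps do not matter. */
static long long now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Example completion handler: pretends one request finishes per poll. */
static void complete_one(struct pending_io *io)
{
    if (io->in_flight > 0) {
        io->in_flight--;
    }
}

/*
 * Wait until all requests have completed or timeout_ms has elapsed.
 * Returns 0 when nothing is in flight any more, -ETIMEDOUT otherwise.
 */
static int drain_with_timeout(struct pending_io *io, int timeout_ms)
{
    long long deadline;

    assert(io != NULL);
    deadline = now_ms() + timeout_ms;

    while (io->in_flight > 0) {
        if (now_ms() >= deadline) {
            return -ETIMEDOUT;   /* give up instead of hanging the caller */
        }
        if (io->poll_once) {
            io->poll_once(io);
        }
        poll(NULL, 0, 1);        /* sleep about 1 ms between polls */
    }
    return 0;
}
```

A caller that gets -ETIMEDOUT would still have to decide what to do with the
requests that are stuck (cancel them, fail them to the guest, or stop the VM),
which is the hard part and is not covered by this sketch.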

A workaround might be to configure the host storage to fail the IO after a
timeout.
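
For example, if the host happens to use Linux dm-multipath (an assumption on
my side, the original report only says "multipath" software), queueing can be
bounded in /etc/multipath.conf. The values below are only illustrative:

```
defaults {
    # check paths every 5 seconds
    polling_interval 5
    # retry for 6 polling intervals (roughly 30 seconds), then fail the IO
    # instead of queueing it; "queue" here would mean queue forever
    no_path_retry    6
}
```

With a bounded setting like this, the IO error should eventually reach QEMU
instead of the request staying in flight indefinitely.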

Fam


