Re: [Qemu-devel] Using cache=writeback safely on qemu 1.4.0 and later


From: Andrew Martin
Subject: Re: [Qemu-devel] Using cache=writeback safely on qemu 1.4.0 and later
Date: Tue, 19 Aug 2014 18:20:38 -0500 (CDT)

----- Original Message -----
> From: "Stefan Hajnoczi" <address@hidden>
> To: "Andrew Martin" <address@hidden>
> Cc: address@hidden
> Sent: Tuesday, August 19, 2014 9:59:25 AM
> Subject: Re: [Qemu-devel] Using cache=writeback safely on qemu 1.4.0 and later
> 
> If you strace -f the QEMU process on the host, you will see fdatasync(2)
> system calls when the guest flushes the disk.
> 
> You can find the file descriptor number by checking ls -l
> /proc/$PID_OF_QEMU/fd and looking for the disk image file.

When the disk is set to cache=writethrough on one of the same VMs, I see
frequent fdatasync(2) calls (every few seconds). However, when I change the
disk over to cache=writeback, since boot I have not yet seen a single
fdatasync(2) call, even after writing data 2x the amount of RAM:
# time strace -ft -p4113 2>&1 | grep fdatasync
^C

real    15m39.245s
user    0m7.940s
sys     0m18.280s
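
The file descriptor lookup suggested above (ls -l /proc/$PID_OF_QEMU/fd) can
also be scripted, so that strace output can be matched against the image's
descriptor number. This is only a minimal sketch, assuming a Linux /proc
layout; the function name find_image_fds and its proc_root parameter are
illustrative and are not part of QEMU or strace.

```python
import os


def find_image_fds(pid, image_path, proc_root="/proc"):
    """Return the fd numbers of process `pid` whose symlink points at `image_path`."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    wanted = os.path.realpath(image_path)
    matches = []
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target == wanted:
            matches.append(int(name))
    return sorted(matches)


if __name__ == "__main__":
    # PID and image path taken from this message; adjust for your host.
    print(find_image_fds(4113, "/var/lib/libvirt/images/vm.img"))
```

Reading another user's /proc/PID/fd normally requires root or the same user
as the QEMU process.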

Note that the disk is defined as follows:
<disk type='file' device='disk'>
        <driver name='qemu' type='qcow2' cache='writeback'/>
        <source file='/var/lib/libvirt/images/vm.img'/>
        <target dev='vda' bus='virtio'/>
        <alias name='virtio-disk0'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>
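
To confirm which cache mode each disk actually has, the disk definition can
be read programmatically. The sketch below is a hypothetical helper
(disk_cache_modes is not a libvirt API) that parses either a bare <disk>
fragment like the one above or a full domain XML document using Python's
standard library.

```python
import xml.etree.ElementTree as ET


def disk_cache_modes(xml_text):
    """Return (target dev, source file, cache mode) for each <disk> element."""
    root = ET.fromstring(xml_text)
    result = []
    # iter() also yields the root itself, so a bare <disk> fragment works too.
    for disk in root.iter("disk"):
        driver = disk.find("driver")
        source = disk.find("source")
        target = disk.find("target")
        result.append((
            target.get("dev") if target is not None else None,
            source.get("file") if source is not None else None,
            # None means no cache attribute is set explicitly.
            driver.get("cache") if driver is not None else None,
        ))
    return result
```

For the definition above this yields one entry for vda with the cache mode
writeback.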


> > I recently experienced UPS failure on several hosts which caused a hard
> > shutdown. After restarting, 3 of the guests had corruption on their disks
> > and required a fairly long fsck to fix. Afterwards, data that had been
> > written to the disks several hours before the crash was corrupted, which
> > makes me think that it was never fsync()-ed to the non-volatile storage.
> 
> What exactly was the "corruption" you encountered?  Which application,
> error message, etc.

Two of the servers are web servers running apache2. In one case, a Python
daemon copies JPGs onto the server; the last 100 files it copied were
corrupted. In another case, some files had been uploaded to the www-root
several days before the crash, but after the hard reset those files were no
longer present in the filesystem.


> > Is it safe in this setup to use cache=writeback? Or, should I use
> > cache=writethrough instead?
> 
> Ubuntu 12.04 is recent and sends write cache flushes.
> 
> Are you sure the file system and/or application workload are flushing
> the disk cache?  Please check the mount options and application-specific
> configuration.

The mount options for the ext4 filesystem in the VM in both cases are:
rw,relatime,errors=remount-ro,data=ordered

Similarly, the host's ext4 filesystem holding the images is mounted with:
rw,relatime,data=ordered
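
Neither option string contains nobarrier or barrier=0, the ext4 mount options
that turn write barriers off. As a minimal sketch of that check (the helper
name barriers_disabled is made up for illustration, and it only looks at the
option string, not at what the underlying device does with flushes):

```python
def barriers_disabled(mount_options):
    """Return True if the comma-separated ext4 options turn write barriers off."""
    opts = {opt.strip() for opt in mount_options.split(",")}
    return "nobarrier" in opts or "barrier=0" in opts


if __name__ == "__main__":
    for line in ("rw,relatime,errors=remount-ro,data=ordered",  # guest
                 "rw,relatime,data=ordered"):                   # host
        print(line, "->", barriers_disabled(line))
```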

I did not see any errors in the kernel log in the guest, probably because the 
root filesystem was read-only until the fsck had completed.

Thanks,

Andrew


