qemu-devel

Re: [Qemu-devel] [PATCHv3] QEMU(upstream): Disable xen's use of O_DIRECT


From: Alex Bligh
Subject: Re: [Qemu-devel] [PATCHv3] QEMU(upstream): Disable xen's use of O_DIRECT by default as it results in crashes.
Date: Mon, 18 Mar 2013 14:30:41 +0000

Paolo,

--On 18 March 2013 15:05:08 +0100 Paolo Bonzini <address@hidden> wrote:

Presumably the same way as if writeback caching is selected. I presume
that must fsync() / fdatasync() all the data to disk, and a barrier will
produce one of those.

No, that's done already.  The source does an fsync/fdatasync before
terminating the migration.

The problem is that the target's page cache might host image data from a
previous run.  If you do not use O_DIRECT, it will not see the changes
made on the source.
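
To illustrate the source-side step mentioned above, here is a minimal
sketch (in Python for brevity; it is not QEMU's code, and the helper names
flush_before_handoff and write_and_flush are hypothetical). It only makes
the data durable on the source's storage. It does nothing about stale
pages a target host may still hold in its own page cache, which is the
problem described above.

```python
import os

def flush_before_handoff(fd):
    """Push data written to fd out to stable storage (hypothetical helper)."""
    # fdatasync() may skip non-essential metadata; fall back to fsync()
    # on platforms that do not provide it.
    sync = getattr(os, "fdatasync", os.fsync)
    sync(fd)

def write_and_flush(path, data):
    """Write data to path through the page cache, then flush it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)  # may be a short write
            view = view[written:]
        flush_before_handoff(fd)
    finally:
        os.close(fd)
    return len(data)
```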

I was under the impression that with cache=writeback, qemu doesn't
use O_DIRECT, in which case why isn't there the same danger under
kvm, i.e. that the target page cache contains data from a previous
run?
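
For reference, a rough sketch of how the documented QEMU cache= modes
relate to O_DIRECT. The table reflects my understanding of the documented
semantics (bypass host page cache, write through, honour flushes).
Modelling write-through as O_DSYNC is an assumption made for illustration,
as are the names CACHE_MODES, open_flags and uses_host_page_cache. None of
this is taken from the QEMU source.

```python
import os

# O_DIRECT and O_DSYNC are not available on every platform; use 0 there.
O_DIRECT = getattr(os, "O_DIRECT", 0)
O_DSYNC = getattr(os, "O_DSYNC", 0)

# mode -> (bypass host page cache, write through, honour guest flushes)
CACHE_MODES = {
    "writeback":    (False, False, True),
    "none":         (True,  False, True),
    "writethrough": (False, True,  True),
    "directsync":   (True,  True,  True),
    "unsafe":       (False, False, False),
}

def open_flags(cache_mode):
    """Return illustrative open(2) flags for a cache mode."""
    direct, write_through, _honour_flush = CACHE_MODES[cache_mode]
    flags = os.O_RDWR
    if direct:
        flags |= O_DIRECT
    if write_through:
        flags |= O_DSYNC  # assumption: one way to model write-through
    return flags

def uses_host_page_cache(cache_mode):
    """True if image data may sit in (and be served from) the host page cache."""
    return not CACHE_MODES[cache_mode][0]
```

Under this model, writeback, writethrough and unsafe all go through the
host page cache, so the stale-cache question applies to each of them and
not only to the Xen case.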

It would be great to fix the kernel bug (and I have submitted code), but
the fix is pretty intrusive (see the link I posted) and there appears
to be little interest in taking it forward. Certainly my kernel hacking
skills are not adequate to the task.

The current position is that booting a Xen domU which does disk I/O
(Ubuntu cloud image used as the test case) with an NFS root crashes dom0
absolutely repeatably, and kills all other guests. Unless and until
there is a kernel fix for that, Xen is in essence unusable with HVM
and a network-based disk backend. So we need a workaround in the meantime
which doesn't require a kernel fix.

If you want to have this patch, you need to detect the bug and only do
the hack if the bug is detected.  Plus, disable migration when the hack is
in use.

I originally suggested having this as an option (detecting it live
and non-destructively is practically impossible - suggestions welcome),
but xen-devel felt it should just be changed. My original preference
was for xl to process cache= type options (so those using a local
file system known to be safe could use O_DIRECT still), but that
requires a change to xenstore, was not popular, and is probably too
intrusive. I patched it the way the xen-devel folks wanted.

Disabling migration seems a bit excessive when migration isn't disabled
with cache=unsafe (AFAIK), and the alternative (using O_DIRECT)
is far, far more unsafe (one TCP retransmit and your system is dead).

1) why does blkback not have the bug?

2) does it also affect virtio disks (or perhaps AHCI too)?  I think
Stefano experimented with virtio in Xen.  If it does, then you're
working around the problem in the wrong place.

I believe it affects PV disks and not emulated disks, as emulated disks
under Xen do not use O_DIRECT (and migration apparently works anyway,
notwithstanding your comment above).

Stefano did ack the patch, and for a one line change it's been
through a pretty extensive discussion on xen-devel ...

I've no idea what else it affects. I'd suggest it also affects kvm,
except that under kvm the failure mode would be writing the wrong
data, not hosing the whole machine.

--
Alex Bligh


