Re: [Qemu-devel] Adding a persistent writeback cache to qemu


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] Adding a persistent writeback cache to qemu
Date: Thu, 20 Jun 2013 11:46:18 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Wed, Jun 19, 2013 at 10:28:53PM +0100, Alex Bligh wrote:
> --On 11 April 2013 11:25:48 +0200 Stefan Hajnoczi
> <address@hidden> wrote:
> 
> >>I'd like to experiment with adding persistent writeback cache to qemu.
> >>The use case here is where non-local storage is used (e.g. rbd, ceph)
> >>using the qemu drivers, together with a local cache as a file on
> >>a much faster locally mounted device, for instance an SSD (possibly
> >>replicated). This would I think give a similar performance boost to
> >>using an rbd block device plus flashcache/dm-cache/bcache, but without
> >>introducing all the context switches and limitations of having to
> >>use real block devices. I appreciate it would need to be live migration
> >>aware (worst case solution: flush and turn off caching during live
> >>migrate), and ideally be capable of replaying a dirty writeback cache
> >>in the event the host crashes.
> >>
> >>Is there any support for this already? Has anyone worked on this before?
> >>If not, would there be any interest in it?
> >
> >I'm concerned about the complexity this would introduce in QEMU.
> >Therefore I'm a fan of using existing solutions like the Linux block
> >layer instead of reimplementing this stuff in QEMU.
> >
> >What concrete issues are there with using rbd plus
> >flashcache/dm-cache/bcache?
> >
> >I'm not sure I understand the context switch problem since implementing
> >it in user space will still require system calls to do all the actual
> >cache I/O.
> 
> I failed to see your reply and got distracted from this. Apologies.
> So several months later ...

Happens to me sometimes too ;-).

> The concrete problem here is that flashcache/dm-cache/bcache don't
> work with the rbd (librbd) driver, as flashcache/dm-cache/bcache
> cache access to block devices (in the host layer), and with rbd
> (for instance) there is no access to a block device at all. block/rbd.c
> simply calls librbd which calls librados etc.
> 
> So the context switches etc. I am avoiding are the ones that would
> be introduced by using kernel rbd devices rather than librbd.
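
To make that comparison concrete, the two attachment paths look roughly
like this ("rbd/vm1" stands in for a real pool/image name):

  # Kernel rbd: map the image to a host block device and hand that to
  # QEMU.  Every request crosses the host block layer.
  rbd map rbd/vm1                        # creates e.g. /dev/rbd0
  qemu-system-x86_64 -drive file=/dev/rbd0,format=raw

  # Userspace librbd: block/rbd.c calls into librbd/librados directly,
  # no host block device involved.
  qemu-system-x86_64 -drive file=rbd:rbd/vm1,format=raw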

I understand the limitations with kernel block devices - their
setup/teardown is an extra step outside QEMU and privileges need to be
managed.  That basically means you need to use a management tool like
libvirt to make it usable.
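
Per guest that means something like the following before QEMU can even
be started (device paths are placeholders; the dm-cache table format is
documented in the kernel's Documentation/device-mapper/cache.txt):

  rbd map rbd/vm1                        # privileged, creates /dev/rbd0
  dmsetup create vm1-cached --table "0 $(blockdev --getsz /dev/rbd0) \
      cache /dev/ssd/vm1-meta /dev/ssd/vm1-cache /dev/rbd0 \
      512 1 writeback default 0"
  # ...then -drive file=/dev/mapper/vm1-cached, and the reverse
  # (dmsetup remove, rbd unmap) on teardown.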

But I don't understand the performance angle here.  Do you have profiles
that show kernel rbd is a bottleneck due to context switching?

We use the kernel page cache for -drive file=test.img,cache=writeback
and no one has suggested reimplementing the page cache inside QEMU for
better performance.
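
For comparison, on a local image file the host page cache is toggled
with a single -drive option (other modes like writethrough exist too):

  qemu-system-x86_64 -drive file=test.img,cache=writeback  # host page cache
  qemu-system-x86_64 -drive file=test.img,cache=none       # O_DIRECT, bypasses it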

Also, how would you manage a QEMU-internal page cache with multiple guests
running?  They are independent and know nothing about each other.  Their
process memory consumption will be bloated and the kernel memory
management will end up having to sort out who gets to stay in physical
memory.

You can see I'm skeptical of this and think it's premature optimization,
but if there's really a case for it with performance profiles then I
guess it would be necessary.  But we should definitely get feedback from
the Ceph folks too.

I'd like to hear from the Ceph folks what their position on kernel rbd vs
librados is.  Which one do they recommend for QEMU guests and what are the
pros/cons?

CCed Sage and Josh.

Stefan


