qemu-devel


From: Ori Mamluk
Subject: Re: [Qemu-devel] [RFC] Replication agent design (was [RFC PATCH] replication agent module)
Date: Sun, 19 Feb 2012 15:40:38 +0200
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:9.0) Gecko/20111222 Thunderbird/9.0.1

On 08/02/2012 16:59, Stefan Hajnoczi wrote:
> On Wed, Feb 8, 2012 at 1:28 PM, Ori Mamluk <address@hidden> wrote:
> You mentioned a future feature that sends request metadata (offset,
> length) to the rephub synchronously so that protection is 100%.
> (Otherwise a network failure or crash might result in missed writes
> that the rephub does not know about.)
>
> The NBD tap might not be the right channel for sending synchronous
> request metadata, since the protocol is geared towards block I/O
> requests that include the actual data.  I'm not sure that QMP should
> be used either - even though we have the concept of QMP events -
> because it's not a low-latency, high ops communications channel.
>
> Which channel do you use in your existing products for synchronous
> request metadata?
>
> Stefan

Looking a little deeper into the NBD solution, I see another problematic angle. Assuming RHEV is managing the system, it will need to allocate a port per volume on the host.
I don't see a clean way to do it.
Also, the idea of opening three process-external APIs for replication (NBD client, NBD server, metadata tap) doesn't feel right to me.

Going back to Anthony's older mail:
> We're doomed to reinvent all of the Linux storage layer it seems. I think we really only have two choices: make better use of kernel facilities for this (like drbd) or have a proper, pluggable, storage interface so that QEMU proper doesn't have to deal with all of this.
>
> Gluster is appealing as a pluggable storage interface although the license is problematic for us today.
>
> I'm quite confident that we shouldn't be in the business of replicating storage though. If the answer is NBD++, that's fine too.

I think it might be better to go back to my original, less generic design.
We can regard it as a 'plugin' for a specific application, in this case replication. I can add a plugin interface in the generic block layer that allows building a proper storage stack. The plugin will have capabilities like a filter driver: it gets hold of the request on its way down (from VM to storage) and on its way up (I/O completion), and it can block or stall the request in either direction.
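
To make the filter-driver idea a bit more concrete, here is a rough sketch of what such an interface might look like. Everything in it is hypothetical: BlockFilter, IORequest, FilterVerdict, RepState and the function names are made up for illustration and are not existing QEMU APIs. It only shows the two hook points (on the way down and on the way up) and how a filter could pass, stall or block a request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; nothing here exists in QEMU. */
typedef enum { FILTER_PASS, FILTER_STALL, FILTER_BLOCK } FilterVerdict;

typedef struct IORequest {
    uint64_t offset;   /* byte offset of the request */
    uint32_t length;   /* length in bytes */
    int      result;   /* completion status, meaningful on the way up */
} IORequest;

typedef struct BlockFilter BlockFilter;
struct BlockFilter {
    const char *name;
    /* Way down (VM -> storage): called before the request is issued. */
    FilterVerdict (*on_submit)(BlockFilter *f, IORequest *req);
    /* Way up (I/O completion): called before the guest is notified. */
    FilterVerdict (*on_complete)(BlockFilter *f, IORequest *req);
    void *opaque;      /* per-filter private state */
};

/* Walk the stack top to bottom; the first non-PASS verdict wins. */
FilterVerdict filters_submit(BlockFilter **chain, size_t n, IORequest *req)
{
    assert(chain != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (chain[i]->on_submit) {
            FilterVerdict v = chain[i]->on_submit(chain[i], req);
            if (v != FILTER_PASS) {
                return v;
            }
        }
    }
    return FILTER_PASS;
}

/* Walk the stack bottom to top on completion. */
FilterVerdict filters_complete(BlockFilter **chain, size_t n, IORequest *req)
{
    assert(chain != NULL || n == 0);
    for (size_t i = n; i-- > 0; ) {
        if (chain[i]->on_complete) {
            FilterVerdict v = chain[i]->on_complete(chain[i], req);
            if (v != FILTER_PASS) {
                return v;
            }
        }
    }
    return FILTER_PASS;
}

/* Example filter: stall writes while the replication hub is unreachable. */
typedef struct RepState {
    int      hub_connected;
    uint64_t writes_seen;
} RepState;

FilterVerdict rep_on_submit(BlockFilter *f, IORequest *req)
{
    RepState *s = f->opaque;
    (void)req;  /* a real filter would send offset/length to the hub here */
    if (!s->hub_connected) {
        return FILTER_STALL;
    }
    s->writes_seen++;
    return FILTER_PASS;
}
```

A replication filter built this way would sit at the top of the per-volume stack, so the generic block code only needs to call filters_submit() before issuing a request and filters_complete() before reporting completion.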

As for the plugin mechanism: it's clear to me that a dynamic plugin is out of the question. It can be a static definition instead, for example a 'plugins' directory under block that contains the plugin code, with each plugin plugged in via the command line or QMP commands.
This way we create separation between the QEMU code and the storage filters.
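
A minimal sketch of the non-dynamic variant, assuming the plugins are compiled in and selected by name: the table, the BlockPlugin type, and the block_plugin_find() and replication_init() functions are all hypothetical names, and how the name would arrive (command line option or QMP command) is left open.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names; this is not an existing QEMU interface. */
typedef struct BlockPlugin {
    const char *name;
    int (*init)(const char *options);   /* returns 0 on success */
} BlockPlugin;

/* Would live in the plugin's own file under a 'plugins' directory. */
int replication_init(const char *options)
{
    (void)options;  /* option parsing omitted in this sketch */
    return 0;
}

/* Compiled-in table: adding a plugin means adding a line here. */
static const BlockPlugin builtin_plugins[] = {
    { "replication", replication_init },
};

/* Look up a compiled-in plugin by the name given by the user. */
const BlockPlugin *block_plugin_find(const char *name)
{
    size_t count = sizeof(builtin_plugins) / sizeof(builtin_plugins[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(builtin_plugins[i].name, name) == 0) {
            return &builtin_plugins[i];
        }
    }
    return NULL;    /* unknown plugin name */
}
```

Nothing is loaded at run time here; an unknown name simply fails the lookup.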

The downside is that plugin code tends to be less generic and reusable. The advantage is that, by keeping the two separate, we don't complicate the QEMU storage stack code with application-specific requirements.

How about it?

Ori.
