From: Ori Mamluk
Subject: [Qemu-devel] [RFC PATCH v3 6/9] repagent: Updated documentation in qemu-repagent.txt. Renamed rephub read command to remoteIo.
Date: Thu, 5 Apr 2012 15:17:58 +0300
Sent 'address@hidden'as repagent patch V2
---
 block/repagent/qemu-repagent.txt |  147 +++++++++++++++-----------------------
 block/repagent/repagent.c        |   71 ++++++++++--------
 block/repagent/repagent.h        |    4 +-
 block/repagent/repagent_client.c |    4 +-
 block/repagent/repcmd.h          |    4 +-
 block/repagent/rephub_cmds.h     |   27 ++++---
 6 files changed, 119 insertions(+), 138 deletions(-)

diff --git a/block/repagent/qemu-repagent.txt b/block/repagent/qemu-repagent.txt
index e3b0c1e..f8def3f 100644
--- a/block/repagent/qemu-repagent.txt
+++ b/block/repagent/qemu-repagent.txt
@@ -1,104 +1,73 @@
-    repagent - replication agent - a Qemu module for enabling continuous async replication of VM volumes
+    repagent - replication agent - a Qemu module for enabling continuous async replication of VM volumes
 
 Introduction
-    This document describes a feature in Qemu - a replication agent (AKA Repagent).
-    The Repagent is a new module that exposes an API to an external replication system (AKA Rephub).
-    This API allows a Rephub to communicate with a Qemu VM and continuously replicate its volumes.
-    The imlementation of a Rephub is outside of the scope of this document. There may be several various Rephub
-    implenetations using the same repagent in Qemu.
+    This document describes a feature in Qemu - a replication agent (named Repagent).
+    The Repagent is a new module that exposes an API to an external replication system (AKA Rephub).
+    This API allows a Rephub to communicate with a Qemu VM and continuously replicate its volumes.
+    The implementation of a Rephub is outside the scope of this document. There may be several different Rephub
+    implementations using the same repagent in Qemu.
+    The Repagent is a storage driver that acts like a filter driver.
+    It can be regarded as a 'plugin' that is activated when the management system enables replication.
 
 Main feature of Repagent
-    Repagent does the following:
-    * Report volumes - report a list of all volumes in a VM to the Rephub.
-    * Report writes to a volume - send all writes made to a protected volume to the Rephub.
-        The reporting of an IO is asyncronuous - i.e. the IO is not delayed by the Repagent to get any acknowledgement from the Rephub.
-        It is only copied to the Rephub.
-    * Read a protected volume - allows the Rephub to read a protected volume, to enable the protected hub to syncronize the content of a protected volume.
+    Repagent has the following main features:
+    * Report volumes - report a list of all volumes in a VM to the Rephub.
+    * Mirror writes - Report writes to a volume - send all writes made to a protected volume to the Rephub.
+        The reporting of an IO is asynchronous - i.e. the IO is not delayed by the Repagent to get any acknowledgement from the Rephub.
+        It is only copied to the Rephub.
+    * Remote IO - Read/write a volume - allows the Rephub to read a protected volume, to enable the protected hub to synchronize
+        the content of a protected volume.
+        Also used to read/write to a recovery volume - the replica of a protected volume.
 
Description of the Repagent module
 Build and run options
-    New configure option: --enable-replication
-    New command line option:
-    -repagent [hub IP/name]
-        Enable replication support for disks
-        hub is the ip or name of the machine running the replication hub.
+    New configure option: --enable-repagent
+    New command line option:
+    -repagent [hub IP/name]
+        Enable replication support for disks
+        hub is the ip or name of the machine running the replication hub.
 
 Module APIs
-    The Repagent module interfaces two main components:
-    1. The Rephub - An external API based on socket messages
-    2. The generic block layer- block.c
-
-    Rephub message API
-        The external replication API is a message based API.
-        We won't go into the structure of the messages here - just the sematics.
-
-    Messages list
-        (The updated list and comments are in Rephub_cmds.h)
-
-    Messages from the Repagent to the Rephub:
-    * Protected write
-        The Repagent sends each write to a protected volume to the hub with the IO status.
-        In case the status is bad the write content is not sent
-    * Report VM volumes
-        The agent reports all the volumes of the VM to the hub.
-    * Read Volume Response
-        A response to a Read Volume Request
-        Sends the data read from a protected volume to the hub
-    * Agent shutdown
-        Notifies the hub that the agent is about to shutdown.
-        This allows a graceful shutdown. Any disconnection of an agent without
-        sending this command will result in a full sync of the VM volumes.
-
-    Messages from the Rephub to the Repagent:
-    * Start protect
-        The hub instructs the agent to start protecting a volume. When a volume is protected
-        all its writes are sent to to the hub.
-        With this command the hub also assigns a volume ID to the given volume name.
-    * Read volume request
-        The hub issues a read IO to a protected volume.
-        This command is used during sync - when the hub needs to read unsyncronized
-        sections of a protected volume.
-        This command is a request, the read data is returned by the read volume response message (see above).
-    block.c API
-        The API to the generic block storage layer contains 3 functionalities:
-        1. Handle writes to protected volumes
-            In bdrv_co_do_writev, each write is reported to the Repagent module.
-        2. Handle each new volume that registers
-            In bdrv_open - each new bottom-level block driver that registers is reported.
-        2. Read from a volume
-            Repagent calls bdrv_aio_readv to handle read requests coming from the hub.
+    The Repagent module interfaces with two main components:
+    1. The Rephub - An external API based on socket messages
+        See detailed comments about each message in rephub_cmds.h
+    2. The generic block layer - block.c
+        Repagent is a block driver. Most of the block driver functions are just a pass-through
+        to the next driver.
+        Writes are mirrored to the hub for replication.
+        The open function is used for registering each volume in Repagent.
 
 General description of a Rephub - a replication system the repagent connects to
-    This section describes in high level a sample Rephub - a replication system that uses the repagent API
-    to replicate disks.
-    It describes a simple Rephub that comntinuously maintains a mirror of the volumes of a VM.
-
-    Say we have a VM we want to protect - call it PVM, say it has 2 volumes - V1, V2.
-    Our Rephub is called SingleRephub - a Rephub protecting a single VM.
-
-    Preparations
-    1. The user chooses a host to rub SingleRephub - a different host than PVM, call it Host2
-    2. The user creates two volumes on Host2 - same sizes of V1 and V2, call them V1R (V1 recovery) and V2R.
-    3. The user runs SingleRephub process on Host2, and gives V1R and V2R as command line arguments.
-        From now on SingleRephub waits for the protected VM repagent to connect.
-    4. The user runs the protected VM PVM - and uses the switch -repagent <Host2 IP>.
-
-    Runtime
-    1. The repagent module connects to SingleRephub on startup.
-    2. repagent reports V1 and V2 to SingleRephub.
-    3. SingleRephub starts to perform an initial synchronization of the protected volumes-
-        it reads each protected volume (V1 and V2) - using read volume requests - and copies the data into the
-        recovery volume V1R and V2R.
-    4. SingleRephub enters 'protection' mode - each write to the protected volume is sent by the repagent to the Rephub,
-        and the Rephub performs the write on the matching recovery volume.
-
-    * Note that during stage 3 writes to the protected volumes are not ignored - they're kept in a bitmap,
-        and will be read again when stage 3 ends, in an interative convergin process.
-
-    This flow continuously maintains an updated recovery volume.
-    If the protected system is damaged, the user can create a new VM on Host2 with the replicated volumes attached to it.
-    The new VM is a replica of the protected system.
+    This section describes at a high level a sample Rephub - a replication system that uses the repagent API
+    to replicate disks.
+    It describes a simple Rephub that continuously maintains a mirror of the volumes of a VM.
+
+    Say we have a VM we want to protect - call it PVM, say it has 2 volumes - V1, V2.
+    Our Rephub is called SingleRephub - a Rephub protecting a single VM.
+
+    Preparations
+    1. The user chooses a host to run SingleRephub - a different host than PVM, call it Host2
+    2. The user creates two volumes on Host2 - the same sizes as V1 and V2, call them V1R (V1 recovery) and V2R.
+    3. The user runs the SingleRephub process on Host2, and gives V1R and V2R as command line arguments.
+        From now on SingleRephub waits for the protected VM repagent to connect.
+    4. The user runs the protected VM PVM - and uses the switch -repagent <Host2 IP>.
+
+    Runtime
+    1. The repagent module connects to SingleRephub on startup.
+    2. repagent reports V1 and V2 to SingleRephub.
+    3. SingleRephub starts to perform an initial synchronization of the protected volumes -
+        it reads each protected volume (V1 and V2) - using read volume requests - and copies the data into the
+        recovery volumes V1R and V2R.
+    4. SingleRephub enters 'protection' mode - each write to the protected volume is sent by the repagent to the Rephub,
+        and the Rephub performs the write on the matching recovery volume.
+
+    * Note that during stage 3 writes to the protected volumes are not ignored - they're kept in a bitmap,
+        and will be read again when stage 3 ends, in an iterative converging process.
+
+    This flow continuously maintains an updated recovery volume.
+    If the protected system is damaged, the user can create a new VM on Host2 with the replicated volumes attached to it.
+    The new VM is a replica of the protected system.
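
[Editorial illustration, not part of the patch.] The note about stage 3 in the documentation hunk above describes writes being tracked in a bitmap and re-read until the copy converges. A minimal sketch of that idea follows, assuming a tiny in-memory volume; SyncState, sync_init, guest_write, sync_pass and sync_until_converged are hypothetical names invented for this illustration and do not appear in the repagent code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512
#define NUM_SECTORS 8

/* Hypothetical in-memory stand-ins for a protected volume and its replica. */
typedef struct {
    uint8_t protected_vol[NUM_SECTORS][SECTOR_SIZE];
    uint8_t recovery_vol[NUM_SECTORS][SECTOR_SIZE];
    bool dirty[NUM_SECTORS]; /* one flag per sector that still needs copying */
} SyncState;

/* Initial sync: nothing has been copied yet, so every sector is dirty. */
void sync_init(SyncState *s)
{
    memset(s, 0, sizeof(*s));
    for (int i = 0; i < NUM_SECTORS; i++) {
        s->dirty[i] = true;
    }
}

/* A guest write that lands during sync: apply it and re-mark the sector. */
void guest_write(SyncState *s, int sector, uint8_t value)
{
    assert(sector >= 0 && sector < NUM_SECTORS);
    memset(s->protected_vol[sector], value, SECTOR_SIZE);
    s->dirty[sector] = true;
}

/* One pass over the bitmap; returns how many sectors were copied. */
int sync_pass(SyncState *s)
{
    int copied = 0;
    for (int i = 0; i < NUM_SECTORS; i++) {
        if (s->dirty[i]) {
            /* Clear the flag before copying so a later write re-marks it. */
            s->dirty[i] = false;
            memcpy(s->recovery_vol[i], s->protected_vol[i], SECTOR_SIZE);
            copied++;
        }
    }
    return copied;
}

/* Repeat passes until a pass copies nothing; returns the passes that copied. */
int sync_until_converged(SyncState *s)
{
    int passes = 0;
    while (sync_pass(s) > 0) {
        passes++;
    }
    return passes;
}
```

In this single-threaded sketch the loop always terminates; a real hub would also have to bound the number of passes or briefly throttle writes, which the documentation does not specify.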
diff --git a/block/repagent/repagent.c b/block/repagent/repagent.c
index c3dd593..bdc0117 100644
--- a/block/repagent/repagent.c
+++ b/block/repagent/repagent.c
@@ -29,7 +29,7 @@ struct RepAgentState {
 
 typedef struct RepagentReadVolIo {
     QEMUIOVector qiov;
-    RepCmdReadVolReq rep_cmd;
+    RepCmdRemoteIoReq rep_cmd;
     uint8_t *buf;
     struct timeval start_time;
 } RepagentReadVolIo;
@@ -38,7 +38,7 @@ static int repagent_get_volume_by_driver(
         BlockDriverState *bs);
 static int repagent_get_volume_by_name(const char *name);
 static void repagent_report_volumes_to_hub(void);
-static void repagent_vol_read_done(void *opaque, int ret);
+static void repagent_remote_io_done(void *opaque, int ret);
 static struct timeval tsub(struct timeval t1, struct timeval t2);
 
 RepAgentState g_rep_agent = { 0 };
@@ -242,15 +242,15 @@ static int repagent_get_volume_by_id(uint64_t vol_id)
     return -1;
 }
 
-int repaget_read_vol(RepCmdReadVolReq *pcmd, uint8_t *pdata)
+int repagent_remote_io(RepCmdRemoteIoReq *pcmd, uint8_t *pdata)
 {
     int index = repagent_get_volume_by_id(pcmd->volume_id);
     int size_bytes = pcmd->size_sectors * 512;
     if (index < 0) {
         printf("Vol read - Could not find vol id %llx\n",
                 (unsigned long long int) pcmd->volume_id);
-        RepCmdReadVolRes *p_res_cmd = (RepCmdReadVolRes *) repcmd_new(
-                REPHUB_CMD_READ_VOL_RES, 0, NULL);
+        RepCmdRemoteIoRes *p_res_cmd = (RepCmdRemoteIoRes *) repcmd_new(
+                REPHUB_CMD_REMOTE_IO_RES, 0, NULL);
         p_res_cmd->req_id = pcmd->req_id;
         p_res_cmd->volume_id = pcmd->volume_id;
         p_res_cmd->io_status = -1;
@@ -264,58 +264,67 @@ int repaget_read_vol(RepCmdReadVolReq *pcmd, uint8_t *pdata)
             (unsigned long long int) pcmd->offset_sectors, pcmd->size_sectors);
 
     {
-        RepagentReadVolIo *read_xact = calloc(1, sizeof(RepagentReadVolIo));
+        RepagentReadVolIo *io_xaction = calloc(1, sizeof(RepagentReadVolIo));
 
 /*        BlockDriverAIOCB *acb; */
 
-        ZERO_MEM_OBJ(read_xact);
+        ZERO_MEM_OBJ(io_xaction);
 
-        qemu_iovec_init(&read_xact->qiov, 1);
+        qemu_iovec_init(&io_xaction->qiov, 1);
 
         /*read_xact->buf = qemu_blockalign(g_rep_agent.volumes[index]->driver_ptr, size_bytes); */
-        read_xact->buf = (uint8_t *) g_malloc(size_bytes);
-        read_xact->rep_cmd = *pcmd;
-        qemu_iovec_add(&read_xact->qiov, read_xact->buf, size_bytes);
+        io_xaction->buf = (uint8_t *) g_malloc(size_bytes);
+        io_xaction->rep_cmd = *pcmd;
+        qemu_iovec_add(&io_xaction->qiov, io_xaction->buf, size_bytes);
 
-        gettimeofday(&read_xact->start_time, NULL);
+        gettimeofday(&io_xaction->start_time, NULL);
         /* orim TODO - use the returned acb to cancel the request on shutdown */
-        /*acb = */bdrv_aio_readv(g_rep_agent.volumes[index]->driver_ptr,
-                read_xact->rep_cmd.offset_sectors, &read_xact->qiov,
-                read_xact->rep_cmd.size_sectors, repagent_vol_read_done,
-                read_xact);
+        /*acb = */
+        if (pcmd->is_read) {
+            bdrv_aio_readv(g_rep_agent.volumes[index]->driver_ptr,
+                    io_xaction->rep_cmd.offset_sectors, &io_xaction->qiov,
+                    io_xaction->rep_cmd.size_sectors, repagent_remote_io_done,
+                    io_xaction);
+        } else {
+            bdrv_aio_writev(g_rep_agent.volumes[index]->driver_ptr,
+                    io_xaction->rep_cmd.offset_sectors, &io_xaction->qiov,
+                    io_xaction->rep_cmd.size_sectors, repagent_remote_io_done,
+                    io_xaction);
+        }
     }
 
     return TRUE;
 }
 
-static void repagent_vol_read_done(void *opaque, int ret)
+static void repagent_remote_io_done(void *opaque, int ret)
 {
     struct timeval t2;
-    RepagentReadVolIo *read_xact = (RepagentReadVolIo *) opaque;
+    RepagentReadVolIo *io_xaction = (RepagentReadVolIo *) opaque;
     uint8_t *pdata = NULL;
-    RepCmdReadVolRes *pcmd = (RepCmdReadVolRes *) repcmd_new(
-            REPHUB_CMD_READ_VOL_RES, read_xact->rep_cmd.size_sectors * 512,
+    RepCmdRemoteIoRes *pcmd = (RepCmdRemoteIoRes *) repcmd_new(
+            REPHUB_CMD_REMOTE_IO_RES, io_xaction->rep_cmd.size_sectors * 512,
             &pdata);
-    pcmd->req_id = read_xact->rep_cmd.req_id;
-    pcmd->volume_id = read_xact->rep_cmd.volume_id;
+    pcmd->req_id = io_xaction->rep_cmd.req_id;
+    pcmd->volume_id = io_xaction->rep_cmd.volume_id;
     pcmd->io_status = -1;
 
-    printf("Protected vol read - volId %llu, offset %llu, size %u\n",
-            (unsigned long long int) read_xact->rep_cmd.volume_id,
-            (unsigned long long int) read_xact->rep_cmd.offset_sectors,
-            read_xact->rep_cmd.size_sectors);
+    printf("Remote IO request - volId %llu, offset %llu, size %u, is_read %u\n",
+            (unsigned long long int) io_xaction->rep_cmd.volume_id,
+            (unsigned long long int) io_xaction->rep_cmd.offset_sectors,
+            io_xaction->rep_cmd.size_sectors,
+            io_xaction->rep_cmd.is_read);
     gettimeofday(&t2, NULL);
 
     if (ret >= 0) {
         /* Read response - send the data to the hub */
-        t2 = tsub(t2, read_xact->start_time);
-        printf("Read prot vol done. Took %u seconds, %u us.",
+        t2 = tsub(t2, io_xaction->start_time);
+        printf("Remote IO done. Took %u seconds, %u us.",
                 (uint32_t) t2.tv_sec, (uint32_t) t2.tv_usec);
 
         pcmd->io_status = 0; /* Success */
         /* orim todo optimize - don't copy, use the qiov buffer */
-        qemu_iovec_to_buffer(&read_xact->qiov, pdata);
+        qemu_iovec_to_buffer(&io_xaction->qiov, pdata);
     } else {
         printf("readv failed: %s\n", strerror(-ret));
     }
@@ -323,9 +332,9 @@ static void repagent_vol_read_done(void *opaque, int ret)
     repagent_client_send((RepCmd *) pcmd);
 
     /*qemu_vfree(read_xact->buf); */
-    g_free(read_xact->buf);
+    g_free(io_xaction->buf);
 
-    g_free(read_xact);
+    g_free(io_xaction);
 }
 
 static struct timeval tsub(struct timeval t1, struct timeval t2)
diff --git a/block/repagent/repagent.h b/block/repagent/repagent.h
index 310db0f..0f69820 100644
--- a/block/repagent/repagent.h
+++ b/block/repagent/repagent.h
@@ -30,7 +30,7 @@ typedef struct RepAgentState RepAgentState;
 typedef struct RepCmdStartProtect RepCmdStartProtect;
 typedef struct RepCmdDataStartProtect RepCmdDataStartProtect;
-struct RepCmdReadVolReq;
+struct RepCmdRemoteIoReq;
 
 /* orim temporary */
 extern int use_repagent;
@@ -45,7 +45,7 @@ void repagent_deregister_drive(const char *drive_path,
         BlockDriverState *driver_ptr);
 int repaget_start_protect(RepCmdStartProtect *pcmd,
         RepCmdDataStartProtect *pcmd_data);
-int repaget_read_vol(struct RepCmdReadVolReq *pcmd, uint8_t *pdata);
+int repagent_remote_io(struct RepCmdRemoteIoReq *pcmd, uint8_t *pdata);
 void repagent_client_connected(void);
 
diff --git a/block/repagent/repagent_client.c b/block/repagent/repagent_client.c
index 9ed8485..ee4aeb7 100644
--- a/block/repagent/repagent_client.c
+++ b/block/repagent/repagent_client.c
@@ -125,8 +125,8 @@ void repagent_process_cmd(RepCmd *pcmd, uint8_t *pdata, void *clientPtr)
                 (RepCmdDataStartProtect *) pdata);
     }
         break;
-    case REPHUB_CMD_READ_VOL_REQ: {
-        is_free_data = repaget_read_vol((RepCmdReadVolReq *) pcmd, pdata);
+    case REPHUB_CMD_REMOTE_IO_REQ: {
+        is_free_data = repagent_remote_io((RepCmdRemoteIoReq *) pcmd, pdata);
     }
         break;
     default:
diff --git a/block/repagent/repcmd.h b/block/repagent/repcmd.h
index 8c6cf1b..0a7f297 100644
--- a/block/repagent/repcmd.h
+++ b/block/repagent/repcmd.h
@@ -36,8 +36,8 @@ enum RepCmds {
     REPHUB_CMD_PROTECTED_WRITE = 2,
     REPHUB_CMD_REPORT_VM_VOLUMES = 3,
     REPHUB_CMD_START_PROTECT = 4,
-    REPHUB_CMD_READ_VOL_REQ = 5,
-    REPHUB_CMD_READ_VOL_RES = 6,
+    REPHUB_CMD_REMOTE_IO_REQ = 5,
+    REPHUB_CMD_REMOTE_IO_RES = 6,
     REPHUB_CMD_AGENT_SHUTDOWN = 7,
 };
 
diff --git a/block/repagent/rephub_cmds.h b/block/repagent/rephub_cmds.h
index d1bad06..cb737e6 100644
--- a/block/repagent/rephub_cmds.h
+++ b/block/repagent/rephub_cmds.h
@@ -97,40 +97,43 @@ typedef struct RepCmdDataStartProtect {
 
 /*********************************************************
- * RepCmd Read Volume Request
+ * RepCmd Remote IO Request
  *
- * REPHUB_CMD_READ_VOL_REQ
+ * REPHUB_CMD_REMOTE_IO_REQ
  * Direction: hub->agent
  *
- * The hub issues a read IO to a protected volume.
- * This command is used during sync - when the hub needs
- * to read unsyncronized sections of a protected volume.
+ * The hub issues an IO to a volume.
+ * This command is used for:
+ * - Reading protected volume during sync
+ * - Read/write to a recovery volume during
+ *   protect and failover or failover test
  * This command is a request, the read data is returned
- * by the response command REPHUB_CMD_READ_VOL_RES
+ * by the response command REPHUB_CMD_REMOTE_IO_RES
  *********************************************************/
-typedef struct RepCmdReadVolReq {
+typedef struct RepCmdRemoteIoReq {
     RepCmdHdr hdr;
     int req_id;
     int size_sectors;
     uint64_t volume_id;
     uint64_t offset_sectors;
-} RepCmdReadVolReq;
+    int is_read;
+} RepCmdRemoteIoReq;
 
 /*********************************************************
  * RepCmd Read Volume Response
  *
- * REPHUB_CMD_READ_VOL_RES
+ * REPHUB_CMD_REMOTE_IO_RES
  * Direction: agent->hub
  *
- * A response to REPHUB_CMD_READ_VOL_REQ.
+ * A response to REPHUB_CMD_REMOTE_IO_REQ.
  * Sends the data read from a protected volume
  *********************************************************/
-typedef struct RepCmdReadVolRes {
+typedef struct RepCmdRemoteIoRes {
     RepCmdHdr hdr;
     int req_id;
     int io_status;
     uint64_t volume_id;
-} RepCmdReadVolRes;
+} RepCmdRemoteIoRes;
 
 /*********************************************************
  * RepCmd Agent shutdown
-- 
1.7.6.5
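
[Editorial illustration, not part of the patch.] The renamed request carries an is_read flag that selects between a read and a write on the target volume. The sketch below shows one way an agent-side handler could act on such a request, assuming an in-memory volume instead of the QEMU block layer. DemoHdr, DemoRemoteIoReq, demo_volume and demo_remote_io are hypothetical names; only the field order and the command value 5 are taken from the hunks above, and the real RepCmdHdr layout is not shown in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512
#define VOL_SECTORS 16

/* Placeholder header: the real RepCmdHdr layout is not part of this patch. */
typedef struct {
    uint32_t cmd_id;
} DemoHdr;

/* Field order mirrors RepCmdRemoteIoReq as added by this patch. */
typedef struct {
    DemoHdr hdr;
    int req_id;
    int size_sectors;
    uint64_t volume_id;
    uint64_t offset_sectors;
    int is_read;
} DemoRemoteIoReq;

enum { DEMO_CMD_REMOTE_IO_REQ = 5 }; /* same value as REPHUB_CMD_REMOTE_IO_REQ */

/* Hypothetical backing store standing in for a volume. */
static uint8_t demo_volume[VOL_SECTORS * SECTOR_SIZE];

/*
 * Serve one remote IO request against demo_volume.
 * For a read, data receives the sectors; for a write, data supplies them.
 * Returns 0 on success and -1 on a bad range, matching the io_status
 * convention used in the patch.
 */
int demo_remote_io(const DemoRemoteIoReq *req, uint8_t *data)
{
    assert(req != NULL && data != NULL);
    if (req->size_sectors <= 0 ||
        req->offset_sectors >= VOL_SECTORS ||
        (uint64_t) req->size_sectors > VOL_SECTORS - req->offset_sectors) {
        return -1;
    }

    size_t off = (size_t) req->offset_sectors * SECTOR_SIZE;
    size_t len = (size_t) req->size_sectors * SECTOR_SIZE;

    if (req->is_read) {
        memcpy(data, demo_volume + off, len);
    } else {
        /* The payload is copied into the volume before the write completes. */
        memcpy(demo_volume + off, data, len);
    }
    return 0;
}
```

The sketch validates the sector range and takes the write payload from the caller's data buffer. In the repagent.c hunk above, the buffer handed to bdrv_aio_writev is freshly allocated and the hunk does not show pdata being copied into it, so that step may be worth checking during review.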