qemu-devel

[Qemu-devel] Re: [PATCH 09/18] Introduce event-tap.


From: ya su
Subject: [Qemu-devel] Re: [PATCH 09/18] Introduce event-tap.
Date: Wed, 9 Mar 2011 16:36:29 +0800

Yoshi:

    I met a problem: if I kill an FT source VM, the destination FT VM
returns errors such as the following:

qemu-system-x86_64: fill buffer failed, Resource temporarily unavailable
qemu-system-x86_64: recv header failed

    The problem is that the destination VM cannot continue to run, as it
was interrupted in the middle of a transaction: some RAM pages have been
updated while others have not. Do you have any plan for rolling back to
cancel the interrupted transaction? Thanks.


Green.
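The rollback concern above, a receiver left with half-applied RAM when the sender dies mid-transaction, is usually addressed by staging the incoming transaction and committing it only once it has fully arrived. A minimal sketch of that pattern (all names here, such as `ft_txn_begin`, are hypothetical and not from the patch):

```c
#include <assert.h>
#include <string.h>
#include <stddef.h>

/* Hypothetical sketch, not from the patch: the receiver stages an
 * incoming FT transaction and applies it to guest RAM only once the
 * whole transaction has arrived. */

#define RAM_SIZE 16

static unsigned char guest_ram[RAM_SIZE];   /* last committed, consistent state */
static unsigned char staging[RAM_SIZE];     /* in-flight transaction buffer */

static void ft_txn_begin(void)
{
    /* start staging from the committed state */
    memcpy(staging, guest_ram, RAM_SIZE);
}

/* Buffer one page update; guest_ram is not touched yet. */
static void ft_txn_recv(size_t off, unsigned char val)
{
    staging[off] = val;
}

/* The transaction arrived completely: make it visible atomically. */
static void ft_txn_commit(void)
{
    memcpy(guest_ram, staging, RAM_SIZE);
}

/* The sender died mid-transaction: discard the partial update. */
static void ft_txn_abort(void)
{
    /* nothing to undo; guest_ram was never modified */
}
```

Until the commit runs, the committed state stays consistent, so an aborted transfer can simply be discarded instead of leaving RAM half-updated.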



2011/3/9 Yoshiaki Tamura <address@hidden>:
> ya su wrote:
>>
>> Yoshi:
>>
>>     I think event-tap is a great idea; it removes the reading from disk,
>> which will improve FT efficiency much more, as you plan in a later
>> series.
>>
>>     One question: IO read/write may dirty RAM, but it is difficult to
>> distinguish those pages from other dirty pages, such as ones dirtied by
>> running software. Does that mean you need to change all the emulated
>> device implementations? Actually, I think ram_save_live will not send
>> too many RAM pages dirtied by IO read/write, but if event-tap can record
>> IO read/write and replay it on the other side, does that mean we don't
>> need to call qemu_savevm_state_full in FT transactions?
>
> I'm not expecting to remove qemu_savevm_state_full from the transaction.
> Just to reduce the number of pages to be transferred as a result.
>
> Thanks,
>
> Yoshi
>
>>
>> Green.
>>
>>
>> 2011/3/9 Yoshiaki Tamura <address@hidden>:
>>>
>>> ya su wrote:
>>>>
>>>> 2011/3/8 Yoshiaki Tamura <address@hidden>:
>>>>>
>>>>> ya su wrote:
>>>>>>
>>>>>> Yoshiaki:
>>>>>>
>>>>>>     event-tap records block and IO write events and replays them on
>>>>>> the other side, so block_save_live is useless during the latter FT
>>>>>> phase, right? If so, I think it needs to handle the following code in
>>>>>> the block_save_live function:
>>>>>
>>>>> Actually no.  It just replays the last events only.  We do have
>>>>> patches that enable block replication without using block live
>>>>> migration, in the way you described above.  In that case, we disable
>>>>> block live migration when we go into FT mode.  We're thinking of
>>>>> proposing it after this series gets settled.
>>>>
>>>> So event-tap's objective is to initiate an FT transaction, to start
>>>> the sync of RAM/block/device states? If so, it need not change the
>>>> normal bdrv_aio_writev/bdrv_aio_flush process, and on the other side
>>>> it need not invoke bdrv_aio_writev either, right?
>>>
>>> Mostly yes, but because event-tap is queuing requests from block/net,
>>> it needs to flush the queued requests after the transaction on the
>>> primary side.  On the secondary, it currently doesn't have to invoke
>>> bdrv_aio_writev, as you mentioned.  But that will change soon to enable
>>> block replication with event-tap.
>>>
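The queue-then-flush behaviour described above reduces to a small pattern: requests raised during a transaction are held in a FIFO and only handed to the backend once the transaction completes. A toy sketch (names like `tap_queue` and `backend_writes` are illustrative, not QEMU APIs):

```c
#include <assert.h>

/* Toy sketch of the queue-then-flush pattern: requests are held back
 * while a transaction is open and replayed in order afterwards. */

#define QMAX 8

static int queue[QMAX];      /* requests held back during the transaction */
static int qlen;
static int backend_writes;   /* stands in for actual bdrv_aio_writev calls */

/* Called while the transaction is open: the request is only queued. */
static void tap_queue(int req)
{
    if (qlen < QMAX) {
        queue[qlen++] = req;
    }
}

/* Called once the transaction completes: replay the queue in order. */
static void tap_flush(void)
{
    int i;
    for (i = 0; i < qlen; i++) {
        backend_writes++;    /* hand queue[i] to the backend */
    }
    qlen = 0;
}
```

Keeping the replay strictly in arrival order matters; the patch itself forces `qemu_aio_flush()` after each flushed request to avoid request inversion.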
>>>>
>>>>>
>>>>>>
>>>>>>     if (stage == 1) {
>>>>>>         init_blk_migration(mon, f);
>>>>>>
>>>>>>         /* start track dirty blocks */
>>>>>>         set_dirty_tracking(1);
>>>>>>     }
>>>>>> --------------------------------------
>>>>>> the following code will send blocks to the other side, but this will
>>>>>> also be done by event-tap replay. I think it should be placed in
>>>>>> stage 3, before the assert line (this may affect the stage 2 rate
>>>>>> limit, so it could stay in stage 2, though that looks ugly). Another
>>>>>> choice is to avoid invoking block_save_live altogether, right?
>>>>>> ---------------------------------------
>>>>>>     flush_blks(f);
>>>>>>
>>>>>>     if (qemu_file_has_error(f)) {
>>>>>>         blk_mig_cleanup(mon);
>>>>>>         return 0;
>>>>>>     }
>>>>>>
>>>>>>     blk_mig_reset_dirty_cursor();
>>>>>> ----------------------------------------
>>>>>>     if (stage == 2) {
>>>>>>
>>>>>>
>>>>>>     Another question: since you event-tap IO writes (I think IO
>>>>>> reads should also be event-tapped, as a read may cause IO chip state
>>>>>> to change), you then need not invoke qemu_savevm_state_full in
>>>>>> qemu_savevm_trans_complete, right? Thanks.
>>>>>
>>>>> It's not necessary to tap IO READ, but you can if you like.  We also
>>>>> have experimental patches for this to reduce the RAM to be
>>>>> transferred.  But I don't understand why we wouldn't have to invoke
>>>>> qemu_savevm_state_full, although I think we may reduce the number of
>>>>> RAM pages by replaying IO READ on the secondary.
>>>>>
>>>>
>>>> I first thought the objective of the IO-write event-tap was to
>>>> reproduce the same device state on the other side, though I doubted
>>>> this, so I thought IO reads should also be recorded and replayed.
>>>> Since event-tap only initiates an FT transaction, and the sync of
>>>> states still depends on qemu_save_vm_live/full, I understand the
>>>> design now. Thanks.
>>>>
>>>> but I don't understand why the IO-write event-tap can reduce the
>>>> transferred RAM as you mentioned; the amount of RAM only depends on
>>>> dirty pages, and an IO write doesn't change the normal process,
>>>> unlike a block write, right?
>>>
>>> The point is, if we can assure that an IO read retrieves the same data
>>> on both sides, then instead of dirtying the RAM by the read, which
>>> means we have to transfer it in the transaction, we can just replay
>>> the operation and get the same data on the other side. Anyway, that's
>>> just a plan :)
>>>
>>> Thanks,
>>>
>>> Yoshi
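The read-replay plan sketched above can be modelled in a few lines: if both sides replay the same read against identical replicated storage, the resulting page contents match without the page ever being transferred. A hypothetical toy model (none of these names come from the patch):

```c
#include <assert.h>

/* Hypothetical model of read replay: both sides see identical storage,
 * so replaying the read reproduces the page without transferring it. */

static int disk[4] = { 10, 20, 30, 40 };  /* replicated storage, same on both sides */

static int primary_page;    /* page contents after the read on the primary */
static int secondary_page;  /* page contents after replay on the secondary */

/* Primary performs the read and logs only the operation (the sector),
 * instead of marking the whole page dirty for transfer. */
static int primary_read(int sector)
{
    primary_page = disk[sector];
    return sector;          /* this small event is what gets sent */
}

/* Secondary replays the logged read against its own copy of storage. */
static void secondary_replay(int sector)
{
    secondary_page = disk[sector];
}
```

The saving is that the event (a sector number) is far smaller than the page the read would otherwise dirty, which is the reduction in transferred RAM discussed above.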
>>>
>>>>
>>>>> Thanks,
>>>>>
>>>>> Yoshi
>>>>>
>>>>>>
>>>>>>
>>>>>> Green.
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2011/2/24 Yoshiaki Tamura <address@hidden>:
>>>>>>>
>>>>>>> event-tap controls when to start an FT transaction, and provides
>>>>>>> proxy functions to be called from net/block devices.  During an FT
>>>>>>> transaction, it queues up net/block requests, and flushes them when
>>>>>>> the transaction gets completed.
>>>>>>>
>>>>>>> Signed-off-by: Yoshiaki Tamura <address@hidden>
>>>>>>> Signed-off-by: OHMURA Kei <address@hidden>
>>>>>>> ---
>>>>>>>  Makefile.target |    1 +
>>>>>>>  event-tap.c     |  940 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>  event-tap.h     |   44 +++
>>>>>>>  qemu-tool.c     |   28 ++
>>>>>>>  trace-events    |   10 +
>>>>>>>  5 files changed, 1023 insertions(+), 0 deletions(-)
>>>>>>>  create mode 100644 event-tap.c
>>>>>>>  create mode 100644 event-tap.h
>>>>>>>
>>>>>>> diff --git a/Makefile.target b/Makefile.target
>>>>>>> index 220589e..da57efe 100644
>>>>>>> --- a/Makefile.target
>>>>>>> +++ b/Makefile.target
>>>>>>> @@ -199,6 +199,7 @@ obj-y += rwhandler.o
>>>>>>>  obj-$(CONFIG_KVM) += kvm.o kvm-all.o
>>>>>>>  obj-$(CONFIG_NO_KVM) += kvm-stub.o
>>>>>>>  LIBS+=-lz
>>>>>>> +obj-y += event-tap.o
>>>>>>>
>>>>>>>  QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
>>>>>>>  QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
>>>>>>> diff --git a/event-tap.c b/event-tap.c
>>>>>>> new file mode 100644
>>>>>>> index 0000000..95c147a
>>>>>>> --- /dev/null
>>>>>>> +++ b/event-tap.c
>>>>>>> @@ -0,0 +1,940 @@
>>>>>>> +/*
>>>>>>> + * Event Tap functions for QEMU
>>>>>>> + *
>>>>>>> + * Copyright (c) 2010 Nippon Telegraph and Telephone Corporation.
>>>>>>> + *
>>>>>>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>>>>>>> + * the COPYING file in the top-level directory.
>>>>>>> + */
>>>>>>> +
>>>>>>> +#include "qemu-common.h"
>>>>>>> +#include "qemu-error.h"
>>>>>>> +#include "block.h"
>>>>>>> +#include "block_int.h"
>>>>>>> +#include "ioport.h"
>>>>>>> +#include "osdep.h"
>>>>>>> +#include "sysemu.h"
>>>>>>> +#include "hw/hw.h"
>>>>>>> +#include "net.h"
>>>>>>> +#include "event-tap.h"
>>>>>>> +#include "trace.h"
>>>>>>> +
>>>>>>> +enum EVENT_TAP_STATE {
>>>>>>> +    EVENT_TAP_OFF,
>>>>>>> +    EVENT_TAP_ON,
>>>>>>> +    EVENT_TAP_SUSPEND,
>>>>>>> +    EVENT_TAP_FLUSH,
>>>>>>> +    EVENT_TAP_LOAD,
>>>>>>> +    EVENT_TAP_REPLAY,
>>>>>>> +};
>>>>>>> +
>>>>>>> +static enum EVENT_TAP_STATE event_tap_state = EVENT_TAP_OFF;
>>>>>>> +
>>>>>>> +typedef struct EventTapIOport {
>>>>>>> +    uint32_t address;
>>>>>>> +    uint32_t data;
>>>>>>> +    int      index;
>>>>>>> +} EventTapIOport;
>>>>>>> +
>>>>>>> +#define MMIO_BUF_SIZE 8
>>>>>>> +
>>>>>>> +typedef struct EventTapMMIO {
>>>>>>> +    uint64_t address;
>>>>>>> +    uint8_t  buf[MMIO_BUF_SIZE];
>>>>>>> +    int      len;
>>>>>>> +} EventTapMMIO;
>>>>>>> +
>>>>>>> +typedef struct EventTapNetReq {
>>>>>>> +    char *device_name;
>>>>>>> +    int iovcnt;
>>>>>>> +    int vlan_id;
>>>>>>> +    bool vlan_needed;
>>>>>>> +    bool async;
>>>>>>> +    struct iovec *iov;
>>>>>>> +    NetPacketSent *sent_cb;
>>>>>>> +} EventTapNetReq;
>>>>>>> +
>>>>>>> +#define MAX_BLOCK_REQUEST 32
>>>>>>> +
>>>>>>> +typedef struct EventTapAIOCB EventTapAIOCB;
>>>>>>> +
>>>>>>> +typedef struct EventTapBlkReq {
>>>>>>> +    char *device_name;
>>>>>>> +    int num_reqs;
>>>>>>> +    int num_cbs;
>>>>>>> +    bool is_flush;
>>>>>>> +    BlockRequest reqs[MAX_BLOCK_REQUEST];
>>>>>>> +    EventTapAIOCB *acb[MAX_BLOCK_REQUEST];
>>>>>>> +} EventTapBlkReq;
>>>>>>> +
>>>>>>> +#define EVENT_TAP_IOPORT (1 << 0)
>>>>>>> +#define EVENT_TAP_MMIO   (1 << 1)
>>>>>>> +#define EVENT_TAP_NET    (1 << 2)
>>>>>>> +#define EVENT_TAP_BLK    (1 << 3)
>>>>>>> +
>>>>>>> +#define EVENT_TAP_TYPE_MASK (EVENT_TAP_NET - 1)
>>>>>>> +
>>>>>>> +typedef struct EventTapLog {
>>>>>>> +    int mode;
>>>>>>> +    union {
>>>>>>> +        EventTapIOport ioport;
>>>>>>> +        EventTapMMIO mmio;
>>>>>>> +    };
>>>>>>> +    union {
>>>>>>> +        EventTapNetReq net_req;
>>>>>>> +        EventTapBlkReq blk_req;
>>>>>>> +    };
>>>>>>> +    QTAILQ_ENTRY(EventTapLog) node;
>>>>>>> +} EventTapLog;
>>>>>>> +
>>>>>>> +struct EventTapAIOCB {
>>>>>>> +    BlockDriverAIOCB common;
>>>>>>> +    BlockDriverAIOCB *acb;
>>>>>>> +    bool is_canceled;
>>>>>>> +};
>>>>>>> +
>>>>>>> +static EventTapLog *last_event_tap;
>>>>>>> +
>>>>>>> +static QTAILQ_HEAD(, EventTapLog) event_list;
>>>>>>> +static QTAILQ_HEAD(, EventTapLog) event_pool;
>>>>>>> +
>>>>>>> +static int (*event_tap_cb)(void);
>>>>>>> +static QEMUBH *event_tap_bh;
>>>>>>> +static VMChangeStateEntry *vmstate;
>>>>>>> +
>>>>>>> +static void event_tap_bh_cb(void *p)
>>>>>>> +{
>>>>>>> +    if (event_tap_cb) {
>>>>>>> +        event_tap_cb();
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    qemu_bh_delete(event_tap_bh);
>>>>>>> +    event_tap_bh = NULL;
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_schedule_bh(void)
>>>>>>> +{
>>>>>>> +    trace_event_tap_ignore_bh(!!event_tap_bh);
>>>>>>> +
>>>>>>> +    /* if bh is already set, we ignore it for now */
>>>>>>> +    if (event_tap_bh) {
>>>>>>> +        return;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    event_tap_bh = qemu_bh_new(event_tap_bh_cb, NULL);
>>>>>>> +    qemu_bh_schedule(event_tap_bh);
>>>>>>> +
>>>>>>> +    return;
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void *event_tap_alloc_log(void)
>>>>>>> +{
>>>>>>> +    EventTapLog *log;
>>>>>>> +
>>>>>>> +    if (QTAILQ_EMPTY(&event_pool)) {
>>>>>>> +        log = qemu_mallocz(sizeof(EventTapLog));
>>>>>>> +    } else {
>>>>>>> +        log = QTAILQ_FIRST(&event_pool);
>>>>>>> +        QTAILQ_REMOVE(&event_pool, log, node);
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    return log;
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_free_net_req(EventTapNetReq *net_req);
>>>>>>> +static void event_tap_free_blk_req(EventTapBlkReq *blk_req);
>>>>>>> +
>>>>>>> +static void event_tap_free_log(EventTapLog *log)
>>>>>>> +{
>>>>>>> +    int mode = log->mode & ~EVENT_TAP_TYPE_MASK;
>>>>>>> +
>>>>>>> +    if (mode == EVENT_TAP_NET) {
>>>>>>> +        event_tap_free_net_req(&log->net_req);
>>>>>>> +    } else if (mode == EVENT_TAP_BLK) {
>>>>>>> +        event_tap_free_blk_req(&log->blk_req);
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    log->mode = 0;
>>>>>>> +
>>>>>>> +    /* return the log to event_pool */
>>>>>>> +    QTAILQ_INSERT_HEAD(&event_pool, log, node);
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_free_pool(void)
>>>>>>> +{
>>>>>>> +    EventTapLog *log, *next;
>>>>>>> +
>>>>>>> +    QTAILQ_FOREACH_SAFE(log, &event_pool, node, next) {
>>>>>>> +        QTAILQ_REMOVE(&event_pool, log, node);
>>>>>>> +        qemu_free(log);
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_free_net_req(EventTapNetReq *net_req)
>>>>>>> +{
>>>>>>> +    int i;
>>>>>>> +
>>>>>>> +    if (!net_req->async) {
>>>>>>> +        for (i = 0; i < net_req->iovcnt; i++) {
>>>>>>> +            qemu_free(net_req->iov[i].iov_base);
>>>>>>> +        }
>>>>>>> +        qemu_free(net_req->iov);
>>>>>>> +    } else if (event_tap_state >= EVENT_TAP_LOAD) {
>>>>>>> +        qemu_free(net_req->iov);
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    qemu_free(net_req->device_name);
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_alloc_net_req(EventTapNetReq *net_req,
>>>>>>> +                                   VLANClientState *vc,
>>>>>>> +                                   const struct iovec *iov, int iovcnt,
>>>>>>> +                                   NetPacketSent *sent_cb, bool async)
>>>>>>> +{
>>>>>>> +    int i;
>>>>>>> +
>>>>>>> +    net_req->iovcnt = iovcnt;
>>>>>>> +    net_req->async = async;
>>>>>>> +    net_req->device_name = qemu_strdup(vc->name);
>>>>>>> +    net_req->sent_cb = sent_cb;
>>>>>>> +
>>>>>>> +    if (vc->vlan) {
>>>>>>> +        net_req->vlan_needed = 1;
>>>>>>> +        net_req->vlan_id = vc->vlan->id;
>>>>>>> +    } else {
>>>>>>> +        net_req->vlan_needed = 0;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    if (async) {
>>>>>>> +        net_req->iov = (struct iovec *)iov;
>>>>>>> +    } else {
>>>>>>> +        net_req->iov = qemu_malloc(sizeof(struct iovec) * iovcnt);
>>>>>>> +        for (i = 0; i < iovcnt; i++) {
>>>>>>> +            net_req->iov[i].iov_base = qemu_malloc(iov[i].iov_len);
>>>>>>> +            memcpy(net_req->iov[i].iov_base, iov[i].iov_base, iov[i].iov_len);
>>>>>>> +            net_req->iov[i].iov_len = iov[i].iov_len;
>>>>>>> +        }
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_packet(VLANClientState *vc, const struct iovec *iov,
>>>>>>> +                            int iovcnt, NetPacketSent *sent_cb, bool async)
>>>>>>> +{
>>>>>>> +    int empty;
>>>>>>> +    EventTapLog *log = last_event_tap;
>>>>>>> +
>>>>>>> +    if (!log) {
>>>>>>> +        trace_event_tap_no_event();
>>>>>>> +        log = event_tap_alloc_log();
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    if (log->mode & ~EVENT_TAP_TYPE_MASK) {
>>>>>>> +        trace_event_tap_already_used(log->mode & ~EVENT_TAP_TYPE_MASK);
>>>>>>> +        return;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    log->mode |= EVENT_TAP_NET;
>>>>>>> +    event_tap_alloc_net_req(&log->net_req, vc, iov, iovcnt, sent_cb, async);
>>>>>>> +
>>>>>>> +    empty = QTAILQ_EMPTY(&event_list);
>>>>>>> +    QTAILQ_INSERT_TAIL(&event_list, log, node);
>>>>>>> +    last_event_tap = NULL;
>>>>>>> +
>>>>>>> +    if (empty) {
>>>>>>> +        event_tap_schedule_bh();
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +void event_tap_send_packet(VLANClientState *vc, const uint8_t *buf, int size)
>>>>>>> +{
>>>>>>> +    struct iovec iov;
>>>>>>> +
>>>>>>> +    assert(event_tap_state == EVENT_TAP_ON);
>>>>>>> +
>>>>>>> +    iov.iov_base = (uint8_t *)buf;
>>>>>>> +    iov.iov_len = size;
>>>>>>> +    event_tap_packet(vc, &iov, 1, NULL, 0);
>>>>>>> +
>>>>>>> +    return;
>>>>>>> +}
>>>>>>> +
>>>>>>> +ssize_t event_tap_sendv_packet_async(VLANClientState *vc,
>>>>>>> +                                     const struct iovec *iov,
>>>>>>> +                                     int iovcnt, NetPacketSent *sent_cb)
>>>>>>> +{
>>>>>>> +    assert(event_tap_state == EVENT_TAP_ON);
>>>>>>> +    event_tap_packet(vc, iov, iovcnt, sent_cb, 1);
>>>>>>> +    return 0;
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_net_flush(EventTapNetReq *net_req)
>>>>>>> +{
>>>>>>> +    VLANClientState *vc;
>>>>>>> +    ssize_t len;
>>>>>>> +
>>>>>>> +    if (net_req->vlan_needed) {
>>>>>>> +        vc = qemu_find_vlan_client_by_name(NULL, net_req->vlan_id,
>>>>>>> +                                           net_req->device_name);
>>>>>>> +    } else {
>>>>>>> +        vc = qemu_find_netdev(net_req->device_name);
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    if (net_req->async) {
>>>>>>> +        len = qemu_sendv_packet_async(vc, net_req->iov, net_req->iovcnt,
>>>>>>> +                                      net_req->sent_cb);
>>>>>>> +        if (len) {
>>>>>>> +            net_req->sent_cb(vc, len);
>>>>>>> +        } else {
>>>>>>> +            /* packets are queued in the net layer */
>>>>>>> +            trace_event_tap_append_packet();
>>>>>>> +        }
>>>>>>> +    } else {
>>>>>>> +        qemu_send_packet(vc, net_req->iov[0].iov_base,
>>>>>>> +                         net_req->iov[0].iov_len);
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    /* force flush to avoid request inversion */
>>>>>>> +    qemu_aio_flush();
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_net_save(QEMUFile *f, EventTapNetReq *net_req)
>>>>>>> +{
>>>>>>> +    ram_addr_t page_addr;
>>>>>>> +    int i, len;
>>>>>>> +
>>>>>>> +    len = strlen(net_req->device_name);
>>>>>>> +    qemu_put_byte(f, len);
>>>>>>> +    qemu_put_buffer(f, (uint8_t *)net_req->device_name, len);
>>>>>>> +    qemu_put_byte(f, net_req->vlan_id);
>>>>>>> +    qemu_put_byte(f, net_req->vlan_needed);
>>>>>>> +    qemu_put_byte(f, net_req->async);
>>>>>>> +    qemu_put_be32(f, net_req->iovcnt);
>>>>>>> +
>>>>>>> +    for (i = 0; i < net_req->iovcnt; i++) {
>>>>>>> +        qemu_put_be64(f, net_req->iov[i].iov_len);
>>>>>>> +        if (net_req->async) {
>>>>>>> +            page_addr =
>>>>>>> +                qemu_ram_addr_from_host_nofail(net_req->iov[i].iov_base);
>>>>>>> +            qemu_put_be64(f, page_addr);
>>>>>>> +        } else {
>>>>>>> +            qemu_put_buffer(f, (uint8_t *)net_req->iov[i].iov_base,
>>>>>>> +                            net_req->iov[i].iov_len);
>>>>>>> +        }
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_net_load(QEMUFile *f, EventTapNetReq *net_req)
>>>>>>> +{
>>>>>>> +    ram_addr_t page_addr;
>>>>>>> +    int i, len;
>>>>>>> +
>>>>>>> +    len = qemu_get_byte(f);
>>>>>>> +    net_req->device_name = qemu_malloc(len + 1);
>>>>>>> +    qemu_get_buffer(f, (uint8_t *)net_req->device_name, len);
>>>>>>> +    net_req->device_name[len] = '\0';
>>>>>>> +    net_req->vlan_id = qemu_get_byte(f);
>>>>>>> +    net_req->vlan_needed = qemu_get_byte(f);
>>>>>>> +    net_req->async = qemu_get_byte(f);
>>>>>>> +    net_req->iovcnt = qemu_get_be32(f);
>>>>>>> +    net_req->iov = qemu_malloc(sizeof(struct iovec) * net_req->iovcnt);
>>>>>>> +
>>>>>>> +    for (i = 0; i < net_req->iovcnt; i++) {
>>>>>>> +        net_req->iov[i].iov_len = qemu_get_be64(f);
>>>>>>> +        if (net_req->async) {
>>>>>>> +            page_addr = qemu_get_be64(f);
>>>>>>> +            net_req->iov[i].iov_base = qemu_get_ram_ptr(page_addr);
>>>>>>> +        } else {
>>>>>>> +            net_req->iov[i].iov_base = qemu_malloc(net_req->iov[i].iov_len);
>>>>>>> +            qemu_get_buffer(f, (uint8_t *)net_req->iov[i].iov_base,
>>>>>>> +                            net_req->iov[i].iov_len);
>>>>>>> +        }
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_free_blk_req(EventTapBlkReq *blk_req)
>>>>>>> +{
>>>>>>> +    int i;
>>>>>>> +
>>>>>>> +    if (event_tap_state >= EVENT_TAP_LOAD && !blk_req->is_flush) {
>>>>>>> +        for (i = 0; i < blk_req->num_reqs; i++) {
>>>>>>> +            qemu_iovec_destroy(blk_req->reqs[i].qiov);
>>>>>>> +            qemu_free(blk_req->reqs[i].qiov);
>>>>>>> +        }
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    qemu_free(blk_req->device_name);
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_blk_cb(void *opaque, int ret)
>>>>>>> +{
>>>>>>> +    EventTapLog *log = container_of(opaque, EventTapLog, blk_req);
>>>>>>> +    EventTapBlkReq *blk_req = opaque;
>>>>>>> +    int i;
>>>>>>> +
>>>>>>> +    blk_req->num_cbs--;
>>>>>>> +
>>>>>>> +    /* all outstanding requests are flushed */
>>>>>>> +    if (blk_req->num_cbs == 0) {
>>>>>>> +        for (i = 0; i < blk_req->num_reqs; i++) {
>>>>>>> +            EventTapAIOCB *eacb = blk_req->acb[i];
>>>>>>> +            eacb->common.cb(eacb->common.opaque, ret);
>>>>>>> +            qemu_aio_release(eacb);
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        event_tap_free_log(log);
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_bdrv_aio_cancel(BlockDriverAIOCB *acb)
>>>>>>> +{
>>>>>>> +    EventTapAIOCB *eacb = container_of(acb, EventTapAIOCB, common);
>>>>>>> +
>>>>>>> +    /* check if already passed to block layer */
>>>>>>> +    if (eacb->acb) {
>>>>>>> +        bdrv_aio_cancel(eacb->acb);
>>>>>>> +    } else {
>>>>>>> +        eacb->is_canceled = 1;
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +static AIOPool event_tap_aio_pool = {
>>>>>>> +    .aiocb_size = sizeof(EventTapAIOCB),
>>>>>>> +    .cancel     = event_tap_bdrv_aio_cancel,
>>>>>>> +};
>>>>>>> +
>>>>>>> +static void event_tap_alloc_blk_req(EventTapBlkReq *blk_req,
>>>>>>> +                                    BlockDriverState *bs, BlockRequest *reqs,
>>>>>>> +                                    int num_reqs, void *opaque, bool is_flush)
>>>>>>> +{
>>>>>>> +    int i;
>>>>>>> +
>>>>>>> +    blk_req->num_reqs = num_reqs;
>>>>>>> +    blk_req->num_cbs = num_reqs;
>>>>>>> +    blk_req->device_name = qemu_strdup(bs->device_name);
>>>>>>> +    blk_req->is_flush = is_flush;
>>>>>>> +
>>>>>>> +    for (i = 0; i < num_reqs; i++) {
>>>>>>> +        blk_req->reqs[i].sector = reqs[i].sector;
>>>>>>> +        blk_req->reqs[i].nb_sectors = reqs[i].nb_sectors;
>>>>>>> +        blk_req->reqs[i].qiov = reqs[i].qiov;
>>>>>>> +        blk_req->reqs[i].cb = event_tap_blk_cb;
>>>>>>> +        blk_req->reqs[i].opaque = opaque;
>>>>>>> +
>>>>>>> +        blk_req->acb[i] = qemu_aio_get(&event_tap_aio_pool, bs,
>>>>>>> +                                       reqs[i].cb, reqs[i].opaque);
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +static EventTapBlkReq *event_tap_bdrv(BlockDriverState *bs, BlockRequest *reqs,
>>>>>>> +                                      int num_reqs, bool is_flush)
>>>>>>> +{
>>>>>>> +    EventTapLog *log = last_event_tap;
>>>>>>> +    int empty;
>>>>>>> +
>>>>>>> +    if (!log) {
>>>>>>> +        trace_event_tap_no_event();
>>>>>>> +        log = event_tap_alloc_log();
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    if (log->mode & ~EVENT_TAP_TYPE_MASK) {
>>>>>>> +        trace_event_tap_already_used(log->mode & ~EVENT_TAP_TYPE_MASK);
>>>>>>> +        return NULL;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    log->mode |= EVENT_TAP_BLK;
>>>>>>> +    event_tap_alloc_blk_req(&log->blk_req, bs, reqs,
>>>>>>> +                            num_reqs, &log->blk_req, is_flush);
>>>>>>> +
>>>>>>> +    empty = QTAILQ_EMPTY(&event_list);
>>>>>>> +    QTAILQ_INSERT_TAIL(&event_list, log, node);
>>>>>>> +    last_event_tap = NULL;
>>>>>>> +
>>>>>>> +    if (empty) {
>>>>>>> +        event_tap_schedule_bh();
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    return &log->blk_req;
>>>>>>> +}
>>>>>>> +
>>>>>>> +BlockDriverAIOCB *event_tap_bdrv_aio_writev(BlockDriverState *bs,
>>>>>>> +                                            int64_t sector_num,
>>>>>>> +                                            QEMUIOVector *iov,
>>>>>>> +                                            int nb_sectors,
>>>>>>> +                                            BlockDriverCompletionFunc *cb,
>>>>>>> +                                            void *opaque)
>>>>>>> +{
>>>>>>> +    BlockRequest req;
>>>>>>> +    EventTapBlkReq *ereq;
>>>>>>> +
>>>>>>> +    assert(event_tap_state == EVENT_TAP_ON);
>>>>>>> +
>>>>>>> +    req.sector = sector_num;
>>>>>>> +    req.nb_sectors = nb_sectors;
>>>>>>> +    req.qiov = iov;
>>>>>>> +    req.cb = cb;
>>>>>>> +    req.opaque = opaque;
>>>>>>> +    ereq = event_tap_bdrv(bs, &req, 1, 0);
>>>>>>> +
>>>>>>> +    return &ereq->acb[0]->common;
>>>>>>> +}
>>>>>>> +
>>>>>>> +BlockDriverAIOCB *event_tap_bdrv_aio_flush(BlockDriverState *bs,
>>>>>>> +                                           BlockDriverCompletionFunc *cb,
>>>>>>> +                                           void *opaque)
>>>>>>> +{
>>>>>>> +    BlockRequest req;
>>>>>>> +    EventTapBlkReq *ereq;
>>>>>>> +
>>>>>>> +    assert(event_tap_state == EVENT_TAP_ON);
>>>>>>> +
>>>>>>> +    memset(&req, 0, sizeof(req));
>>>>>>> +    req.cb = cb;
>>>>>>> +    req.opaque = opaque;
>>>>>>> +    ereq = event_tap_bdrv(bs, &req, 1, 1);
>>>>>>> +
>>>>>>> +    return &ereq->acb[0]->common;
>>>>>>> +}
>>>>>>> +
>>>>>>> +void event_tap_bdrv_flush(void)
>>>>>>> +{
>>>>>>> +    qemu_bh_cancel(event_tap_bh);
>>>>>>> +
>>>>>>> +    while (!QTAILQ_EMPTY(&event_list)) {
>>>>>>> +        event_tap_cb();
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_blk_flush(EventTapBlkReq *blk_req)
>>>>>>> +{
>>>>>>> +    int i, ret;
>>>>>>> +
>>>>>>> +    for (i = 0; i < blk_req->num_reqs; i++) {
>>>>>>> +        BlockRequest *req = &blk_req->reqs[i];
>>>>>>> +        EventTapAIOCB *eacb = blk_req->acb[i];
>>>>>>> +        BlockDriverAIOCB *acb = &eacb->common;
>>>>>>> +
>>>>>>> +        /* don't flush if canceled */
>>>>>>> +        if (eacb->is_canceled) {
>>>>>>> +            continue;
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        /* receiver needs to restore bs from device name */
>>>>>>> +        if (!acb->bs) {
>>>>>>> +            acb->bs = bdrv_find(blk_req->device_name);
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        if (blk_req->is_flush) {
>>>>>>> +            eacb->acb = bdrv_aio_flush(acb->bs, req->cb, req->opaque);
>>>>>>> +            if (!eacb->acb) {
>>>>>>> +                req->cb(req->opaque, -EIO);
>>>>>>> +            }
>>>>>>> +            return;
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        eacb->acb = bdrv_aio_writev(acb->bs, req->sector, req->qiov,
>>>>>>> +                                    req->nb_sectors, req->cb, req->opaque);
>>>>>>> +        if (!eacb->acb) {
>>>>>>> +            req->cb(req->opaque, -EIO);
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        /* force flush to avoid request inversion */
>>>>>>> +        qemu_aio_flush();
>>>>>>> +        ret = bdrv_flush(acb->bs);
>>>>>>> +        if (ret < 0) {
>>>>>>> +            error_report("flushing blk_req to %s failed", blk_req->device_name);
>>>>>>> +        }
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_blk_save(QEMUFile *f, EventTapBlkReq *blk_req)
>>>>>>> +{
>>>>>>> +    ram_addr_t page_addr;
>>>>>>> +    int i, j, len;
>>>>>>> +
>>>>>>> +    len = strlen(blk_req->device_name);
>>>>>>> +    qemu_put_byte(f, len);
>>>>>>> +    qemu_put_buffer(f, (uint8_t *)blk_req->device_name, len);
>>>>>>> +    qemu_put_byte(f, blk_req->num_reqs);
>>>>>>> +    qemu_put_byte(f, blk_req->is_flush);
>>>>>>> +
>>>>>>> +    if (blk_req->is_flush) {
>>>>>>> +        return;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    for (i = 0; i < blk_req->num_reqs; i++) {
>>>>>>> +        BlockRequest *req = &blk_req->reqs[i];
>>>>>>> +        EventTapAIOCB *eacb = blk_req->acb[i];
>>>>>>> +        /* don't save canceled requests */
>>>>>>> +        if (eacb->is_canceled) {
>>>>>>> +            continue;
>>>>>>> +        }
>>>>>>> +        qemu_put_be64(f, req->sector);
>>>>>>> +        qemu_put_be32(f, req->nb_sectors);
>>>>>>> +        qemu_put_be32(f, req->qiov->niov);
>>>>>>> +
>>>>>>> +        for (j = 0; j < req->qiov->niov; j++) {
>>>>>>> +            page_addr =
>>>>>>> +                qemu_ram_addr_from_host_nofail(req->qiov->iov[j].iov_base);
>>>>>>> +            qemu_put_be64(f, page_addr);
>>>>>>> +            qemu_put_be64(f, req->qiov->iov[j].iov_len);
>>>>>>> +        }
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_blk_load(QEMUFile *f, EventTapBlkReq *blk_req)
>>>>>>> +{
>>>>>>> +    BlockRequest *req;
>>>>>>> +    ram_addr_t page_addr;
>>>>>>> +    int i, j, len, niov;
>>>>>>> +
>>>>>>> +    len = qemu_get_byte(f);
>>>>>>> +    blk_req->device_name = qemu_malloc(len + 1);
>>>>>>> +    qemu_get_buffer(f, (uint8_t *)blk_req->device_name, len);
>>>>>>> +    blk_req->device_name[len] = '\0';
>>>>>>> +    blk_req->num_reqs = qemu_get_byte(f);
>>>>>>> +    blk_req->is_flush = qemu_get_byte(f);
>>>>>>> +
>>>>>>> +    if (blk_req->is_flush) {
>>>>>>> +        return;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    for (i = 0; i < blk_req->num_reqs; i++) {
>>>>>>> +        req = &blk_req->reqs[i];
>>>>>>> +        req->sector = qemu_get_be64(f);
>>>>>>> +        req->nb_sectors = qemu_get_be32(f);
>>>>>>> +        req->qiov = qemu_mallocz(sizeof(QEMUIOVector));
>>>>>>> +        niov = qemu_get_be32(f);
>>>>>>> +        qemu_iovec_init(req->qiov, niov);
>>>>>>> +
>>>>>>> +        for (j = 0; j < niov; j++) {
>>>>>>> +            void *iov_base;
>>>>>>> +            size_t iov_len;
>>>>>>> +            page_addr = qemu_get_be64(f);
>>>>>>> +            iov_base = qemu_get_ram_ptr(page_addr);
>>>>>>> +            iov_len = qemu_get_be64(f);
>>>>>>> +            qemu_iovec_add(req->qiov, iov_base, iov_len);
>>>>>>> +        }
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +void event_tap_ioport(int index, uint32_t address, uint32_t data)
>>>>>>> +{
>>>>>>> +    if (event_tap_state != EVENT_TAP_ON) {
>>>>>>> +        return;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    if (!last_event_tap) {
>>>>>>> +        last_event_tap = event_tap_alloc_log();
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    last_event_tap->mode = EVENT_TAP_IOPORT;
>>>>>>> +    last_event_tap->ioport.index = index;
>>>>>>> +    last_event_tap->ioport.address = address;
>>>>>>> +    last_event_tap->ioport.data = data;
>>>>>>> +}
>>>>>>> +
>>>>>>> +static inline void event_tap_ioport_save(QEMUFile *f, EventTapIOport *ioport)
>>>>>>> +{
>>>>>>> +    qemu_put_be32(f, ioport->index);
>>>>>>> +    qemu_put_be32(f, ioport->address);
>>>>>>> +    qemu_put_byte(f, ioport->data);
>>>>>>> +}
>>>>>>> +
>>>>>>> +static inline void event_tap_ioport_load(QEMUFile *f,
>>>>>>> +                                         EventTapIOport *ioport)
>>>>>>> +{
>>>>>>> +    ioport->index = qemu_get_be32(f);
>>>>>>> +    ioport->address = qemu_get_be32(f);
>>>>>>> +    ioport->data = qemu_get_byte(f);
>>>>>>> +}
>>>>>>> +
>>>>>>> +void event_tap_mmio(uint64_t address, uint8_t *buf, int len)
>>>>>>> +{
>>>>>>> +    if (event_tap_state != EVENT_TAP_ON || len > MMIO_BUF_SIZE) {
>>>>>>> +        return;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    if (!last_event_tap) {
>>>>>>> +        last_event_tap = event_tap_alloc_log();
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    last_event_tap->mode = EVENT_TAP_MMIO;
>>>>>>> +    last_event_tap->mmio.address = address;
>>>>>>> +    last_event_tap->mmio.len = len;
>>>>>>> +    memcpy(last_event_tap->mmio.buf, buf, len);
>>>>>>> +}
>>>>>>> +
>>>>>>> +static inline void event_tap_mmio_save(QEMUFile *f, EventTapMMIO *mmio)
>>>>>>> +{
>>>>>>> +    qemu_put_be64(f, mmio->address);
>>>>>>> +    qemu_put_byte(f, mmio->len);
>>>>>>> +    qemu_put_buffer(f, mmio->buf, mmio->len);
>>>>>>> +}
>>>>>>> +
>>>>>>> +static inline void event_tap_mmio_load(QEMUFile *f, EventTapMMIO *mmio)
>>>>>>> +{
>>>>>>> +    mmio->address = qemu_get_be64(f);
>>>>>>> +    mmio->len = qemu_get_byte(f);
>>>>>>> +    qemu_get_buffer(f, mmio->buf, mmio->len);
>>>>>>> +}
>>>>>>> +
>>>>>>> +int event_tap_register(int (*cb)(void))
>>>>>>> +{
>>>>>>> +    if (event_tap_state != EVENT_TAP_OFF) {
>>>>>>> +        error_report("event-tap is already on");
>>>>>>> +        return -EINVAL;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    if (!cb || event_tap_cb) {
>>>>>>> +        error_report("can't set event_tap_cb");
>>>>>>> +        return -EINVAL;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    event_tap_cb = cb;
>>>>>>> +    event_tap_state = EVENT_TAP_ON;
>>>>>>> +
>>>>>>> +    return 0;
>>>>>>> +}
>>>>>>> +
>>>>>>> +void event_tap_unregister(void)
>>>>>>> +{
>>>>>>> +    if (event_tap_state == EVENT_TAP_OFF) {
>>>>>>> +        error_report("event-tap is already off");
>>>>>>> +        return;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    qemu_del_vm_change_state_handler(vmstate);
>>>>>>> +
>>>>>>> +    event_tap_flush();
>>>>>>> +    event_tap_free_pool();
>>>>>>> +
>>>>>>> +    event_tap_state = EVENT_TAP_OFF;
>>>>>>> +    event_tap_cb = NULL;
>>>>>>> +}
>>>>>>> +
>>>>>>> +int event_tap_is_on(void)
>>>>>>> +{
>>>>>>> +    return (event_tap_state == EVENT_TAP_ON);
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_suspend(void *opaque, int running, int reason)
>>>>>>> +{
>>>>>>> +    event_tap_state = running ? EVENT_TAP_ON : EVENT_TAP_SUSPEND;
>>>>>>> +}
>>>>>>> +
>>>>>>> +/* returns 1 if the queue gets empty */
>>>>>>> +int event_tap_flush_one(void)
>>>>>>> +{
>>>>>>> +    EventTapLog *log;
>>>>>>> +    int ret;
>>>>>>> +
>>>>>>> +    if (QTAILQ_EMPTY(&event_list)) {
>>>>>>> +        return 1;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    event_tap_state = EVENT_TAP_FLUSH;
>>>>>>> +
>>>>>>> +    log = QTAILQ_FIRST(&event_list);
>>>>>>> +    QTAILQ_REMOVE(&event_list, log, node);
>>>>>>> +    switch (log->mode & ~EVENT_TAP_TYPE_MASK) {
>>>>>>> +    case EVENT_TAP_NET:
>>>>>>> +        event_tap_net_flush(&log->net_req);
>>>>>>> +        event_tap_free_log(log);
>>>>>>> +        break;
>>>>>>> +    case EVENT_TAP_BLK:
>>>>>>> +        event_tap_blk_flush(&log->blk_req);
>>>>>>> +        break;
>>>>>>> +    default:
>>>>>>> +        error_report("Unknown state %d", log->mode);
>>>>>>> +        event_tap_free_log(log);
>>>>>>> +        return -EINVAL;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    ret = QTAILQ_EMPTY(&event_list);
>>>>>>> +    event_tap_state = ret ? EVENT_TAP_ON : EVENT_TAP_FLUSH;
>>>>>>> +
>>>>>>> +    return ret;
>>>>>>> +}
>>>>>>> +
>>>>>>> +void event_tap_flush(void)
>>>>>>> +{
>>>>>>> +    int ret;
>>>>>>> +
>>>>>>> +    do {
>>>>>>> +        ret = event_tap_flush_one();
>>>>>>> +    } while (ret == 0);
>>>>>>> +
>>>>>>> +    if (ret < 0) {
>>>>>>> +        error_report("error flushing event-tap requests");
>>>>>>> +        abort();
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_replay(void *opaque, int running, int reason)
>>>>>>> +{
>>>>>>> +    EventTapLog *log, *next;
>>>>>>> +
>>>>>>> +    if (!running) {
>>>>>>> +        return;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    assert(event_tap_state == EVENT_TAP_LOAD);
>>>>>>> +
>>>>>>> +    event_tap_state = EVENT_TAP_REPLAY;
>>>>>>> +
>>>>>>> +    QTAILQ_FOREACH(log, &event_list, node) {
>>>>>>> +        if ((log->mode & ~EVENT_TAP_TYPE_MASK) == EVENT_TAP_NET) {
>>>>>>> +            EventTapNetReq *net_req = &log->net_req;
>>>>>>> +            if (!net_req->async) {
>>>>>>> +                event_tap_net_flush(net_req);
>>>>>>> +                continue;
>>>>>>> +            }
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        switch (log->mode & EVENT_TAP_TYPE_MASK) {
>>>>>>> +        case EVENT_TAP_IOPORT:
>>>>>>> +            switch (log->ioport.index) {
>>>>>>> +            case 0:
>>>>>>> +                cpu_outb(log->ioport.address, log->ioport.data);
>>>>>>> +                break;
>>>>>>> +            case 1:
>>>>>>> +                cpu_outw(log->ioport.address, log->ioport.data);
>>>>>>> +                break;
>>>>>>> +            case 2:
>>>>>>> +                cpu_outl(log->ioport.address, log->ioport.data);
>>>>>>> +                break;
>>>>>>> +            }
>>>>>>> +            break;
>>>>>>> +        case EVENT_TAP_MMIO:
>>>>>>> +            cpu_physical_memory_rw(log->mmio.address,
>>>>>>> +                                   log->mmio.buf,
>>>>>>> +                                   log->mmio.len, 1);
>>>>>>> +            break;
>>>>>>> +        case 0:
>>>>>>> +            trace_event_tap_replay_no_event();
>>>>>>> +            break;
>>>>>>> +        default:
>>>>>>> +            error_report("Unknown state %d", log->mode);
>>>>>>> +            QTAILQ_REMOVE(&event_list, log, node);
>>>>>>> +            event_tap_free_log(log);
>>>>>>> +            return;
>>>>>>> +        }
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    /* remove event logs from queue */
>>>>>>> +    QTAILQ_FOREACH_SAFE(log, &event_list, node, next) {
>>>>>>> +        QTAILQ_REMOVE(&event_list, log, node);
>>>>>>> +        event_tap_free_log(log);
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    event_tap_state = EVENT_TAP_OFF;
>>>>>>> +    qemu_del_vm_change_state_handler(vmstate);
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void event_tap_save(QEMUFile *f, void *opaque)
>>>>>>> +{
>>>>>>> +    EventTapLog *log;
>>>>>>> +
>>>>>>> +    QTAILQ_FOREACH(log, &event_list, node) {
>>>>>>> +        qemu_put_byte(f, log->mode);
>>>>>>> +
>>>>>>> +        switch (log->mode & EVENT_TAP_TYPE_MASK) {
>>>>>>> +        case EVENT_TAP_IOPORT:
>>>>>>> +            event_tap_ioport_save(f, &log->ioport);
>>>>>>> +            break;
>>>>>>> +        case EVENT_TAP_MMIO:
>>>>>>> +            event_tap_mmio_save(f, &log->mmio);
>>>>>>> +            break;
>>>>>>> +        case 0:
>>>>>>> +            trace_event_tap_save_no_event();
>>>>>>> +            break;
>>>>>>> +        default:
>>>>>>> +            error_report("Unknown state %d", log->mode);
>>>>>>> +            return;
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        switch (log->mode & ~EVENT_TAP_TYPE_MASK) {
>>>>>>> +        case EVENT_TAP_NET:
>>>>>>> +            event_tap_net_save(f, &log->net_req);
>>>>>>> +            break;
>>>>>>> +        case EVENT_TAP_BLK:
>>>>>>> +            event_tap_blk_save(f, &log->blk_req);
>>>>>>> +            break;
>>>>>>> +        default:
>>>>>>> +            error_report("Unknown state %d", log->mode);
>>>>>>> +            return;
>>>>>>> +        }
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    qemu_put_byte(f, 0); /* EOF */
>>>>>>> +}
>>>>>>> +
>>>>>>> +static int event_tap_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>> +{
>>>>>>> +    EventTapLog *log, *next;
>>>>>>> +    int mode;
>>>>>>> +
>>>>>>> +    event_tap_state = EVENT_TAP_LOAD;
>>>>>>> +
>>>>>>> +    QTAILQ_FOREACH_SAFE(log, &event_list, node, next) {
>>>>>>> +        QTAILQ_REMOVE(&event_list, log, node);
>>>>>>> +        event_tap_free_log(log);
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    /* loop until EOF */
>>>>>>> +    while ((mode = qemu_get_byte(f)) != 0) {
>>>>>>> +        EventTapLog *log = event_tap_alloc_log();
>>>>>>> +
>>>>>>> +        log->mode = mode;
>>>>>>> +        switch (log->mode & EVENT_TAP_TYPE_MASK) {
>>>>>>> +        case EVENT_TAP_IOPORT:
>>>>>>> +            event_tap_ioport_load(f, &log->ioport);
>>>>>>> +            break;
>>>>>>> +        case EVENT_TAP_MMIO:
>>>>>>> +            event_tap_mmio_load(f, &log->mmio);
>>>>>>> +            break;
>>>>>>> +        case 0:
>>>>>>> +            trace_event_tap_load_no_event();
>>>>>>> +            break;
>>>>>>> +        default:
>>>>>>> +            error_report("Unknown state %d", log->mode);
>>>>>>> +            event_tap_free_log(log);
>>>>>>> +            return -EINVAL;
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        switch (log->mode & ~EVENT_TAP_TYPE_MASK) {
>>>>>>> +        case EVENT_TAP_NET:
>>>>>>> +            event_tap_net_load(f, &log->net_req);
>>>>>>> +            break;
>>>>>>> +        case EVENT_TAP_BLK:
>>>>>>> +            event_tap_blk_load(f, &log->blk_req);
>>>>>>> +            break;
>>>>>>> +        default:
>>>>>>> +            error_report("Unknown state %d", log->mode);
>>>>>>> +            event_tap_free_log(log);
>>>>>>> +            return -EINVAL;
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        QTAILQ_INSERT_TAIL(&event_list, log, node);
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    return 0;
>>>>>>> +}
>>>>>>> +
>>>>>>> +void event_tap_schedule_replay(void)
>>>>>>> +{
>>>>>>> +    vmstate = qemu_add_vm_change_state_handler(event_tap_replay, NULL);
>>>>>>> +}
>>>>>>> +
>>>>>>> +void event_tap_schedule_suspend(void)
>>>>>>> +{
>>>>>>> +    vmstate = qemu_add_vm_change_state_handler(event_tap_suspend, NULL);
>>>>>>> +}
>>>>>>> +
>>>>>>> +void event_tap_init(void)
>>>>>>> +{
>>>>>>> +    QTAILQ_INIT(&event_list);
>>>>>>> +    QTAILQ_INIT(&event_pool);
>>>>>>> +    register_savevm(NULL, "event-tap", 0, 1,
>>>>>>> +                    event_tap_save, event_tap_load, &last_event_tap);
>>>>>>> +}
>>>>>>> diff --git a/event-tap.h b/event-tap.h
>>>>>>> new file mode 100644
>>>>>>> index 0000000..ab677f8
>>>>>>> --- /dev/null
>>>>>>> +++ b/event-tap.h
>>>>>>> @@ -0,0 +1,44 @@
>>>>>>> +/*
>>>>>>> + * Event Tap functions for QEMU
>>>>>>> + *
>>>>>>> + * Copyright (c) 2010 Nippon Telegraph and Telephone Corporation.
>>>>>>> + *
>>>>>>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>>>>>>> + * the COPYING file in the top-level directory.
>>>>>>> + */
>>>>>>> +
>>>>>>> +#ifndef EVENT_TAP_H
>>>>>>> +#define EVENT_TAP_H
>>>>>>> +
>>>>>>> +#include "qemu-common.h"
>>>>>>> +#include "net.h"
>>>>>>> +#include "block.h"
>>>>>>> +
>>>>>>> +int event_tap_register(int (*cb)(void));
>>>>>>> +void event_tap_unregister(void);
>>>>>>> +int event_tap_is_on(void);
>>>>>>> +void event_tap_schedule_suspend(void);
>>>>>>> +void event_tap_ioport(int index, uint32_t address, uint32_t data);
>>>>>>> +void event_tap_mmio(uint64_t address, uint8_t *buf, int len);
>>>>>>> +void event_tap_init(void);
>>>>>>> +void event_tap_flush(void);
>>>>>>> +int event_tap_flush_one(void);
>>>>>>> +void event_tap_schedule_replay(void);
>>>>>>> +
>>>>>>> +void event_tap_send_packet(VLANClientState *vc, const uint8_t *buf, int size);
>>>>>>> +ssize_t event_tap_sendv_packet_async(VLANClientState *vc,
>>>>>>> +                                     const struct iovec *iov,
>>>>>>> +                                     int iovcnt, NetPacketSent *sent_cb);
>>>>>>> +
>>>>>>> +BlockDriverAIOCB *event_tap_bdrv_aio_writev(BlockDriverState *bs,
>>>>>>> +                                            int64_t sector_num,
>>>>>>> +                                            QEMUIOVector *iov,
>>>>>>> +                                            int nb_sectors,
>>>>>>> +                                            BlockDriverCompletionFunc *cb,
>>>>>>> +                                            void *opaque);
>>>>>>> +BlockDriverAIOCB *event_tap_bdrv_aio_flush(BlockDriverState *bs,
>>>>>>> +                                           BlockDriverCompletionFunc *cb,
>>>>>>> +                                           void *opaque);
>>>>>>> +void event_tap_bdrv_flush(void);
>>>>>>> +
>>>>>>> +#endif
>>>>>>> diff --git a/qemu-tool.c b/qemu-tool.c
>>>>>>> index 392e1c9..3f71215 100644
>>>>>>> --- a/qemu-tool.c
>>>>>>> +++ b/qemu-tool.c
>>>>>>> @@ -16,6 +16,7 @@
>>>>>>>  #include "qemu-timer.h"
>>>>>>>  #include "qemu-log.h"
>>>>>>>  #include "sysemu.h"
>>>>>>> +#include "event-tap.h"
>>>>>>>
>>>>>>>  #include <sys/time.h>
>>>>>>>
>>>>>>> @@ -111,3 +112,30 @@ int qemu_set_fd_handler2(int fd,
>>>>>>>  {
>>>>>>>     return 0;
>>>>>>>  }
>>>>>>> +
>>>>>>> +BlockDriverAIOCB *event_tap_bdrv_aio_writev(BlockDriverState *bs,
>>>>>>> +                                            int64_t sector_num,
>>>>>>> +                                            QEMUIOVector *iov,
>>>>>>> +                                            int nb_sectors,
>>>>>>> +                                            BlockDriverCompletionFunc *cb,
>>>>>>> +                                            void *opaque)
>>>>>>> +{
>>>>>>> +    return NULL;
>>>>>>> +}
>>>>>>> +
>>>>>>> +BlockDriverAIOCB *event_tap_bdrv_aio_flush(BlockDriverState *bs,
>>>>>>> +                                           BlockDriverCompletionFunc *cb,
>>>>>>> +                                           void *opaque)
>>>>>>> +{
>>>>>>> +    return NULL;
>>>>>>> +}
>>>>>>> +
>>>>>>> +void event_tap_bdrv_flush(void)
>>>>>>> +{
>>>>>>> +}
>>>>>>> +
>>>>>>> +int event_tap_is_on(void)
>>>>>>> +{
>>>>>>> +    return 0;
>>>>>>> +}
>>>>>>> +
>>>>>>> diff --git a/trace-events b/trace-events
>>>>>>> index 50ac840..1af3895 100644
>>>>>>> --- a/trace-events
>>>>>>> +++ b/trace-events
>>>>>>> @@ -269,3 +269,13 @@ disable ft_trans_freeze_input(void) "backend not ready, freezing input"
>>>>>>>  disable ft_trans_put_ready(void) "file is ready to put"
>>>>>>>  disable ft_trans_get_ready(void) "file is ready to get"
>>>>>>>  disable ft_trans_cb(void *cb) "callback %p"
>>>>>>> +
>>>>>>> +# event-tap.c
>>>>>>> +disable event_tap_ignore_bh(int bh) "event_tap_bh is already scheduled %d"
>>>>>>> +disable event_tap_net_cb(char *s, ssize_t len) "%s: %zd bytes packet was sent"
>>>>>>> +disable event_tap_no_event(void) "no last_event_tap"
>>>>>>> +disable event_tap_already_used(int mode) "last_event_tap already used %d"
>>>>>>> +disable event_tap_append_packet(void) "This packet is appended"
>>>>>>> +disable event_tap_replay_no_event(void) "No event to replay"
>>>>>>> +disable event_tap_save_no_event(void) "No event to save"
>>>>>>> +disable event_tap_load_no_event(void) "No event to load"
>>>>>>> --
>>>>>>> 1.7.1.2
>>>>>>>
>>>>>>> --
>>>>>>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>>>>>>> the body of a message to address@hidden
>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>


