From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PATCH] net: Forbid dealing with packets when VM is not running
Date: Tue, 19 Aug 2014 09:48:37 +0100
User-agent: Mutt/1.5.23 (2014-03-12)
* zhanghailiang (address@hidden) wrote:
> On 2014/8/18 20:27, Dr. David Alan Gilbert wrote:
> >* zhanghailiang (address@hidden) wrote:
> >>For all NICs (except virtio-net) emulated by qemu,
> >>such as e1000, rtl8139, pcnet and ne2k_pci,
> >>qemu can still receive packets when the VM is not running.
> >>If this happens in *migration's* last PAUSE-VM stage,
> >>the new dirty RAM related to the packets will be missed,
> >>and this will lead to a serious network fault in the VM.
> >
> >I'd like to understand more about when exactly this happens.
> >migration.c:migration_thread, in the last step, takes the iothread
> >lock, puts the runstate into RUN_STATE_FINISH_MIGRATE, and only then
> >calls qemu_savevm_state_complete before unlocking the iothread.
> >
>
> Hmm, sorry, the description above may not be exact.
>
> Actually, the receiving of packets happens after the migration thread
> unlocks the iothread lock (when the VM is stopped), but before the end
> of the migration (to be exact, before closing the socket fd, which is
> used to send data to the destination).
>
> I think that before closing the socket fd, we cannot be sure that all
> of the VM's RAM has been copied into the network card's memory or sent
> to the destination, so we still have a chance to modify its content.
> (If I am wrong, please let me know, thanks. ;) )
>
> If the above situation happens, it will dirty parts of the RAM that
> will be sent, and some of the newly dirtied RAM pages may be missed. As
> a result, the contents of memory on the destination are not complete,
> and this will lead to a network fault in the VM.
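
Putting the sequence you describe on a timeline, as I understand it
(function names as used in this thread):

    migration thread                        main loop / fd handlers
    ----------------                        -----------------------
    vm_stop_force_state(RUN_STATE_FINISH_MIGRATE)
    qemu_savevm_state_complete(s->file)
    qemu_mutex_unlock_iothread()
            <-- window: incoming packets can still dirty guest RAM -->
    qemu_fclose(s->file)                    /* migration ends */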
Interesting; the only reason I can think of that this would happen is
that arch_init.c:ram_save_page uses qemu_put_buffer_async to send most
of the RAM pages, and that doesn't guarantee the write happens until
later.
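
As a minimal sketch of that hazard (illustrative names, not QEMU's API;
the assumption being that the async path remembers a pointer into guest
RAM instead of copying the data):

    /* Contrast an immediate copy with a deferred (async) write that
     * only remembers a pointer into guest RAM. */
    #include <stdio.h>
    #include <string.h>

    static unsigned char guest_ram[4096];  /* stand-in for a guest page */
    static const unsigned char *pending;   /* what the async path keeps */

    static void put_buffer(unsigned char *dst, const unsigned char *src,
                           size_t n)
    {
        memcpy(dst, src, n);               /* contents captured now */
    }

    static void put_buffer_async(const unsigned char *src)
    {
        pending = src;                     /* contents read at flush time */
    }

    int main(void)
    {
        unsigned char copied[4096];

        guest_ram[0] = 1;                  /* page state at "save" time */
        put_buffer(copied, guest_ram, sizeof(guest_ram));
        put_buffer_async(guest_ram);

        guest_ram[0] = 2;                  /* a NIC dirties the page later */

        /* At flush time the async path sends the modified page. */
        printf("sync captured %d, async would send %d\n",
               copied[0], pending[0]);
        return 0;
    }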
This fix will probably help other migration code as well; postcopy
runs the migration for a lot longer after stopping the CPUs, and very
occasionally I am seeing something that might be due to this.
Dave
> Here I have made a test:
> (1) Download the new qemu.
> (2) Extend the time window between qemu_savevm_state_complete and
> migrate_fd_cleanup (where qemu_fclose(s->file) will be called) with a
> patch like the following (this increases the chance of receiving
> packets inside that window):
> diff --git a/migration.c b/migration.c
> index 8d675b3..597abf9 100644
> --- a/migration.c
> +++ b/migration.c
> @@ -614,7 +614,7 @@ static void *migration_thread(void *opaque)
>                  qemu_savevm_state_complete(s->file);
>              }
>              qemu_mutex_unlock_iothread();
> -
> +            sleep(2); /* extend the time window between stop vm and end migration */
>              if (ret < 0) {
>                  migrate_set_state(s, MIG_STATE_ACTIVE, MIG_STATE_ERROR);
>                  break;
> (3) Start the VM (suse11sp1) and, inside the VM, run 'ping xxx -i 0.1'.
> (4) Migrate the VM to another host.
>
> And the *PING* command in the VM will very likely fail with the
> message 'Destination Host Unreachable'; the NIC in the VM will stay
> unavailable unless you run 'service network restart'. (Without step 2,
> you may have to migrate many times between the two hosts before that
> happens.)
>
> Thanks,
> zhanghailiang
>
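
For reference, step (4) above can be driven from the HMP monitor with
something like the following (address, port and the destination command
line here are placeholders; the destination has to be started with a
matching -incoming option):

    dest$ qemu-system-x86_64 ... -incoming tcp:0:4444
    (qemu) migrate -d tcp:192.168.1.2:4444
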
> >If that's true, how does this problem happen - can the net code
> >still receive packets with the iothread lock taken?
> >qemu_savevm_state_complete does a call to the RAM code, so I think
> >this should get any RAM changes that happened before the lock was
> >taken.
> >
> >I'm gently worried about threading stuff as well - is all this net code
> >running off fd handlers on the main thread or are there separate threads?
> >
> >Dave
> >
> >>To avoid this, we forbid receiving packets in the generic net code
> >>when the VM is not running. Also, when the runstate changes back to
> >>running, we definitely need to flush the queues to get packets
> >>flowing again.
> >>
> >>Here we implement this in the net layer:
> >>(1) Check the VM runstate in qemu_can_send_packet.
> >>(2) Add a member 'VMChangeStateEntry *vmstate' to struct NICState,
> >>which will listen for VM runstate changes.
> >>(3) Register a handler function for VM runstate changes; when the VM
> >>changes back to running, we flush all queues in the callback function.
> >>(4) Remove the VM state check in virtio_net_can_receive.
> >>
> >>Signed-off-by: zhanghailiang <address@hidden>
> >>---
> >> hw/net/virtio-net.c |  4 ----
> >> include/net/net.h   |  2 ++
> >> net/net.c           | 32 ++++++++++++++++++++++++++++++++
> >> 3 files changed, 34 insertions(+), 4 deletions(-)
> >>
> >>diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> >>index 268eff9..287d762 100644
> >>--- a/hw/net/virtio-net.c
> >>+++ b/hw/net/virtio-net.c
> >>@@ -839,10 +839,6 @@ static int virtio_net_can_receive(NetClientState *nc)
> >>     VirtIODevice *vdev = VIRTIO_DEVICE(n);
> >>     VirtIONetQueue *q = virtio_net_get_subqueue(nc);
> >>
> >>-    if (!vdev->vm_running) {
> >>-        return 0;
> >>-    }
> >>-
> >>     if (nc->queue_index >= n->curr_queues) {
> >>         return 0;
> >>     }
> >>diff --git a/include/net/net.h b/include/net/net.h
> >>index ed594f9..a294277 100644
> >>--- a/include/net/net.h
> >>+++ b/include/net/net.h
> >>@@ -8,6 +8,7 @@
> >> #include "net/queue.h"
> >> #include "migration/vmstate.h"
> >> #include "qapi-types.h"
> >>+#include "sysemu/sysemu.h"
> >>
> >> #define MAX_QUEUE_NUM 1024
> >>
> >>@@ -96,6 +97,7 @@ typedef struct NICState {
> >>     NICConf *conf;
> >>     void *opaque;
> >>     bool peer_deleted;
> >>+    VMChangeStateEntry *vmstate;
> >> } NICState;
> >>
> >> NetClientState *qemu_find_netdev(const char *id);
> >>diff --git a/net/net.c b/net/net.c
> >>index 6d930ea..21f0d48 100644
> >>--- a/net/net.c
> >>+++ b/net/net.c
> >>@@ -242,6 +242,29 @@ NetClientState *qemu_new_net_client(NetClientInfo *info,
> >>     return nc;
> >> }
> >>
> >>+static void nic_vmstate_change_handler(void *opaque,
> >>+                                       int running,
> >>+                                       RunState state)
> >>+{
> >>+    NICState *nic = opaque;
> >>+    NetClientState *nc;
> >>+    int i, queues;
> >>+
> >>+    if (!running) {
> >>+        return;
> >>+    }
> >>+
> >>+    queues = MAX(1, nic->conf->peers.queues);
> >>+    for (i = 0; i < queues; i++) {
> >>+        nc = &nic->ncs[i];
> >>+        if (nc->receive_disabled
> >>+            || (nc->info->can_receive && !nc->info->can_receive(nc))) {
> >>+            continue;
> >>+        }
> >>+        qemu_flush_queued_packets(nc);
> >>+    }
> >>+}
> >>+
> >> NICState *qemu_new_nic(NetClientInfo *info,
> >>                        NICConf *conf,
> >>                        const char *model,
> >>@@ -259,6 +282,8 @@ NICState *qemu_new_nic(NetClientInfo *info,
> >>     nic->ncs = (void *)nic + info->size;
> >>     nic->conf = conf;
> >>     nic->opaque = opaque;
> >>+    nic->vmstate = qemu_add_vm_change_state_handler(nic_vmstate_change_handler,
> >>+                                                    nic);
> >>
> >>     for (i = 0; i < queues; i++) {
> >>         qemu_net_client_setup(&nic->ncs[i], info, peers[i], model, name,
> >>@@ -379,6 +404,7 @@ void qemu_del_nic(NICState *nic)
> >>         qemu_free_net_client(nc);
> >>     }
> >>
> >>+    qemu_del_vm_change_state_handler(nic->vmstate);
> >>     g_free(nic);
> >> }
> >>
> >>@@ -452,6 +478,12 @@ void qemu_set_vnet_hdr_len(NetClientState *nc, int len)
> >>
> >> int qemu_can_send_packet(NetClientState *sender)
> >> {
> >>+    int vmstat = runstate_is_running();
> >>+
> >>+    if (!vmstat) {
> >>+        return 0;
> >>+    }
> >>+
> >>     if (!sender->peer) {
> >>         return 1;
> >>     }
> >>--
> >>1.7.12.4
> >>
> >>
> >>
> >--
> >Dr. David Alan Gilbert / address@hidden / Manchester, UK
> >
> >.
> >
>
>
>
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK