qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] virtio-net: announce self by guest


From: Jason Wang
Subject: Re: [Qemu-devel] [PATCH] virtio-net: announce self by guest
Date: Fri, 16 May 2014 13:02:51 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0

On 05/15/2014 05:45 PM, Michael S. Tsirkin wrote:
> On Thu, May 15, 2014 at 05:22:28PM +0800, Jason Wang wrote:
>> On 05/15/2014 04:28 PM, Michael S. Tsirkin wrote:
>>> Thanks, looks good.
>>> Some minor comments below,
>>>
>>> On Thu, May 15, 2014 at 03:16:47PM +0800, Jason Wang wrote:
>>>> It's hard to track all mac addresses and their configurations (e.g
>>>> vlan or ipv6) in qemu. Without those informations, it's impossible to
>>> s/those informations/this information/
>>>
>>>> build proper garp packet after migration. The only possible solution
>>>> to this is let guest (who knew all configurations) to do this.
>>> s/knew/knows/
>>>
>>>> So, this patch introduces a new readonly config status bit of virtio-net,
>>>> VIRTIO_NET_S_ANNOUNCE which is used to notify guest to announce
>>>> presence of its link through config update interrupt.When guest has
>>>> done the announcement, it should ack the notification through
>>>> VIRTIO_NET_CTRL_ANNOUNCE_ACK cmd. This feature is negotiated by a new
>>>> feature bit VIRTIO_NET_F_ANNOUNCE (which has already been supported by
>>>> Linux guest).
>>>>
>>>> During load, a counter of announcing rounds were set so that the after
>>> s/were/is/
>>> s/the after/after/
>> Will correct those typos.
>>>> the vm is running it can trigger rounds of config interrupts to notify
>>>> the guest to build and send the correct garps.
>>>>
>>>> Tested with ping to guest with vlan during migration. Without the
>>>> patch, lots of the packets were lost after migration. With the patch,
>>>> could not notice packet loss after migration.
>>> below changelog should go after ---, until the ack list.
>>>
>> Ok.
>>>> Reference:
>>>> RFC v2: https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg01750.html
>>>> RFC v1: https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg02648.html
>>>> V7:     https://lists.gnu.org/archive/html/qemu-devel/2013-03/msg01127.html
>>>>
>>>> Changes from RFC v2:
>>>> - use QEMU_CLOCK_VIRTUAL instead of QEMU_CLOCK_REALTIME
>>>> - compat self announce for 2.0 machine type
>>>>
>>>> Changes from RFC v1:
>>>> - clean VIRTIO_NET_S_ANNOUNCE bit during reset
>>>> - free announce timer during clean
>>>> - make announce work for non-vhost case
>>>>
>>>> Changes from V7:
>>>> - Instead of introducing a global method for each kind of nic, this
>>>>   version limits the changes to virtio-net itself.
>>>>
>>>> Cc: Liuyongan <address@hidden>
>>>> Cc: Amos Kong <address@hidden>
>>>> Signed-off-by: Jason Wang <address@hidden>
>>>> ---
>>>>  hw/net/virtio-net.c            |   48 
>>>> ++++++++++++++++++++++++++++++++++++++++
>>>>  include/hw/i386/pc.h           |    5 ++++
>>>>  include/hw/virtio/virtio-net.h |   16 +++++++++++++
>>>>  3 files changed, 69 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>>>> index 940a7cf..98d59e9 100644
>>>> --- a/hw/net/virtio-net.c
>>>> +++ b/hw/net/virtio-net.c
>>>> @@ -25,6 +25,7 @@
>>>>  #include "monitor/monitor.h"
>>>>  
>>>>  #define VIRTIO_NET_VM_VERSION    11
>>>> +#define VIRTIO_NET_ANNOUNCE_ROUNDS    3
>>>>  
>>>>  #define MAC_TABLE_ENTRIES    64
>>>>  #define MAX_VLAN    (1 << 12)   /* Per 802.1Q definition */
>>> I would make it  5 to be consistent with SELF_ANNOUNCE_ROUNDS
>>> in savevm.c
>>>
>>> in fact, let's move SELF_ANNOUNCE_ROUNDS to include/migration/vmstate.h
>>> and reuse it.
>> Ok.
>>>> @@ -99,6 +100,25 @@ static bool virtio_net_started(VirtIONet *n, uint8_t 
>>>> status)
>>>>          (n->status & VIRTIO_NET_S_LINK_UP) && vdev->vm_running;
>>>>  }
>>>>  
>>>> +static void virtio_net_announce_timer(void *opaque)
>>>> +{
>>>> +    VirtIONet *n = opaque;
>>>> +    VirtIODevice *vdev = VIRTIO_DEVICE(n);
>>>> +
>>>> +    if (n->announce &&
>>> I would make it > 0 here, just in case it becomes negative as a result
>>> of some bug.
>> Sure.
>>>> +        virtio_net_started(n, vdev->status) &&
>>>> +        vdev->guest_features & (0x1 << VIRTIO_NET_F_GUEST_ANNOUNCE) &&
>>>> +        vdev->guest_features & (0x1 << VIRTIO_NET_F_CTRL_VQ)) {
>>>> +
>>>> +        n->announce--;
>>>> +        n->status |= VIRTIO_NET_S_ANNOUNCE;
>>>> +        virtio_notify_config(vdev);
>>>> +    } else {
>>>> +        timer_del(n->announce_timer);
>>> why do this here?
>>>
>>>> +        n->announce = 0;
>>> why is this here?
>>>
>> If guest shuts down the device, there's no need to do the announcing.
> It's still weird.
> Either announce is 0 and then timer was not running
> or it's > 0 and then this won't trigger.

Right, the logic here is for QEMU_CLOCK_REALTIME. But there's another
question, we use QEMU_CLOCK_VIRTUAL while qemu is using
QEMU_CLOCK_REALTIME for its announcing. This looks fine but not sure
whether this is safe. And if the link was down, it's better for us to
stop the announcing?

>
>>>> +    }
>>>> +}
>>>> +
>>>>  static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
>>>>  {
>>>>      VirtIODevice *vdev = VIRTIO_DEVICE(n);
>>>> @@ -147,6 +167,8 @@ static void virtio_net_set_status(struct VirtIODevice 
>>>> *vdev, uint8_t status)
>>>>  
>>>>      virtio_net_vhost_status(n, status);
>>>>  
>>>> +    virtio_net_announce_timer(n);
>>>> +
>>> why do this here? why not right after we set announce counter?
>> The reasons are:
>>
>> - The counters were set in load, but the device is not running so we
>> could not inject the interrupt at that time.
> I see. This makes sense - but this isn't intuitive.
> Why don't we simply start timer with current time?
> Need to make sure it runs fine if time passes, but
> I think it's fine.

Not sure I get the point, I didn't see any differences except for an
extra timer fire.
>
>> - We can stop the progress when guest is shutting down the device.
> On shut down guest will reset device stopping timer - this seems enough.

Yes, I see.
>>>>      for (i = 0; i < n->max_queues; i++) {
>>>>          q = &n->vqs[i];
>>>>  
>>>> @@ -322,6 +344,9 @@ static void virtio_net_reset(VirtIODevice *vdev)
>>>>      n->nobcast = 0;
>>>>      /* multiqueue is disabled by default */
>>>>      n->curr_queues = 1;
>>>> +    timer_del(n->announce_timer);
>>>> +    n->announce = 0;
>>>> +    n->status &= ~VIRTIO_NET_S_ANNOUNCE;
>>>>  
>>>>      /* Flush any MAC and VLAN filter table state */
>>>>      n->mac_table.in_use = 0;
>>>> @@ -731,6 +756,22 @@ static int virtio_net_handle_vlan_table(VirtIONet *n, 
>>>> uint8_t cmd,
>>>>      return VIRTIO_NET_OK;
>>>>  }
>>>>  
>>>> +static int virtio_net_handle_announce(VirtIONet *n, uint8_t cmd,
>>>> +                                      struct iovec *iov, unsigned int 
>>>> iov_cnt)
>>>> +{
>>>> +    if (cmd == VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
>>>> +        n->status &= ~VIRTIO_NET_S_ANNOUNCE;
>>>> +        if (n->announce) {
>>> I would make it > 0 here, just in case it becomes negative as a result
>>> of some bug.
>> Ok.
>>>> +            timer_mod(n->announce_timer,
>>>> +                      qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 50 +
>>>> +                      (VIRTIO_NET_ANNOUNCE_ROUNDS - n->announce - 1) * 
>>>> 100);
>>> savevm.c has this code, and a comment:
>>>         /* delay 50ms, 150ms, 250ms, ... */
>>>     50 + (SELF_ANNOUNCE_ROUNDS - count - 1) * 100
>>>
>>> how about an API in include/migration/vmstate.h
>>>
>>> static inline
>>> int64_t self_announce_delay(int round)
>>> {
>>>     assert(round < SELF_ANNOUNCE_ROUNDS && round > 0);
>>>         /* delay 50ms, 150ms, 250ms, ... */
>>>     return 50 + (SELF_ANNOUNCE_ROUNDS - round - 1) * 100;
>>> }
>>>
>>> or something to this end.
>>>
>> Good idea, will do this.
>>>> +        }
>>>> +        return VIRTIO_NET_OK;
>>>> +    } else {
>>>> +        return VIRTIO_NET_ERR;
>>>> +    }
>>>> +}
>>>> +
>>>>  static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd,
>>>>                                  struct iovec *iov, unsigned int iov_cnt)
>>>>  {
>>>> @@ -794,6 +835,8 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
>>>> VirtQueue *vq)
>>>>              status = virtio_net_handle_mac(n, ctrl.cmd, iov, iov_cnt);
>>>>          } else if (ctrl.class == VIRTIO_NET_CTRL_VLAN) {
>>>>              status = virtio_net_handle_vlan_table(n, ctrl.cmd, iov, 
>>>> iov_cnt);
>>>> +        } else if (ctrl.class == VIRTIO_NET_CTRL_ANNOUNCE) {
>>>> +            status = virtio_net_handle_announce(n, ctrl.cmd, iov, 
>>>> iov_cnt);
>>>>          } else if (ctrl.class == VIRTIO_NET_CTRL_MQ) {
>>>>              status = virtio_net_handle_mq(n, ctrl.cmd, iov, iov_cnt);
>>>>          } else if (ctrl.class == VIRTIO_NET_CTRL_GUEST_OFFLOADS) {
>>>> @@ -1451,6 +1494,7 @@ static int virtio_net_load(QEMUFile *f, void 
>>>> *opaque, int version_id)
>>>>          qemu_get_subqueue(n->nic, i)->link_down = link_down;
>>>>      }
>>>>  
>>>> +    n->announce = VIRTIO_NET_ANNOUNCE_ROUNDS;
>>> Well if virtio_net_handle_announce runs now it will start timer
>>> in the past, right?
>>> This seems wrong.
>> Not sure I get the case. When in virtio_net_load() the vm is not even
>> running so looks like virtio_net_handle_announce() could not run in the
>> same time.
> I see, this works because you decrement it when VM starts running.
> I think it's not a good idea to rely on this though,
> better do everything from timer, and handle
> the case of command arriving too early.
>

Right, if QEMU_CLOCK_VIRTUAL is fine, we can do everything in a timer.

For the case of command arriving too early. Is there anything else we
need to do? Since we only start the next timer when we get ack command.

Thanks



reply via email to

[Prev in Thread] Current Thread [Next in Thread]