[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] virtio-net: Don't pass NULL peer to tap routine
From: |
Markus Armbruster |
Subject: |
Re: [Qemu-devel] [PATCH] virtio-net: Don't pass NULL peer to tap routines |
Date: |
Fri, 24 Sep 2010 11:31:32 +0200 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux) |
Alex Williamson <address@hidden> writes:
> On Thu, 2010-09-23 at 12:43 -0500, Anthony Liguori wrote:
>> On 09/22/2010 02:52 PM, Alex Williamson wrote:
>> > During a hotplug, the netdev might be removed before the
unplug?
>> > connected virtio device. When this happens, the guest might
>> > be running cleanup operations that can trigger a segfault in
>> > qemu. Avoid one set of these by checking whether the peer
>> > device is present before trying to do tap operations.
>> >
>> > Signed-off-by: Alex Williamson<address@hidden>
>> >
>>
>> Can you explain this scenario a little better?
>>
>> If nc.peer is NULL when set_features is called, it would seem to me like
>> we're in a pretty critical state. I agree that we shouldn't set fault,
>> but I wonder if the real bug is that this can happen at all.
>
> Unfortunately that critical state happens all the time since device_del
> does an asynchronous ACPI call into the guest and libvirt isn't blocked
> waiting for that to complete and doesn't poll to see if the device goes
> away. So it's actually pretty common today that the netdev disappears
> before the device. We talked about this in the community call on
> Tuesday, and I think Michael is trying to think of a way to solve this,
> perhaps by separating the guest releasing the device from the device
> removal.
>
> In the mean time, virtio-net has this hole that seems like it can be
> avoided by simply checking some pointers on a slow path. Since the
> netdev has already disappeared, attempting to set features on it seems
> pointless. The change in the load function is really just a paranoia
> check since it followed the same model of calling tap_*() funcs w/o
> checking the value of nc.peer. Thanks,
I figure we should either make netdev_del fail when the netdev is in
use, or make its users cope graciously with the netdev going away (make
it look like somebody yanked the cable).