Re: [Qemu-devel] Intermittent e1000 failure on qemu-kvm 1.0
From: Chris Webb
Subject: Re: [Qemu-devel] Intermittent e1000 failure on qemu-kvm 1.0
Date: Tue, 3 Apr 2012 13:42:18 +0100
User-agent: Mutt/1.5.20 (2009-06-14)
Stefan Hajnoczi <address@hidden> writes:
> In a case like this it might be most effective to catch a VM in the
> bad state and then go in with gdb to see what is broken. The basic
> approach would be putting breakpoints on the e1000 device model's
> transmit/receive paths to see if the guest is giving us packets and
> whether the tap device is transmitting/receiving. If guest and host
> appear to be working then QEMU's e1000 model must be in a bad state
> and it's a question of looking at the tx/rx rings and other hardware
> emulation state to figure out what went wrong.
Hi Stefan. I tried setting a breakpoint on start_xmit, but qemu blew up
when I hit it:
(gdb) break /home/root/packages/qemu-kvm-1.0/src-hrw66F/hw/e1000.c:start_xmit
Function "start_xmit" not defined.
Make breakpoint pending on future shared library load? (y or [n]) n
(gdb) break /home/root/packages/qemu-kvm-1.0/src-hrw66F/hw/e1000.c:528
Breakpoint 1 at 0x46dcd6: file /home/root/packages/qemu-kvm-1.0/src-hrw66F/hw/e1000.c, line 528.
(gdb) cont
Continuing.
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
The program no longer exists.
I assume this is some subtlety with breakpointing threaded code?
However, along these lines, I note that the guest's RX counter shows it has
received some packets, but that count is stuck at 1993 bytes. The TX count
marches upwards as I ping outbound from the guest.
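The stuck-RX, rising-TX observation can be checked mechanically rather than by eye. Below is a minimal sketch, assuming a Linux guest that exposes per-interface counters under /sys/class/net/<iface>/statistics/. The function names and the `sysfs_root` parameter are illustrative only and come from neither QEMU nor this thread.

```python
from pathlib import Path

COUNTERS = ("rx_bytes", "tx_bytes")


def read_counters(iface, sysfs_root="/sys/class/net"):
    """Return the current rx/tx byte counters for one interface as a dict."""
    stats = Path(sysfs_root) / iface / "statistics"
    return {name: int((stats / name).read_text()) for name in COUNTERS}


def stalled_counters(before, after):
    """List the counters that did not change between two samples."""
    return [name for name in COUNTERS if after[name] == before[name]]
```

To use it, sample `read_counters("eth0")` (the interface name is an assumption), ping outbound for a few seconds, sample again, and pass both samples to `stalled_counters`. A result of `["rx_bytes"]` corresponds to the state described above.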
If I attach a tcpdump to tap1 on the host, I see the ARP requests going out and
apparently no reply:
0024# tcpdump -i tap1
tcpdump: WARNING: tap1: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap1, link-type EN10MB (Ethernet), capture size 65535 bytes
12:08:35.654992 ARP, Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:08:36.654976 ARP, Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:08:37.654975 ARP, Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:08:38.670933 ARP, Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:08:39.670922 ARP, Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:08:40.670908 ARP, Request who-has 84.45.8.129 tell 84.45.8.242, length 28
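For longer captures, the request/reply balance can be tallied automatically. This is a rough sketch that counts ARP requests and replies in tcpdump's default text output. It assumes the timestamped line format shown in the capture above, and the name `count_arp` is made up for illustration.

```python
import re
from collections import Counter

# Matches a timestamped tcpdump line for an ARP packet and captures whether
# it is a Request or a Reply. Wrapped continuation lines are ignored.
ARP_LINE = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d+ ARP, .*?\b(Request|Reply)\b")


def count_arp(lines):
    """Count ARP requests and replies in an iterable of tcpdump text lines."""
    counts = Counter()
    for line in lines:
        match = ARP_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Running it over saved captures from both tap1 and br0 gives a quick side-by-side view of where the replies stop.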
Looking on br0, I do seem to see the replies:
12:12:53.509471 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:12:53.509914 ARP, Ethernet (len 6), IPv4 (len 4), Reply 84.45.8.129 is-at 00:13:c3:35:a6:42 (oui Unknown), length 46
12:12:54.509455 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:12:54.509875 ARP, Ethernet (len 6), IPv4 (len 4), Reply 84.45.8.129 is-at 00:13:c3:35:a6:42 (oui Unknown), length 46
12:12:55.509447 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:12:55.509878 ARP, Ethernet (len 6), IPv4 (len 4), Reply 84.45.8.129 is-at 00:13:c3:35:a6:42 (oui Unknown), length 46
12:12:56.525424 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:12:56.525854 ARP, Ethernet (len 6), IPv4 (len 4), Reply 84.45.8.129 is-at 00:13:c3:35:a6:42 (oui Unknown), length 46
12:12:57.525408 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 84.45.8.129 tell 84.45.8.242, length 28
12:12:57.525837 ARP, Ethernet (len 6), IPv4 (len 4), Reply 84.45.8.129 is-at 00:13:c3:35:a6:42 (oui Unknown), length 46
but they never get to tap1 despite STP being disabled and no bridge port
filtering:
# ebtables -L
Bridge table: filter
Bridge chain: INPUT, entries: 0, policy: ACCEPT
Bridge chain: FORWARD, entries: 0, policy: ACCEPT
Bridge chain: OUTPUT, entries: 0, policy: ACCEPT
# brctl show br0
bridge name     bridge id               STP enabled     interfaces
br0             8000.002590224ffa       no              eth0
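Since the brctl output as pasted lists only eth0 on its one row, it may be worth confirming which ports the bridge actually has. On Linux each port of a bridge appears as an entry under /sys/class/net/<bridge>/brif/. The following is a small sketch built on that, and the function name and the `sysfs_root` parameter are illustrative assumptions.

```python
from pathlib import Path


def bridge_ports(bridge, sysfs_root="/sys/class/net"):
    """Return the sorted port names attached to a Linux bridge.

    Returns an empty list if the bridge (or its brif directory) is absent.
    """
    brif = Path(sysfs_root) / bridge / "brif"
    if not brif.is_dir():
        return []
    return sorted(entry.name for entry in brif.iterdir())
```

On the host, `"tap1" in bridge_ports("br0")` would show whether the tap device is really attached while the guest is in the bad state.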
This looks uncannily like a kernel problem, doesn't it? However, if I remove
-usbdevice tablet, the problem goes away, which is truly weird! I've just done a
hundred successful reboots without it once again to confirm to myself that I'm
definitely not imagining that behaviour.
> Have you tried unloading the e1000 kernel module inside the guest and
> then modprobing it again? Does this "fix" the issue?
Hadn't thought of that, but no, it apparently has no effect. It's still broken
after I rmmod it, modprobe it again, and reconfigure the networking.
Cheers,
Chris.