Re: [Qemu-devel] TCP Segmentation Offloading

From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] TCP Segmentation Offloading
Date: Fri, 6 May 2016 17:28:55 +0100
User-agent: Mutt/1.6.0 (2016-04-01)
On Fri, May 06, 2016 at 06:34:33AM +0200, Ingo Krabbe wrote:
> > On Sun, May 01, 2016 at 02:31:57PM +0200, Ingo Krabbe wrote:
> >> Good Mayday Qemu Developers,
> >>
> >> today I tried to find a reference to a networking problem that seems to
> >> be of a quite general nature: TCP Segmentation Offloading (TSO) in virtual
> >> environments.
> >>
> >> When I set up a TAP network adapter for a virtual machine and put it into a
> >> host bridge, the known best practice is to manually set "tso off gso off"
> >> with ethtool for the guest driver if I use a hardware emulation such as
> >> e1000, and/or "tso off gso off" for the host driver and/or for the bridge
> >> adapter if I use the virtio driver, as otherwise you experience
> >> (sometimes?) performance problems or even lost packets.
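[For reference, the workaround described above is usually applied with ethtool. A minimal sketch; the interface names tap0 and br0 are placeholders for your actual tap and bridge devices, and the commands are printed rather than executed so they can be reviewed first:]

```shell
# Placeholder device names; substitute your own tap and bridge interfaces.
# Echoed instead of run so the commands can be inspected before applying.
for dev in tap0 br0; do
    echo ethtool -K "$dev" tso off gso off
done
```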
> >
> > I can't parse this sentence. In what cases do you think it's a "known
> > best practice" to disable tso and gso? Maybe a table would be a clearer
> > way to communicate this.
> >
> > Can you provide a link to the source claiming tso and gso should be
> > disabled?
>
> Sorry for that long sentence. The consequence seems to be that it is most
> stable to turn off tso and gso for host bridges and for adapters in virtual
> machines.
>
> One of the most comprehensive collections of arguments is this article
>
>
> https://kris.io/2015/10/01/kvm-network-performance-tso-and-gso-turn-it-off/
>
> while I also found documentation for CentOS 6
>
>
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Host_Configuration_and_Guest_Installation_Guide/ch10s04.html
This documentation is about (ancient) RHEL 3.9 guests. I would not
apply anything on that page to modern Linux distro releases without
re-checking.
>
> On Google Code this one is discussed
>
> https://code.google.com/p/ganeti/wiki/PerformanceTuning
>
> Of course the same is found for Xen machines
>
> http://cloudnull.io/2012/07/xenserver-network-tuning/
>
> You see, there are several links on the internet, and my first question is:
> why can't I find this discussion in the qemu wiki space?
>
> I think the bug
>
> https://bugs.launchpad.net/bugs/1202289
>
> is related.
Thanks for posting all the links!
I hope Michael and/or Jason explain the current status for RHEL 6/7 and
other modern distros. Maybe they can also follow up with the kris.io
blog author if an update to the post is necessary.
TSO/GSO is enabled by default on my Fedora and RHEL host/guests. If it
was a best practice for those distros I'd expect the default settings to
reflect that. Also, I would be surprised if the offload features were
bad since work was put into supporting and extending them in virtio-net
over the years.
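[For anyone wanting to verify their own defaults, `ethtool -k <iface>` reports the current offload state. A sketch of filtering its output for the relevant lines; the sample text below is an abbreviated stand-in, not output from a real run:]

```shell
# Abbreviated stand-in for real `ethtool -k eth0` output, used to show
# which lines indicate the TSO/GSO state:
sample='tcp-segmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: on'
printf '%s\n' "$sample" | awk -F': ' '/segmentation-offload/ { print $1 "=" $2 }'
```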
> >> I haven't found a complete analysis of the background of these problems,
> >> but there seem to be some effects on MTU based fragmentation and UDP
> >> checksums.
> >>
> >> There is a tso-related bug on launchpad, but the context of this bug is
> >> too narrow for the generality of the problem.
> >>
> >> Also it seems that there is a problem in LXC contexts too (I found such a
> >> reference, without a detailed description, in a post about a Xen setup).
> >>
> >> My question now is: Is there a bug in the driver code, and shouldn't this
> >> be documented somewhere on wiki.qemu.org? Were there developments on
> >> this topic in the past, or is there any planned/ongoing work to do on the
> >> qemu drivers?
> >>
> >> Most problem reports I found relate to deprecated CentOS 6 qemu-kvm packages.
> >>
> >> In our company we have similar or even worse problems with CentOS 7 hosts
> >> and guest machines.
> >
> > You haven't explained what problem you are experiencing. If you want
> > help with your setup, please include your QEMU command-line (ps aux |
> > grep qemu), the traffic pattern (ideally how to reproduce it with a
> > benchmarking tool), and what observation you are making (e.g. netstat
> > counters showing dropped packets).
>
> I was quite astonished by the many hints about virtio drivers, as we had
> this problem with the e1000 driver in a CentOS 7 guest on a CentOS 6 host.
>
> e1000 0000:00:03.0 ens3: Detected Tx Unit Hang
>   Tx Queue             <0>
>   TDH                  <42>
>   TDT                  <42>
>   next_to_use          <2e>
>   next_to_clean        <42>
> buffer_info[next_to_clean]
>   time_stamp           <104aff1b8>
>   next_to_watch        <44>
>   jiffies              <104b00ee9>
>   next_to_watch.status <0>
> Apr 25 21:08:48 db03 kernel: ------------[ cut here ]------------
> Apr 25 21:08:48 db03 kernel: WARNING: at net/sched/sch_generic.c:297
> dev_watchdog+0x270/0x280()
> Apr 25 21:08:48 db03 kernel: NETDEV WATCHDOG: ens3 (e1000): transmit
> queue 0 timed out
> Apr 25 21:08:48 db03 kernel: Modules linked in: binfmt_misc ipt_REJECT
> nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip6t_REJECT nf_conntrack_ipv6
> nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables btrfs
> zlib_deflate raid6_pq xor ext4 mbcache jbd2 crc32_pclmul ghash_clmulni_intel
> aesni_intel lrw gf128mul glue_helper ablk_helper i2c_piix4 ppdev cryptd
> pcspkr virtio_balloon parport_pc parport sg nfsd auth_rpcgss nfs_acl lockd
> grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic
> ata_generic pata_acpi virtio_scsi cirrus syscopyarea sysfillrect sysimgblt
> drm_kms_helper ttm drm crct10dif_pclmul crct10dif_common ata_piix
> crc32c_intel virtio_pci e1000 i2c_core virtio_ring libata serio_raw virtio
> floppy dm_mirror dm_region_hash dm_log dm_mod
> Apr 25 21:08:48 db03 kernel: CPU: 2 PID: 0 Comm: swapper/2 Not tainted
> 3.10.0-327.13.1.el7.x86_64 #1
> Apr 25 21:08:48 db03 kernel: Hardware name: Red Hat KVM, BIOS 0.5.1
> 01/01/2007
> Apr 25 21:08:48 db03 kernel: ffff88126f483d88 685d892e8a452abb
> ffff88126f483d40 ffffffff8163571c
> Apr 25 21:08:48 db03 kernel: ffff88126f483d78 ffffffff8107b200
> 0000000000000000 ffff881203b9a000
> Apr 25 21:08:48 db03 kernel: ffff881201c3e080 0000000000000001
> 0000000000000002 ffff88126f483de0
> Apr 25 21:08:48 db03 kernel: Call Trace:
> Apr 25 21:08:48 db03 kernel: <IRQ> [<ffffffff8163571c>]
> dump_stack+0x19/0x1b
> Apr 25 21:08:48 db03 kernel: [<ffffffff8107b200>]
> warn_slowpath_common+0x70/0xb0
> Apr 25 21:08:48 db03 kernel: [<ffffffff8107b29c>]
> warn_slowpath_fmt+0x5c/0x80
> Apr 25 21:08:48 db03 kernel: [<ffffffff8154cd40>]
> dev_watchdog+0x270/0x280
> Apr 25 21:08:48 db03 kernel: [<ffffffff8154cad0>] ?
> dev_graft_qdisc+0x80/0x80
> Apr 25 21:08:48 db03 kernel: [<ffffffff8108b0a6>]
> call_timer_fn+0x36/0x110
> Apr 25 21:08:48 db03 kernel: [<ffffffff8154cad0>] ?
> dev_graft_qdisc+0x80/0x80
> Apr 25 21:08:48 db03 kernel: [<ffffffff8108dd97>]
> run_timer_softirq+0x237/0x340
> Apr 25 21:08:48 db03 kernel: [<ffffffff81084b0f>]
> __do_softirq+0xef/0x280
> Apr 25 21:08:48 db03 kernel: [<ffffffff816477dc>] call_softirq+0x1c/0x30
> Apr 25 21:08:48 db03 kernel: [<ffffffff81016fc5>] do_softirq+0x65/0xa0
> Apr 25 21:08:48 db03 kernel: [<ffffffff81084ea5>] irq_exit+0x115/0x120
> Apr 25 21:08:48 db03 kernel: [<ffffffff81648455>]
> smp_apic_timer_interrupt+0x45/0x60
> Apr 25 21:08:48 db03 kernel: [<ffffffff81646b1d>]
> apic_timer_interrupt+0x6d/0x80
> Apr 25 21:08:48 db03 kernel: <EOI> [<ffffffff81058e96>] ?
> native_safe_halt+0x6/0x10
> Apr 25 21:08:48 db03 kernel: [<ffffffff8101dbcf>] default_idle+0x1f/0xc0
> Apr 25 21:08:48 db03 kernel: [<ffffffff8101e4d6>]
> arch_cpu_idle+0x26/0x30
> Apr 25 21:08:48 db03 kernel: [<ffffffff810d6325>]
> cpu_startup_entry+0x245/0x290
> Apr 25 21:08:48 db03 kernel: [<ffffffff810475fa>]
> start_secondary+0x1ba/0x230
> Apr 25 21:08:48 db03 kernel: ---[ end trace 71ac4360272e207e ]---
> Apr 25 21:08:48 db03 kernel: e1000 0000:00:03.0 ens3: Reset adapter
>
>
> I'm still not sure why this happens on this host "db03" while db02 and db01
> are not affected. All guests are running on different hosts, and the network
> is controlled by an openvswitch.
This looks interesting. It could be a bug in QEMU's e1000 NIC
emulation. Maybe it has already been fixed in qemu.git but I didn't see
any relevant commits.
Please post the RPM version numbers you are using (rpm -qa | grep qemu
on the host, rpm -qa | grep kernel on the host).
The e1000 driver can print additional information (to dump the contents
of the tx ring). Please increase your kernel's log level to collect
that information:
# echo 8 >/proc/sys/kernel/printk
The tx ring dump may allow someone to figure out why the packet caused
tx to stall.
Stefan