
Re: [Qemu-devel] [RFC v4 00/58] Memory API


From: Avi Kivity
Subject: Re: [Qemu-devel] [RFC v4 00/58] Memory API
Date: Wed, 20 Jul 2011 17:45:56 +0300
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.18) Gecko/20110621 Fedora/3.1.11-1.fc15 Thunderbird/3.1.11

On 07/20/2011 05:31 PM, Anthony Liguori wrote:
The VGA device doesn't know *if* it is mapped. It can be obstructed by
the chipset and by SMM. Other chipsets we emulate may support multiple
VGA cards.


The i440fx can support multiple VGA cards just fine.

Legacy region accesses are always routed by the PCI bus to the first PCI device that identifies itself as a graphics card.

The card is well aware of whether or not it is getting legacy VGA accesses, because only one card can register for this area.

But the current API doesn't support it. The card talks to the system address space directly.

The new API can support it just fine. But that requires having coalesced mmio in the API.


The e1000 does coalesced I/O for its memory registers. But it's
dubious how much this actually matters anymore. The original claim was
a 10% boost with iperf.

The e1000 is not performance competitive with virtio-net though so it
certainly is reasonable to assume that no one would notice if we
removed coalesced I/O from the e1000.

The e1000 NIC is the best we have for guests that don't support virtio.
It's not reasonable to reduce its performance.

So let's talk about real numbers. This is netperf with a default invocation from guest to host. All numbers are in MB/sec.

rtl8139
-------
119.45
118.12

e1000 w/coalesced mmio
----------------------
425.93
424.08

e1000 w/o coalesced mmio
------------------------
419.13
413.83

virtio-net
----------
4330.52
4419.90

So removing coalesced MMIO from the e1000 results in a massive 0.7% slowdown :-)

And while the e1000 is > 100% faster than the rtl8139, it's still an order of magnitude slower than the userspace virtio-net.
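
The comparisons drawn from the table above can be recomputed with a few lines of Python. This is only a sketch: the throughput values are copied from the runs listed above, while averaging the two runs per configuration and the names `runs`, `avg`, `slowdown`, and `ratio` are assumptions made for illustration, not part of the original measurements. On the run means it gives a loss of about 2% from dropping coalesced MMIO on the e1000, still small next to the roughly 10x gap to virtio-net.

```python
from statistics import mean

# Throughput in MB/sec, copied from the netperf runs quoted above.
runs = {
    "rtl8139": [119.45, 118.12],
    "e1000_coalesced": [425.93, 424.08],
    "e1000_uncoalesced": [419.13, 413.83],
    "virtio_net": [4330.52, 4419.90],
}

# Assumption: compare configurations by the mean of their two runs.
avg = {name: mean(values) for name, values in runs.items()}


def slowdown(baseline, candidate):
    """Fractional throughput loss of candidate relative to baseline."""
    return (baseline - candidate) / baseline


def ratio(faster, slower):
    """How many times faster one configuration is than another."""
    return faster / slower


coalescing_loss = slowdown(avg["e1000_coalesced"], avg["e1000_uncoalesced"])
e1000_vs_rtl = ratio(avg["e1000_coalesced"], avg["rtl8139"])
virtio_vs_e1000 = ratio(avg["virtio_net"], avg["e1000_coalesced"])

print(f"e1000 loss without coalesced MMIO: {coalescing_loss:.1%}")
print(f"e1000 vs rtl8139:                  {e1000_vs_rtl:.2f}x")
print(f"virtio-net vs e1000:               {virtio_vs_e1000:.2f}x")
```

With only two runs per configuration, the run-to-run spread (over 1% for the uncoalesced e1000) is of the same order as the difference being measured, so the exact percentage depends on which runs are compared.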

Fine, we can drop coalesced mmio from e1000.  But not from vga.


I'm confident that the e1000 could be improved if someone modified it to optimally use the new netdev interfaces. But no one cares that much about the performance of the e1000. And if we dropped coalesced MMIO support for the e1000, no one would notice.

Exit costs have changed dramatically over the years. Optimizations that made sense with P4-class hardware don't necessarily make sense these days. QEMU has also changed a lot, so bottlenecks are no longer where they used to be.

We either support coalesced mmio well, or not at all. Even if the API
has only one user, that doesn't excuse doing it badly.

It's not at all that black and white. We need to carefully choose what we model and then have the flexibility to break those models in the name of performance.

If we try to make everything fit elegantly into a model, we'll end up with something that's overly complex just to accommodate a single user. That's my general concern with where we're going here.

I don't think it's too bad and, as I said, I don't object to it in its current form. But I think it could be simplified. Even in its current non-simple form, it's better than what we currently have.

I'm interested in how it could be simplified. It's complicated for me as well. But I don't think a sideband API is possible.

--
error compiling committee.c: too many arguments to function



