qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 2/2] e1000: decrement RDT if equal to RDH


From: Jason Wang
Subject: Re: [Qemu-devel] [PATCH v2 2/2] e1000: decrement RDT if equal to RDH
Date: Wed, 14 Jan 2015 17:48:43 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0

On 01/14/2015 05:06 AM, Richard Tollerton wrote:
> Jason Wang <address@hidden> writes:
>
>> On Tue, Jan 13, 2015 at 3:12 AM, Richard Tollerton <address@hidden> wrote:
>>> On Thu, Dec 18, 2014 at 12:01:48AM -0500, Jason Wang wrote:
>>>
>>>>  > Some drivers set RDT=RDH. Oddly, this works on real hardware. To
>>>>  > work around this, autodecrement RDT when this happens.
>>>>  
>>>>  Please describe more details on the issue. The spec 3.2.6 said:
>>>> "When the head pointer is equal to the tail pointer, the ring is
>>>> empty." So RDT=RDH in fact empty the ring. No?
>>> That is incorrect; the spec explicitly states that RDT=RDH means the
>>> ring is full. The linux e1000 driver more or less implies the same
>>> thing.
>>>
>>> You forgot to include the sentence after that in section 3.2.6 :)
>>>
>>> "When the head pointer is equal to the tail pointer, the ring is
>>> empty. Hardware stops storing packets in system memory until software
>>> advances the tail pointer, making more receive buffers available."
>>>
>>> Yeah, this seems really poorly worded to me too. :( You appear to be
>>> interpreting "ring is empty" in the usual sense, i.e. "all N elements
>>> of the ring buffer are available for use by hardware". In fact, the
>>> correct interpretation [1] is the exact opposite, "none of the
>>> elements are available for use by hardware".
>> Yes, I do think 'empty' means no available buffer for device to receive 
>> :)
> Ah, ok. But if you're concerned about breaking drivers with this... what
> legitimate reason could a driver possibly have to set RDT=RDH? (Besides a
> mistaken attempt to free the ring for hardware use, as I posit?)

One example is initialization in Linux (e1000_configure_rx()). What it
does is:

    /* disable receives while setting up the descriptors */
    rctl = er32(RCTL);
    ew32(RCTL, rctl & ~E1000_RCTL_EN);
...
        ew32(RDT, 0);
        ew32(RDH, 0);
...
    /* Enable Receives */
    ew32(RCTL, rctl | E1000_RCTL_EN);

And the rx buffer allocations were done later. So with your patch, after
rx was enabled but before  rx buffer were allocated. Since RDT was set
to RDLEN, e1000_has_rxbuf() will return true. Qemu will try to receive
packet to uninitialized buffers. This seems wrong.
>
> The only reason I can think of is that maybe a driver is trying to
> implement some sort of crude flow control. But if that actually worked,
> then major packet loss would be observed under load, as any packets
> received by hardware but not yet processed by software would get
> dropped.
>
> I'm going to go out (further) on a limb and assert that no driver ever
> sets RDT=RDH to stop receives, because no hardware implements that
> behavior. The driver I'm trying to get working appears to have been
> setting RDT=RDH at the end of *every* iteration of the receive loop,
> since it was originally written in 2003. If I am to trust the comments,
> it's been ported/supported on 28 different chipset variants, and it's
> definitely been tested for performance and packet loss under load for a
> good number of those, including under polling modes where the ring is
> only checked every few milliseconds. If RDT=RDH ever did anything except
> free all buffers for hardware use, surely catastrophic packet loss would
> have been observed?

>From device's point of view. It just need stop receiving when RDT=RDH.
It does not care whether the ring was full of received packets or empty.

I probably miss something but could you please show the (pseudo) code of
your driver to see why current qemu does not work?
>
>>> The last sentence in the quote makes this explicit. See also linux
>>> e1000 driver sources at [2] [3] [4].
>> Btw, [2],[3],[4] are all codes that deal with driver's internal
>> variable, not the one that the hardware use.
> No, it's directly used by hardware -- current_count increments RDT. See
> e1000_alloc_rx_buffers().

Do you mean cleaned_count? (Didn't find current_count).
>
>>> I'm *guessing* (?) that the DMA engine isn't correspondingly stopped
>>> when software sets RDT=RDH, so that once packets start getting
>>> received,
>> Do you mean in qemu? I/O are single threaded, so looks like we are
>> safe.
> I'm referring to the possibility that physical hardware is doing this.
>
> Interesting side note: while this driver sets RDT=RDH on every
> iteration, it *initializes* RDT=0 and RDH=1...

I want to know more about this driver.

RDT=0 and RDH=1 means all buffers were available for device to receive.
If no rx buffer refilling happens, RDH will be finally advanced by
device and finally equal to RDT and then receiving is stopped. When will
driver set RDT=RDH?
>>> the hardware can more or less ignore it. In this context, my patch
>>> makes sense.
>>>
>>> (Yes, this is totally an ex-post-facto justification for the patch; it
>>> arrived to me secondhand, and I had not been familiar with the driver
>>> source before now.)
>> True, we've found many undocumented behavior in the past (some even 
>> conflicts with spec). I don't have a 82540EM in my hand, but I think 
>> the best thing is to check this behavior in real hardware to prevent 
>> this patch from breaking many existing drivers.
> Can you be more specific in regards to what information you're
> requesting? Are you wanting additional confirmation (i.e. via
> instrumented code in addition to code inspection) that setting RDT=RDH
> does not stop packet receive once it has started?
>

Just a test to see if a real 82540 card behaves like this patch. E.g.

write(RDH, 0)
write(RDT, 0)
read(RDT)

To see if RDT is zero or RDLEN.
>>> [1] http://sourceforge.net/p/e1000/mailman/message/29280078/
>> This issue mentioned in the thread seems solved.
> Indeed; I was citing this thread (and the other thread) for the
> discussions, not the issues themselves.
>
>>> [2] 
>>> http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000_main.c#L398
>>> [3] 
>>> http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000.h#L215
>>> [4] 
>>> http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000_main.c#L4302
>>> [5] http://sourceforge.net/p/e1000/mailman/message/29969887/
>> Looks like what mentioned in this thread was also solved.
>>
>> We check both RCTL and e1000_has_rxbufs() in e1000_can_receive().
>> And flush queued packets in set_rx_control().
>>>>>  > diff --git a/hw/net/e1000.c b/hw/net/e1000.c
>>>>>  > index 44ae3a8..b8cbfc1 100644
>>>>>  > --- a/hw/net/e1000.c
>>>>>  > +++ b/hw/net/e1000.c
>>>>>  > @@ -1152,6 +1152,12 @@ mac_writereg(E1000State *s, int index, 
>>>>> uint32_t val)
>>>>>  >  static void
>>>>>  >  set_rdt(E1000State *s, int index, uint32_t val)
>>>>>  >  {
>>>>>  > +    if (val == s->mac_reg[RDH]) {     /* Decrement RDT if it's 
>>>>> too big */
>>>>>  > +        if (val == 0) {
>>>>>  > +            val = s->mac_reg[RDLEN] / sizeof(struct 
>>>>> e1000_rx_desc);
>>>>>  > +        }
>>>>>  > +        val--;
>>>>>  > +    }
>>>>>  >      s->mac_reg[index] = val & 0xffff;
>>>>>  >      if (e1000_has_rxbufs(s, 1)) {
>>>>>  >          qemu_flush_queued_packets(qemu_get_queue(s->nic));
>>>>>  > --
>>>>>  > 2.1.3
>>>>>  > 
>>>>>  > 
>>>>>  > 
>>> -- 
>>> Richard Tollerton <address@hidden>
>>>




reply via email to

[Prev in Thread] Current Thread [Next in Thread]