
From: Mason
Subject: Re: [lwip-users] TCP bandwidth limited by rate of ACKs
Date: Thu, 13 Oct 2011 12:02:42 +0200
User-agent: Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20110928 Firefox/7.0.1 SeaMonkey/2.4.1

Simon wrote:

> Mason wrote:
> 
>> IMHO, the elephant in the room is task-switching, as correctly
>> pointed out by Kieran.

In my previous tests, I ran all three network-related threads
(RxTask, tcpip_thread, rxapp) at the same priority (MIN+4)
while other threads ran at the minimum priority.

Lowering the priority of rxapp to MIN+2 improved the throughput
to 65.7 Mbit/s. (I think there are fewer context switches this
way, but I'm not sure how to measure that.)
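
One crude way I might try (just a sketch; the hook below is made up,
I don't know whether the OS actually exposes one) is to bump a counter
from a context-switch hook and compare it before and after a transfer:

static volatile unsigned long ctx_switches;

/* Hypothetical kernel hook, called on every context switch. */
static void on_context_switch(void)
{
  ctx_switches++;
}

/* Around the test transfer: */
unsigned long before = ctx_switches;
/* ... run the transfer ... */
printf("context switches during transfer: %lu\n", ctx_switches - before);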

> Well, given a correctly DMA-enabled driver, you could avoid one task 
> switch by checking RX packets from tcpip_thread instead of using another 
> thread for RX (as your "Task breakdown" suggests by the name "RxTask").

Correct; the OS panics when I call tcpip_input from the ISR,
so I set up RxTask, which runs:

static void rx_task(void *arg)
{
  while ( 1 )
  {
    /*** WAIT FOR THE NEXT PACKET ***/
    ethernet_async_t *desc = message_receive(rx_queue);

    /*** PROCESS THE PACKET: COPY IT INTO A PBUF AND HAND IT TO THE STACK ***/
    struct pbuf *pbuf = pbuf_alloc(PBUF_RAW, desc->length, PBUF_RAM);
    if (pbuf != NULL)
    {
      memcpy(pbuf->payload, desc->buffer, desc->length);
      if (mynetif->input(pbuf, mynetif) != ERR_OK)
        pbuf_free(pbuf); /* input failed: the stack did not take ownership */
    }

    /*** RETURN DESC TO THE LIST OF AVAILABLE READ DESCRIPTORS ***/
    int err = device_ioctl(dev, OSPLUS_IOCTL_ETH_ASYNC_READ, desc);
    if (err) printf("ASYNC_READ IOCTL FAIL\n");
  }
}

> You would then set a flag / post a static message from your ISR, process 
> the packet in tcpip_thread (without having to copy it) and post the data 
> to your application thread.
> 
> Also, by using the (still somewhat experimental) LWIP_TCPIP_CORE_LOCKING 
> feature, you can also avoid the task switch from application task to 
> tcpip_thread (by using a mutex to lock the core instead of passing a 
> message).

I wanted to enable LWIP_TCPIP_CORE_LOCKING (and LWIP_TCPIP_CORE_LOCKING_INPUT;
what is the difference? they're not documented AFAICS) but I was scared off by
the "Don't use it if you're not an active lwIP project member" comment ;-)

Also, my OS forbids using mutexes from an ISR.

Perhaps I could keep the RxTask and enable LWIP_TCPIP_CORE_LOCKING_INPUT,
which would take tcpip_thread out of the equation? Thus, LOCK_TCPIP_CORE
would be called from task context, which is fine.
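
For reference, this is roughly what I'd put in lwipopts.h, assuming
I've understood the two options correctly:

/* Application threads call into the stack directly, serialized by a
   global mutex instead of posting messages to tcpip_thread. */
#define LWIP_TCPIP_CORE_LOCKING        1

/* tcpip_input() takes the same mutex and processes the packet in the
   calling thread (RxTask here) instead of handing it to tcpip_thread. */
#define LWIP_TCPIP_CORE_LOCKING_INPUT  1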

>> Assuming that every memcpy were lwip-related, and that I could
>> get rid of them (which I don't see how, given Simon's comments)
>> the transfer would take 478 instead of 516 seconds.
>
> I didn't mean to discourage you with my comments, I only meant it 
> doesn't work out-of-the box with a current lwIP. However, I know it's 
> not as easy for an lwip beginner to make the changes required for the RX 
> side (the TX side should not be a problem via adapting the mem_malloc() 
> functions).
> 
> If I made the changes to support PBUF_REF for RX in git, would you be 
> able to switch to that for testing?

Yes, I've tried to do a clean port, so I should be able to upgrade
to a newer version quite easily.

> I plan to implement zero-copy on an ARM-based board I have here, but I 
> haven't found the time for that, lately :-(

I will follow that development closely.

-- 
Regards.


