Re: [lwip-users] tcp_write with zero-copy
From: Jonathan Larmour
Subject: Re: [lwip-users] tcp_write with zero-copy
Date: Sun, 17 Feb 2008 01:03:17 +0000
User-agent: Mozilla Thunderbird 1.0.8-1.1.fc4 (X11/20060501)
Timmy Brolin wrote:
> Hi,
> Yes, the rx pool may have to be slightly bigger, but the tx pool could
> be set to almost zero instead.
Only in a limited subset of applications, I would have thought. Very few
protocols have responses which you only slightly modify and send back,
keeping the same packet size; fewer still TCP-based ones (rather than UDP)
- I can't think of any. After all, TCP is stream-based, so you have no idea
how many pieces your message will arrive in at the far end. Or if the
protocol isn't entirely synchronous or multiple packets of this protocol
can be sent at once, then there may be bits of subsequent packets within
the same pbufs. It seems a little like you're trying to make a quite
specific scenario more efficient based on guarantees that the underlying
protocol does not make.
> Determining the optimum balance between
> rx and tx pool sizes is not very easy as it is now. With true zero copy
> there would be no such balance. Simply put all available memory into the
> pbuf pool.
But then you run the risk of running out of configured space for receiving
data, because it's all used up with data for transmission. RX data has to
take priority, especially since it includes TCP ACKs.
Yes, the system may become more "memory efficient" in the sense that more
of the available memory is used at any time; but this is at the expense of
deterministic behaviour. It is more deterministic to follow the general
principle of reserving a set of pbufs exclusively for rx data.
> Today the application has to allocate a buffer for tx data before it
> can free the rx buf, so momentarily there is twice the amount of memory
> used; and when the application sends the data, lwip will do a second tx
> buffer allocation and memcpy, which means yet again there is momentarily
> double the memory use.
In practice, there may not be any particular problem with having a
tcp_write_pbuf() variant - that's pretty much just moving existing code
around a little so hopefully wouldn't have any real repercussions for
normal users. But I wouldn't be happy about consolidating the pbuf memory
into a single pool in general.
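To make that concrete, here is a toy model of the idea. Both structs and both functions below are simplified sketches for illustration only; tcp_write_pbuf() is not an existing lwIP API, and real pbufs and pcbs are considerably more involved:

```c
#include <stdlib.h>
#include <string.h>

/* Toy model only: these structs stand in for lwIP's struct pbuf and
 * struct tcp_pcb, and tcp_write_pbuf() is a hypothetical function,
 * not an existing lwIP API. */
struct pbuf {
    struct pbuf *next;
    void *payload;
    unsigned len;
};

struct tcp_pcb {
    struct pbuf *unsent;   /* pbufs queued for transmission */
    unsigned long copies;  /* number of memcpy's made (demo bookkeeping) */
};

/* Copying path, analogous to tcp_write() with copying enabled:
 * allocate a fresh buffer and memcpy the application data into it. */
static int tcp_write_copy(struct tcp_pcb *pcb, const void *data, unsigned len)
{
    struct pbuf *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    p->payload = malloc(len);
    if (p->payload == NULL) {
        free(p);
        return -1;
    }
    memcpy(p->payload, data, len);
    p->len = len;
    p->next = pcb->unsent;
    pcb->unsent = p;
    pcb->copies++;
    return 0;
}

/* Hypothetical zero-copy path: the application hands over an existing
 * pbuf (e.g. the one the request arrived in) and the stack just links
 * it onto the send queue. No allocation, no memcpy. */
static int tcp_write_pbuf(struct tcp_pcb *pcb, struct pbuf *p)
{
    p->next = pcb->unsent;
    pcb->unsent = p;
    return 0;
}
```

The point is only that the zero-copy path does no allocation and no memcpy; ownership of the pbuf passes to the stack, which is exactly why the application would then need ack tracking before reusing the buffer.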
> There are ways of avoiding this second allocation and memcpy by using
> tcp_sent, but it is not a very practical method since it requires the
> application to keep track of exactly which data has been sent and acked.
> I am afraid that I don't quite understand how using pbufs for both rx
> and tx would use more memory than the separate rx/tx pools use today.
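For illustration, the tracking burden that makes the tcp_sent approach impractical can be sketched as follows. This is not lwIP code; it assumes the raw-API pattern of tcp_write() with copying disabled plus a tcp_sent() callback reporting acked byte counts, and only models the application-side bookkeeping:

```c
#include <stdlib.h>

/* Sketch of the accounting the application would need. pending_acked()
 * plays the role of the tcp_sent() callback body. */
struct pending {
    struct pending *next;
    char *buf;     /* buffer lwip transmits from; must stay valid */
    size_t len;
};

static struct pending *pend_head, *pend_tail;
static size_t acked_in_head;   /* bytes of the head buffer already acked */

/* Record a buffer that was handed to the stack without copying. */
static void pending_add(char *buf, size_t len)
{
    struct pending *p = malloc(sizeof *p);
    p->next = NULL;
    p->buf = buf;
    p->len = len;
    if (pend_tail != NULL)
        pend_tail->next = p;
    else
        pend_head = p;
    pend_tail = p;
}

/* What the tcp_sent() callback would do with its acked-byte count:
 * free every buffer that is now fully acknowledged, keeping any
 * partially-acked buffer alive. Returns how many buffers were freed. */
static int pending_acked(size_t len)
{
    int freed = 0;
    len += acked_in_head;
    while (pend_head != NULL && len >= pend_head->len) {
        struct pending *p = pend_head;
        len -= p->len;
        pend_head = p->next;
        if (pend_head == NULL)
            pend_tail = NULL;
        free(p->buf);
        free(p);
        freed++;
    }
    acked_in_head = len;   /* partial ack carried into the next call */
    return freed;
}
```

Every no-copy write needs a pending_add(), every tcp_sent() callback a pending_acked(), and a partially-acked buffer must be kept alive; that is exactly the sent-and-acked accounting being objected to above.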
Consider a more general TCP stream than you are using for your protocol.
There are few constraints on how much data can be enqueued, principally
TCP_SNDBUF and TCP_SNDQUEUELEN. So an application that has a lot of data
to send will be able to fill each tcp connection's send buffer entirely to
those limits. That would be done at the expense of rx buffers in your
scenario. That greatly risks deadlock.
So you might then think "well, why not just make sure TCP_SNDBUF and
TCP_SNDQUEUELEN are set to prevent that", in which case you may as well
have used a separate tx buffer space, since you're again effectively
dividing up buffer space.
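For reference, those limits are spelled TCP_SND_BUF and TCP_SND_QUEUELEN in lwipopts.h. A sketch of deliberately capping the send side, with example values only, not recommendations:

```c
/* lwipopts.h (excerpt) - example values only */
#define TCP_MSS           1460
#define TCP_SND_BUF       (2 * TCP_MSS)  /* max bytes enqueued per connection */
#define TCP_SND_QUEUELEN  (4 * TCP_SND_BUF / TCP_MSS)  /* max queued segments */
```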
Anyway, I think if you can make a tcp_write_pbuf() implementation that
would not increase the footprint for those who don't use it, then feel
free to submit it to the patches page on savannah. If it doesn't increase
footprint, I'm sure that would be ok to accept (after 1.3.0). But it does
seem a little to me like the protocol you are implementing really should
be datagram-based, not stream-based.
Jifl
--
eCosCentric Limited http://www.eCosCentric.com/ The eCos experts
Barnwell House, Barnwell Drive, Cambridge, UK. Tel: +44 1223 245571
Registered in England and Wales: Reg No 4422071.
------["The best things in life aren't things."]------ Opinions==mine