From: | address@hidden |
Subject: | Re: [lwip-devel] SMEMCPY() |
Date: | Sun, 12 Jun 2011 14:23:04 +0200 |
User-agent: | Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; de; rv:1.9.2.17) Gecko/20110414 Thunderbird/3.1.10 |
Leon Woestenberg wrote:
> Who did performance measurements on this, before converting our code over?

What's the point in detailed performance measurements here? If I measured this on my platform, that doesn't say it's optimized equally on your platform, too.
As to the theoretical background:

In our port, we are using a hand-written memcpy function (instead of the C library's), which is optimized for copying big blocks and is much faster when copying unaligned data (the C library uses byte-copy if src or dst is not 32-bit aligned). However, this function has a large overhead:

- function call
- check src and dst
- unrolled loop checking

In this case, I don't have to do performance measurements to know that defining SMEMCPY() to an inlined byte-copy routine is faster than calling our memcpy function optimized for large blocks. For small copies, the optimized function ends up doing the same byte-copy as the inline version does, so it only adds the overhead of a call and argument checking.
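To make the argument concrete, here is a minimal sketch of what such a port-level override could look like. The helper name small_copy() is hypothetical and not part of lwIP; only the SMEMCPY() macro name comes from this thread.

```c
#include <stddef.h>

/* Hypothetical helper (not part of lwIP): plain byte-by-byte copy.
 * It does no alignment checks and no loop unrolling, so the compiler
 * can inline it cheaply where only a few bytes are copied. */
static inline void small_copy(void *dst, const void *src, size_t len)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    while (len-- > 0) {
        *d++ = *s++;
    }
}

/* Port-level override: route small copies to the inline routine
 * instead of the memcpy tuned for large blocks. */
#define SMEMCPY(dst, src, len) small_copy((dst), (src), (len))
```

Whether this actually beats a given memcpy still depends on the compiler and the platform, which is exactly why the choice is left to the port.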
Anyway, this change was minimally invasive, since both MEMCPY() and SMEMCPY() are defined to memcpy(), so I cannot see the downside of it.
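As a sketch of that pattern (assuming the defaults are guarded so that a port can supply its own definitions first), the fallback could look roughly like this. The struct and the copy_hw_addr() function are hypothetical and only illustrate a typical small, fixed-size copy.

```c
#include <string.h>

/* Fallback when the port defines nothing: both macros map to memcpy(),
 * so behaviour is unchanged for ports that do not override them. */
#ifndef MEMCPY
#define MEMCPY(dst, src, len)  memcpy((dst), (src), (len))
#endif

#ifndef SMEMCPY
#define SMEMCPY(dst, src, len) memcpy((dst), (src), (len))
#endif

/* Hypothetical example type: a 6-byte hardware address. */
struct hw_addr {
    unsigned char addr[6];
};

/* Small copy whose size is known at compile time: a candidate for SMEMCPY(). */
static void copy_hw_addr(struct hw_addr *dst, const struct hw_addr *src)
{
    SMEMCPY(dst, src, sizeof(*dst));
}
```

A port that wants something different defines MEMCPY() or SMEMCPY() before these fallbacks are seen; everyone else gets plain memcpy() as before.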
Simon