lwip-users
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lwip-users] Bug in pbuf.c regarding PBUF_POOL


From: Jonathan Larmour
Subject: Re: [lwip-users] Bug in pbuf.c regarding PBUF_POOL
Date: Fri, 20 Oct 2006 23:37:23 +0100
User-agent: Mozilla Thunderbird 1.0.8-1.1.fc3.4.legacy (X11/20060515)

[I saw Simon's recent mail but thought it was better replying to this one...]
Goldschmidt Simon wrote:
Hi,

1:
Sorry to be so 'unprofessional' :)
But:
The 1.1.0 version I was using was provided by Altera with the NIOS2
devkit. I just compared this version to the oringinal 1.1.0 and saw that
Altera changed that part... They simply just used the semaphore used by
PBUF_POOL_FREE() in pbuf_pool_alloc() as well. I can submit a patch
doing this if anyone cares...

I'm relatively new at lwIP and haven't done much with it. But this is my take from what I've seen so far....

I don't think you can portably wait on a semaphore in pbuf_pool_alloc which is probably why it isn't. You might be (and probably will be) in a device driver context and so cannot block.

2:
The other question is: Why did nobody get corrupted pbufs like I do?
Have you all set SYS_LIGHTWEIGHT_PROT to 1? (since it's by default set
to 0...)

I don't know the history, but thread-safety is almost certainly why SYS_LIGHTWEIGHT_PROT even exists. Without it, the code is inherently *not* thread safe.

3:
OK, maybe the stack is not supposed to be multithreaded, BUT: I have a
separate thread for receiveing,

As does SLIP and PPP, so this would be a wider stack issue.

and that thread must allocate fresh
pbufs (PBUF_POOL in my case) so where should I do that if not in a
seperate thread? With SYS_LIGHTWEIGHT_PROT=1, INTs are diables (=
threadsafe); in PBUF_POOL_FREE(), a semaphore is used (=threadsafe)

With SYS_LIGHTWEIGHT_PROT=1, semaphores are not used in PBUF_POOL_FREE(). Just SYS_ARCH_PROTECT() etc.

, so
I still think the design goal is to make pbufs and memory access thread
safe!

4:
The reason pbuf_pool_free_lock is never set to anything except 0 is that
it is not used on the FREE-side. What strikes me more is, that
pbuf_pool_alloc_lock is never tested against, only set to a value. But
that seems only to be a typo (in v1.1.1, pbuf.c line 146 should use
pbuf_pool_alloc_lock instead of pbuf_pool_free_lock).

Although there is a problem here, I don't believe that is a typo. I believe pbuf_pool_alloc was originally written expecting to be called from drivers potentially in an interrupt and can't block. I believe that is the expected model and is probably an important thing to preserve. And pbuf_pool_free is only expected to be called from thread-level stack code.

So with that model an alloc doesn't have to worry about interrupting another alloc, only a free. And if it finds it is interrupting a free, it should just bomb out and return no buffer.

But first of all, indeed nothing increments pbuf_pool_free_lock. And second of all, there are calls to pbuf_alloc for allocations from PBUF_POOL in other places that may themselves then be interrupted by a driver interrupt:

$ egrep -r 'pbuf_alloc.*POOL' .
./core/ipv4/ip_frag.c:      p = pbuf_alloc(PBUF_LINK, ip_reasslen, PBUF_POOL);
./core/pbuf.c:        q = pbuf_alloc(PBUF_RAW, p->len, PBUF_POOL);
./netif/ppp/ppp.c:      tb = pbuf_alloc(PBUF_RAW, 0, PBUF_POOL);
./netif/ppp/ppp.c:      headMB = pbuf_alloc(PBUF_RAW, 0, PBUF_POOL);
./netif/ppp/ppp.c:      headMB = pbuf_alloc(PBUF_RAW, 0, PBUF_POOL);
./netif/ppp/ppp.c: nextNBuf = pbuf_alloc(PBUF_RAW, 0, PBUF_POOL); ./netif/ppp/vj.c: np = pbuf_alloc(PBUF_RAW, n0->len + cs->cs_hlen, PBUF_POOL); ./netif/ppp/vj.c: np = pbuf_alloc(PBUF_RAW, cs->cs_hlen, PBUF_POOL); ./netif/slipif.c: p = pbuf_alloc(PBUF_LINK, PBUF_POOL_BUFSIZE, PBUF_POOL);


So the (likely) intended form of protection is bogus. You can't shield an allocation from another allocation. As both would set pbuf_pool_alloc_lock to 1 and you couldn't tell which one "won". If you instead tries to do "pbuf_pool_alloc_lock++" that wouldn't help as that is not guaranteed atomic (and usually isn't).

The only solutions I can think of without !SYS_LIGHTWEIGHT_PROT would involve adding new elements to the sys_arch abstraction. In which case you may as well just implement SYS_LIGHWEIGHT_PROT.

With the current code, I think the only solution is SYS_LIGHTWEIGHT_PROT. My opinion is that the !SYS_LIGHTWEIGHT_PROT stuff should probably just be removed.

Unless someone knows that the above allocs could avoid using PBUF_POOL. Why they're being allocated from there is a bit of a mystery to me.

Then again the above might be complete rubbish.

Jifl
--
eCosCentric    http://www.eCosCentric.com/    The eCos and RedBoot experts
------["The best things in life aren't things."]------      Opinions==mine




reply via email to

[Prev in Thread] Current Thread [Next in Thread]