Re: [lwip-users] Out of memory in PCP_PCB pool after 2^32 milliseconds


From: Adam Baron
Subject: Re: [lwip-users] Out of memory in PCP_PCB pool after 2^32 milliseconds
Date: Mon, 31 May 2021 08:09:57 +0200

Hello Paul, all

I found it, and this code shows the problem:
  next_timeout_time = (u32_t)(current_timeout_due_time + cyclic->interval_ms);  /* overflow handled by TIME_LESS_THAN macro */
  if (TIME_LESS_THAN(next_timeout_time, now)) { ...

The timer's next due time is pushed past the maximum value that sys_now() can return, because sys_now() in this port wraps at (2^32)/10 instead of 2^32.

The bug is in the ChibiOS port's sys_arch.c. I created a virtual timer that ticks every millisecond and changed sys_now() to return that timer's counter instead. It has been running for a day now, with the start-up time shifted to 2^32 - 120 seconds, and the problem seems to be solved.
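
For readers who want to see the shape of such a fix, here is a minimal sketch of a free-running millisecond counter. It is not the actual ChibiOS patch: ms_tick(), sys_now_sketch() and MS_START_OFFSET are hypothetical names, and the sketch assumes ms_tick() is called once per millisecond from some periodic timer callback and that an aligned 32-bit read is atomic on the target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical start value: the counter reaches its 2^32 wrap 120 s after
   boot, which makes the wraparound case quick to reproduce. */
#define MS_START_OFFSET ((uint32_t)(0u - 120000u))

/* Free-running millisecond counter; wraps naturally at 2^32. */
static volatile uint32_t ms_counter = MS_START_OFFSET;

/* Hypothetical hook: call once per millisecond from a periodic timer. */
void ms_tick(void)
{
    ms_counter++;
}

/* Stand-in for sys_now(): returns milliseconds over the full u32 range. */
uint32_t sys_now_sketch(void)
{
    return ms_counter;
}

/* Elapsed time stays correct across the wrap thanks to unsigned arithmetic. */
uint32_t elapsed_ms(uint32_t start, uint32_t now)
{
    return (uint32_t)(now - start);
}

/* Small self-check: one tick advances the elapsed time by exactly 1 ms. */
void ms_counter_self_check(void)
{
    uint32_t t0 = sys_now_sketch();
    ms_tick();
    assert(elapsed_ms(t0, sys_now_sketch()) == 1u);
}
```

Because the counter uses the whole 32-bit range, differences such as now - start remain valid across the wrap, which is what wraparound-aware comparisons rely on.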

I will prepare patches and try to post them to ChibiOS.

Thank you all,
Adam

On Mon, 31 May 2021 at 6:39, Matthias Paul <paul@focus-gmbh.com> wrote:

Hello Adam,

lwIP uses the TIME_LESS_THAN macro to handle integer overflows. Please have a look at its usage in core/timeouts.c, in sys_check_timeouts():

/* Check if timer's expiry time is greater than time and care about u32_t wraparounds */
#define TIME_LESS_THAN(t, compare_to) ( (((u32_t)((t)-(compare_to))) > LWIP_MAX_TIMEOUT) ? 1 : 0 )
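
A small sketch of why this macro depends on the clock covering the full 32-bit range. The macro is copied from above; the u32_t typedef, the LWIP_MAX_TIMEOUT value of 0x7fffffff and the helper timer_expired() are assumptions added so the example compiles on its own. The second pair of checks assumes a clock whose largest value is 429496730, roughly (2^32)/10.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32_t;              /* assumed stand-in for lwIP's u32_t */
#define LWIP_MAX_TIMEOUT 0x7fffffff  /* assumed value, half the u32 range */
#define TIME_LESS_THAN(t, compare_to) ( (((u32_t)((t)-(compare_to))) > LWIP_MAX_TIMEOUT) ? 1 : 0 )

/* Hypothetical helper: a timer due at `due` has expired once `now` is no
   longer "less than" `due` in the wraparound-aware sense. */
int timer_expired(u32_t due, u32_t now)
{
    return !TIME_LESS_THAN(now, due);
}

void time_less_than_demo(void)
{
    /* Full-range clock: a timer set at 0xFFFFFF00 for +1000 ms is due at
       0x000002E8 after the u32 wrap. The comparison still works. */
    u32_t due_full = (u32_t)(0xFFFFFF00u + 1000u);
    assert(due_full == 0x000002E8u);
    assert(timer_expired(due_full, 0xFFFFFF00u) == 0);  /* not yet due */
    assert(timer_expired(due_full, 0x000002E8u) == 1);  /* due after wrap */

    /* Clock that wraps early at 429496730: a timer set at 429496000 for
       +1000 ms is due at 429497000, a value the clock never reaches. */
    u32_t due_short = 429496000u + 1000u;
    assert(timer_expired(due_short, 270u) == 0);         /* after early wrap */
    assert(timer_expired(due_short, 429496730u) == 0);   /* clock maximum */
}
```

With the early-wrapping clock, every value the clock can take still compares as earlier than the due time, so under these assumptions the timer would never be treated as expired.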

Paul


On 28.05.2021 at 21:25, vysocan [via lwIP] wrote:
Hello Trampas,

Thanks for the hints. I initialized the system ticks to 2^32 - 120 seconds, and I got the mqtt pbuf=NULL after roughly 120 seconds plus the 120 keep-alive seconds.

The ChibiOS sys_arch.c port implements sys_now() (current time in milliseconds) as follows, in simplified form:
  return ((u32_t)chVTGetSystemTimeX() - 1) / 10 + 1;
because the system timer ticks every 100 µs.

I suspect this causes the problem: the value wraps back to 0 while the lwIP timers are still waiting for a value higher than (2^32)/10.

To support my guess, I turned on another debug option, and the last lwIP timer message I see is:
sys_timeout: 2000C5DC abs_time=429497730 handler=ip_reass_tmr arg=805B28C
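
The numbers can be checked off-target with a short sketch. The function below uses the same expression as the sys_now() line quoted above, but takes the tick count as a parameter instead of calling chVTGetSystemTimeX(); that change and the function names are assumptions made for illustration. With a 100 µs tick, 2^32 ticks is about 429497 seconds, or roughly 4.97 days, which fits a failure on the fifth day.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as the quoted sys_now(), with the 100 us tick count passed
   in as a parameter (an assumption, so the code runs without ChibiOS). */
uint32_t sys_now_from_ticks(uint32_t ticks)
{
    return (uint32_t)(ticks - 1u) / 10u + 1u;
}

/* Largest millisecond value the expression can ever return. */
uint32_t sys_now_max(void)
{
    return sys_now_from_ticks(UINT32_MAX);
}

void sys_now_range_check(void)
{
    /* The result tops out near (2^32)/10 instead of covering the u32 range. */
    assert(sys_now_max() == 429496730u);

    /* abs_time from the debug line lies 1000 ms beyond that maximum, so this
       clock can never reach it. */
    assert(429497730u - sys_now_max() == 1000u);
}
```

So the logged abs_time=429497730 is a due time 1000 ms past the largest value this sys_now() can produce.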


Adam

On Fri, 28 May 2021 at 13:45, Trampas Stern <[hidden email]> wrote:
Increase the counter to a uint64_t. 

You can also start the counter at something other than zero to prove root cause faster.

Trampas

On Fri, May 28, 2021 at 7:08 AM Adam Baron <[hidden email]> wrote:
Hi Tomek :),

I'll try to add it. Thanks.

However, I feel it is more likely related to an overflow of some uint32 counter, since the TCP_PCBs are no longer freed after 2^32 ticks.

Adam

On Fri, 28 May 2021 at 9:44, Tomasz W <[hidden email]> wrote:
Hi (Cześć)
Look at this: https://lists.nongnu.org/archive/html/lwip-devel/2020-12/msg00014.html
In my case it solved the problem of the web server dying after a few days.


On Fri, 28 May 2021 at 08:58, Adam Baron <[hidden email]> wrote:
>
> Hello all,
>
> I have a small STM32F4 application running on the devel branch of lwIP. It includes httpd, sntp, an smtp client, and an mqtt client. Everything runs well until the fifth day, when the mqtt client starts to receive pbuf=NULL and disconnects. My reconnect routine reconnects it after a short time, but it receives pbuf=NULL again shortly after.
>
> Later on I also noticed this in the log: memp_malloc: out of memory in pool TCP_PCB.
> I have MEMP_NUM_TCP_PCB defined as 30, which seems enough for normal operation. I also raised it to 50, but ended up with the same problem.
> In the statistics, NUM_TCP_PCB increases and decreases as it should, but once the uptime passes 5 days it stays high with an error flag set.
>
> Interestingly, it happens exactly after 2^32 milliseconds of uptime. I tried to keep OpenOCD connected so I could peek in, but so far I have not managed to keep OpenOCD running that long without the connection dropping.
>
> Does anyone have any ideas please?
>
> Thanks in advance,
> --
> 731435556
> Adam Baron
> _______________________________________________
> lwip-users mailing list
> [hidden email]
> https://lists.nongnu.org/mailman/listinfo/lwip-users



--
Regards
Tomek


