
Subject: [lwip-devel] [bug #61558] Possibly faulty behaviour on receiving large payloads with mqtt+tls causing disconnect
Date: Fri, 26 Nov 2021 08:15:13 -0500 (EST)
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.55 Safari/537.36 Edg/96.0.1054.34

Follow-up Comment #3, bug #61558 (project lwip):

Thank you for your quick reply!

Unfortunately I cannot provide a unit test, since we do not unit-test this
part of the code.

We recently increased the maximum MQTT publish payload size of our product
to 1024 bytes, and when trying to also receive payloads of this size we noticed
that a disconnect was triggered.
Our PBUF_POOL_BUFSIZE is 312 bytes, and we need to keep it at that size for
performance reasons. Other configuration details that might be relevant:

We use lwIP for TCP, for MQTT, and as the TLS interface to mbedTLS.
I will try to explain what happens.
The disconnect is triggered in mqtt.c:

      topic_len = var_hdr_payload[0];
      topic_len = (topic_len << 8) + (u16_t)(var_hdr_payload[1]); // The first bytes should be the variable header but are just payload bytes
      if ((topic_len > length - (2 + qos_len)) ||
          (topic_len > var_hdr_payload_bufsize - (2 + qos_len))) { // Since payload is interpreted as "topic_len", this value is often too large
        LWIP_DEBUGF(MQTT_DEBUG_WARN, ("mqtt_message_received: Received short PUBLISH packet (topic)\n"));
        goto out_disconnect;  // The disconnect is triggered here

Looking into the actual payload I could see that this message has no
variable header; it is just plain payload. In other words, the pbuf coming
into this function has no header at all and consists only of payload.

I have traced the issue further back, into the altcp_mbedtls layer.

By debugging and reading the code I could conclude that there is a loop that
allocates one pbuf at a time and fills it with decrypted data
from mbedtls_ssl_read().
This pbuf is always passed directly on to the application above (in this case
MQTT) using altcp_mbedtls_pass_rx_data() and is never chained together with
another pbuf.

This is true even if there is still data left that has not yet been pulled off
state->rx with mbedtls_ssl_read().

So when the incoming data is larger than one pbuf, the remaining data is
pulled off state->rx and sent on to the application above without being
chained to the previous part of the data (which contains the header).

This is the part of the code that I mean, in altcp_mbedtls_handle_rx_appldata():

        if (state->rx_app == NULL) {
          state->rx_app = buf;
        } else {
          pbuf_cat(state->rx_app, buf); // This is never called, since state->rx_app is NULL on every iteration
        }
      } else {
        buf = NULL;
      }
      err = altcp_mbedtls_pass_rx_data(conn, state); // When this function exits, state->rx_app is NULL

In altcp_mbedtls_pass_rx_data():

  buf = state->rx_app;
  if (buf) {
    state->rx_app = NULL; // Here state->rx_app is set to NULL
    if (conn->recv) {
      u16_t tot_len = buf->tot_len;
      /* this needs to be increased first because the 'recved' call may come nested */
      state->rx_passed_unrecved += tot_len;
      err = conn->recv(conn->arg, conn, buf, ERR_OK);
      if (err != ERR_OK) {
        if (err == ERR_ABRT) {
          return ERR_ABRT;
        }
        /* not received, leave the pbuf(s) queued (and decrease 'unrecved' again) */
        LWIP_ASSERT("state == conn->state", state == conn->state);
        state->rx_app = buf; // We never get here, since mqtt_recv_cb() always returns ERR_OK; state->rx_app stays NULL
        state->rx_passed_unrecved -= tot_len;
        LWIP_ASSERT("state->rx_passed_unrecved >= 0", state->rx_passed_unrecved >= 0);
        if (state->rx_passed_unrecved < 0) {
          state->rx_passed_unrecved = 0;
        }

This means state->rx_app is still NULL when exiting this function.

This is my best guess at what goes wrong, but I am not entirely sure how this
is supposed to work. Is the application above supposed to reject the pbuf
coming in from altcp_mbedtls_pass_rx_data() if it is not a full packet
including the header, thus propagating an error back to
altcp_mbedtls_pass_rx_data()?
Or could this be a bug?

I have tried two fixes to verify that the issue has to do with pbuf chaining.
The first was increasing PBUF_POOL_BUFSIZE to 2048 bytes, so that the payload
(we only need to send 1000 bytes) is never big enough to be fragmented into
two pbufs. This fixed the issue, but cannot be done permanently in our
application.
The second fix was adding a condition that checks whether a full packet has
been decoded before sending the pbuf chain to the application, like this:

In altcp_mbedtls_handle_rx_appldata():

        if (state->rx_app == NULL) {
          state->rx_app = buf;
        } else {
          pbuf_cat(state->rx_app, buf); // Now the pbufs are chained here before being sent to the application
        }
      } else {
        buf = NULL;
      }
      /* patch start */
      if (state->ssl_context.in_msglen == 0) { // in_msglen is decreased to 0 once all data has been pulled off state->rx
        err = altcp_mbedtls_pass_rx_data(conn, state);
      }
      /* patch end */

This fixed the issue, but broke TLS over TCP for some reason that I have not
yet looked into.

This confirms that the issue has something to do with the chaining of pbufs.

Is this clearer, or do you need more information?

Thank you for your help!


  Message sent via Savannah
