[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[lwip-devel] [bug #61558] Possibly faulty behaviour on receiving large p
From: |
INVALID.NOREPLY |
Subject: |
[lwip-devel] [bug #61558] Possibly faulty behaviour on receiving large payloads with mqtt+tls causing disconnect |
Date: |
Fri, 26 Nov 2021 08:15:13 -0500 (EST) |
User-agent: |
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.55 Safari/537.36 Edg/96.0.1054.34 |
Follow-up Comment #3, bug #61558 (project lwip):
Thank you for your quick reply!
I can unfortunatly not provide any unit test since we do no unit test this
part of the code.
We recently upgraded the maximum publish payload size for mqtt of our product
to 1024 bytes, and when trying to also receive this payload size we noticed
that disconnect was triggered.
Our PBUF_POOL_BUFSIZE is 312 bytes. We need to keep it that size for
performance reasons. Other config items that might be interesting is;
MQTT_VAR_HEADER_BUFFER_LEN 1024
MQTT_OUTPUT_RINGBUF_SIZE 1024
We use lwip for tcp, mqtt and as tls interface to mbedtls.
I will try to explain what happens:
Disconnect is being triggered in mqtt.c
"mqtt_message_received()"
topic_len = var_hdr_payload[0];
topic_len = (topic_len << 8) + (u16_t)(var_hdr_payload[1]); //The first
bytes should be variable header but is just payload bytes.
if ((topic_len > length - (2 + qos_len)) ||
(topic_len > var_hdr_payload_bufsize - (2 + qos_len))) { //Since the
payload is interpreted as "topic_len" this value is often too large
LWIP_DEBUGF(MQTT_DEBUG_WARN,( "mqtt_message_received: Received short
PUBLISH packet (topic)\n"));
goto out_disconnect; //Disconnect is triggered here
}
When looking in to the the actual payload I could see that this message has no
variable header, it is just plain payload. Tracing this back to
"mqtt_parse_incoming()"
I the pbuf comming in to this function has no header at all, it is just
consisting of payload.
I have traced the issue further back to
"altcp_mbedtls_handle_rx_appldata()"
By debugging and reading the code I could conlude that there is a loop that
allocs one pbuf at a time and fills this with decrypted data
from the func "mbedtls_ssl_read()".
This pbuf is always directly passed on to the above application (in this case
mqtt) using "altcp_mbedtls_pass_rx_data()" and is never chained together with
another pbuf.
This is true even if there is still data left that has not yet been pulled of
state->rx using "mbedtls_ssl_read()".
So when the incomming data is larger than one pbuf size this remaining data is
being pulled of state->rx and sent on to the above application, without being
chained to the previous
part of the data (which contains the header).
This is the part of the code that I mean;
in "altcp_mbedtls_handle_rx_appldata()"
Loop:
if (state->rx_app == NULL) {
state->rx_app = buf;
} else {
pbuf_cat(state->rx_app, buf); //This is never called since state->rx
is always NULL every iteration.
}
} else {
pbuf_free(buf);
buf = NULL;
}
err = altcp_mbedtls_pass_rx_data(conn, state); //When exiting this
function state->rx is NULL.
In "altcp_mbedtls_pass_rx_data()"
buf = state->rx_app;
if (buf) {
state->rx_app = NULL; //Here the state->rx is set to NULL
if (conn->recv) {
u16_t tot_len = buf->tot_len;
/* this needs to be increased first because the 'recved' call may come
nested */
state->rx_passed_unrecved += tot_len;
state->flags |= ALTCP_MBEDTLS_FLAGS_UPPER_CALLED;
err = conn->recv(conn->arg, conn, buf, ERR_OK);
if (err != ERR_OK) {
if (err == ERR_ABRT) {
return ERR_ABRT;
}
/* not received, leave the pbuf(s) queued (and decrease 'unrecved'
again) */
LWIP_ASSERT("state == conn->state", state == conn->state);
state->rx_app = buf; //We never get here
since mqtt_recv_cb()
always return ERR_OK. state->rx is still NULL
state->rx_passed_unrecved -= tot_len;
LWIP_ASSERT("state->rx_passed_unrecved >= 0",
state->rx_passed_unrecved >= 0);
if (state->rx_passed_unrecved < 0) {
state->rx_passed_unrecved = 0;
}
This means state->rx is still NULL when exiting this function.
This is what i can guess goes wrong but, I am not entirely sure how this is
supposed to work? Is the above application supposed to reject the pbuf comming
in from
altcp_mbedtls_pass_rx_data if it is not a full packet including header, thus
propagating an error back to altcp_mbedtls_pass_rx_data?
Or could this be a bug?
I have tried two fixes to verify that the issue has to do with pbuf chaining.
Increasing PBUF_POOL_BUFSIZE to 2048 bytes so that payload (we only need to
send 1000 bytes) is never
big enough to be fragmented in two pbufs. This fixed the issue, but ca not be
done permanently in our application.
The next fix I tested is;
Adding a condition to check if a full packet has been decoded before sending
the pbuf chain to the application like this;
in "altcp_mbedtls_handle_rx_appldata()"
if (state->rx_app == NULL) {
state->rx_app = buf;
} else {
pbuf_cat(state->rx_app, buf); //Now the pbufs are chained here
before being sent to the appication
}
} else {
pbuf_free(buf);
buf = NULL;
}
/* patch start*/
if (state->ssl_context.in_msglen == 0) { //state->ssl_context.in_msglen
would be decreased to 0 when all data is pulled of state->rx
err = altcp_mbedtls_pass_rx_data(conn, state);
}/* patch end*/
This fixed the issue, but broke tls over tcp for some reason that I have not
yet looked into.
This confirms that the issue has something to do with the chaining of pbufs.
Is this more clear or do you need more information?
Thank you for your help!
_______________________________________________________
Reply to this item at:
<https://savannah.nongnu.org/bugs/?61558>
_______________________________________________
Message sent via Savannah
https://savannah.nongnu.org/
- [lwip-devel] [bug #61558] Possibly faulty behaviour on receiving large payloads with mqtt+tls causing disconnect, Tove Rumar, 2021/11/25
- [lwip-devel] [bug #61558] Possibly faulty behaviour on receiving large payloads with mqtt+tls causing disconnect, Tove Rumar, 2021/11/26
- [lwip-devel] [bug #61558] Possibly faulty behaviour on receiving large payloads with mqtt+tls causing disconnect, Simon Goldschmidt, 2021/11/26
- [lwip-devel] [bug #61558] Possibly faulty behaviour on receiving large payloads with mqtt+tls causing disconnect,
INVALID.NOREPLY <=
- [lwip-devel] [bug #61558] Possibly faulty behaviour on receiving large payloads with mqtt+tls causing disconnect, Simon Goldschmidt, 2021/11/26
- [lwip-devel] [bug #61558] Possibly faulty behaviour on receiving large payloads with mqtt+tls causing disconnect, INVALID.NOREPLY, 2021/11/26
- [lwip-devel] [bug #61558] Possibly faulty behaviour on receiving large payloads with mqtt+tls causing disconnect, INVALID.NOREPLY, 2021/11/26
- [lwip-devel] [bug #61558] Possibly faulty behaviour on receiving large payloads with mqtt+tls causing disconnect, Simon Goldschmidt, 2021/11/26
- [lwip-devel] [bug #61558] Possibly faulty behaviour on receiving large payloads with mqtt+tls causing disconnect, INVALID.NOREPLY, 2021/11/29
- [lwip-devel] [bug #61558] Possibly faulty behaviour on receiving large payloads with mqtt+tls causing disconnect, Simon Goldschmidt, 2021/11/29
- [lwip-devel] [bug #61558] Possibly faulty behaviour on receiving large payloads with mqtt+tls causing disconnect, INVALID.NOREPLY, 2021/11/29