[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[lwip-devel] in-place overwriting of payload via static "tcphdr" pointer

From: Bernhart Pelger
Subject: [lwip-devel] in-place overwriting of payload via static "tcphdr" pointer.
Date: Fri, 22 Dec 2017 14:44:38 +0100
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0

I've traced some data corruption due to a race condition
with the lwip library provided by Xilinx for ZYNQ (EMACPS).

In tcp_in.c: tcp_input() the "tcphdr" is patched
with byte-flipped versions of 'src', 'dest', 'seqno',
'ackno', 'flags', ....

This write access causes data corruption via the
following race condition:
  1. The EMACPS receives some packets and writes them
      via DMA to a ring of buffers (buffer descriptor ring).
  2. On a new packet, Xilinx' Interrupt handling gets active:
    3. detects that the Interrupt is from the EMACPS (GIC handler)
    4. detects that it is a RX interrupt (emacps_recv_handler())
    5. Invalidates the cache for the packet payload.
       This also voids any outstanding writes in the Cache.
    6. prepares a pbuf with it's payload pointing rx packet memory,
       and adds it to a queue to be processed by the LWIP stack.
   7. LWIP dequeues the received pbuf and processes it.
     8. in tcp_in.c, a few fields of tcp header in
        the pbuf payload get overwritten.

The problem is that in step 8, the memory cache gets dirty
with an outstanding write transaction to DDR Memory in the TCP
header area. At some point in time, ring buffer wraps around
and the EMACPS writes a new packet to the same DDR memory location
via DMA.

Now we still have an outstanding write transaction with an old
tcp header and new data in DDR3 memory, but the RX ISR has not yet
invalidated the cache for this location. If the DDR3 cache
gets flushed during this time period (between steps 2,3,4),
the newly received packet will be corrupted with the old tcp
header with old port numbers that are already byte-flipped.
Also if the cache line is 32 bytes wide, some neighbour bytes will also
get corrupted in the IP header and the TCP payload.
The tcp_input() function will then byte-flip the (old)
port numbers again, and will later discard the packet
due to invalid port number.

The most elegant solution would be if the LWIP TCP stack would
not write to the RX packet buffer - that's not required anyways
just for byte-flipping some tcp header values.
These values could be be left as-is and be byte-flipped only
when they are actually read.

Other solutions would be to require the driver to copy
the complete packet before handing it to LWIP - that
would however impact performance.

I tested this with the LWIP library that was included in
Xilinx SDK 2014.4 which is lwip 1.4.0 -
however the latest 2.0.3 also seems to modify tcp packets.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]