qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 08/10] dp8393x: don't force 32-bit register access


From: Mark Cave-Ayland
Subject: Re: [PATCH v2 08/10] dp8393x: don't force 32-bit register access
Date: Sat, 3 Jul 2021 15:16:39 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0

On 03/07/2021 14:10, Philippe Mathieu-Daudé wrote:

On 7/3/21 2:04 PM, Mark Cave-Ayland wrote:
On 03/07/2021 09:52, Philippe Mathieu-Daudé wrote:
On 7/3/21 8:21 AM, Mark Cave-Ayland wrote:
On 02/07/2021 05:36, Finn Thain wrote:

On 6/25/21 8:53 AM, Mark Cave-Ayland wrote:
Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that
all accesses to the registers were 32-bit

No, that assumption was not made there. Just take a look at my
commits in
Linux that make 16-bit accesses. If commit 3fe9a838ec worked by
accident,
it probably just reflects my inadequate knowledge of QEMU internals.

but this is actually not the case. The access size is determined by
the CPU instruction used and not the number of physical address
lines.


I think that's an over-simplification (in the context of commit
3fe9a838ec).

Let me try and clarify this a bit more: there are 2 different changes
incorporated into 3fe9a838ec. The first (as you mention below and also
detailed in the commit messge), is related to handling writes to the
upper 16-bits of a word from the device and ensuring that 32-bit
accesses are handled correctly. This part seems correct to me based upon
reading the datasheet and the commit message:

@@ -246,9 +246,19 @@ static void dp8393x_put(dp8393xState *s, int width,
int offset,
                           uint16_t val)
   {
       if (s->big_endian) {
-        s->data[offset * width + width - 1] = cpu_to_be16(val);
+        if (width == 2) {
+            s->data[offset * 2] = 0;
+            s->data[offset * 2 + 1] = cpu_to_be16(val);
+        } else {
+            s->data[offset] = cpu_to_be16(val);
+        }
       } else {
-        s->data[offset * width] = cpu_to_le16(val);
+        if (width == 2) {
+            s->data[offset * 2] = cpu_to_le16(val);
+            s->data[offset * 2 + 1] = 0;
+        } else {
+            s->data[offset] = cpu_to_le16(val);
+        }
       }
   }

The second change incorporated into 3fe9a838ec (and the one this patch
fixes) is a similar change made to the MMIO *register* accesses:

@@ -590,7 +600,7 @@ static uint64_t dp8393x_read(void *opaque, hwaddr
addr, unsigned int size)

       DPRINTF("read 0x%04x from reg %s\n", val, reg_names[reg]);

-    return val;
+    return s->big_endian ? val << 16 : val;
   }

and:

@@ -598,13 +608,14 @@ static void dp8393x_write(void *opaque, hwaddr
addr, uint64_t data,
   {
       dp8393xState *s = opaque;
       int reg = addr >> s->it_shift;
+    uint32_t val = s->big_endian ? data >> 16 : data;

This is not correct since the QEMU memory API handles any access size
and endian conversion before the MMIO access reaches the device. It is
this change which breaks the 32-bit accesses used by MacOS to read/write
the dp8393x registers because it applies an additional endian swap on
top of that already done by the memory API.

The big_endian workaround applied to the register read/writes was
actually caused by forcing the access size to 32-bit when the
guest OS
was using a 16-bit access. Since the registers are 16-bit then we can
simply set .impl.min_access to 2 and then the memory API will
automatically do the right thing for both 16-bit accesses used by
Linux and 32-bit accesses used by the MacOS toolbox ROM.

Hmm I'm not sure. This sounds to me like the "QEMU doesn't model
busses
so we end using kludge to hide bugs" pattern. Can you provide a QTest
(ideally) or a "-trace memory_region_ops_\*" log of your firmware
accessing the dp8393x please?

The DP83932 chip is highly configurable, so I'm not sure that the
behaviour of any given firmware would resolve the question.

Indeed, I don't think that will help much here. Phil, if you would still
like me to send you some traces after reading the explanation above then
do let me know.

I read it and still would have few traces. We can hand-write them too.

I'd like to add qtests for some read/write,16/32(A)==B.

Sigh. I was just about to attempt some traces when I realised looking at
the patch that I made a mistake, and that in order for the memory API to
automatically handle the 4 byte accesses as 2 x 2 byte accesses then
both .impl.min_access_size and .impl.max_access_size need to be set to 2
:(  The remainder of the patch is the same but dp8393x_ops now looks
like this:

static const MemoryRegionOps dp8393x_ops = {
     .read = dp8393x_read,
     .write = dp8393x_write,
     .impl.min_access_size = 2,
     .impl.max_access_size = 2,
     .endianness = DEVICE_NATIVE_ENDIAN,
};

Yes :) This is closer to what I wrote during my review:

-- >8 --
diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c
index bff359ed13f..6061268dc49 100644
--- a/hw/net/dp8393x.c
+++ b/hw/net/dp8393x.c
@@ -697,7 +697,9 @@ static const MemoryRegionOps dp8393x_ops = {
      .read = dp8393x_read,
      .write = dp8393x_write,
      .impl.min_access_size = 2,
-    .impl.max_access_size = 4,
+    .impl.max_access_size = 2,
+    .valid.min_access_size = 2,
+    .valid.max_access_size = 4,
      .endianness = DEVICE_NATIVE_ENDIAN,
  };
---

Yes, that should work. Sorry I got confused during the initial review.

I've tested this under Linux/m68k, NetBSD/arc and MacOS and networking
seems fine after a quick test in each OS. The slight curiosity is that
the 4 byte accesses used by MacOS are split into 2 x 2 byte accesses,
but since MacOS only uses the bottom 16-bit of the result and ignores
the top 16-bits then despite there being 2 accesses, everything still
works as expected (see below).


READ:

dp8393x_read reg=0x28 [SR] val=0x0004 size=2
memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a0 value 0x4
size 2
dp8393x_read reg=0x28 [SR] val=0x0004 size=2
memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a2 value 0x4
size 2
memory_region_ops_read cpu 0 mr 0x55f210c9c330 addr 0x50f0a0a0 value
0x40004 size 4

WRITE:

memory_region_ops_write cpu 0 mr 0x55f210c9c330 addr 0x50f0a01c value
0x248fe8 size 4
memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01c value
0x24 size 2
dp8393x_write reg=0x7 [CTDA] val=0x0024 size=2
memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01e value
0x8fe8 size 2
dp8393x_write reg=0x7 [CTDA] val=0x8fe8 size=2


If you're happy with this, I'll resubmit a revised version of the patch
but with an updated commit message: the Fixes: tag is still the same,
but the updated fix is to ensure that .impl.min_access_size and
.impl.max_access_size match the real-life 16-bit register size.

Hold on a bit more, I'm still reviewing the datasheet ;)

My code note so far (untested) are:

-- >8 --
@@ -219,34 +225,48 @@ static uint32_t dp8393x_wt(dp8393xState *s)
      return s->regs[SONIC_WT1] << 16 | s->regs[SONIC_WT0];
  }

-static uint16_t dp8393x_get(dp8393xState *s, int width, int offset)
+static uint16_t dp8393x_get(dp8393xState *s, hwaddr addr, unsigned ofs16)
  {
+    const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED;
      uint16_t val;

-    if (s->big_endian) {
-        val = be16_to_cpu(s->data[offset * width + width - 1]);
+    if (s->regs[SONIC_DCR] & SONIC_DCR_DW) {
+        addr += 2 * ofs16;
+        if (s->big_endian) {
+            val = address_space_ldl_be(&s->as, addr, attrs, NULL);
+        } else {
+            val = address_space_ldl_le(&s->as, addr, attrs, NULL);
+        }
      } else {
-        val = le16_to_cpu(s->data[offset * width]);
+        addr += 1 * ofs16;
+        if (s->big_endian) {
+            val = address_space_lduw_be(&s->as, addr, attrs, NULL);
+        } else {
+            val = address_space_lduw_le(&s->as, addr, attrs, NULL);
+        }
      }
+
      return val;
  }

-static void dp8393x_put(dp8393xState *s, int width, int offset,
-                        uint16_t val)
+static void dp8393x_put(dp8393xState *s,
+                        hwaddr addr, unsigned ofs16, uint16_t val)
  {
-    if (s->big_endian) {
-        if (width == 2) {
-            s->data[offset * 2] = 0;
-            s->data[offset * 2 + 1] = cpu_to_be16(val);
+    const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED;
+
+    if (s->regs[SONIC_DCR] & SONIC_DCR_DW) {
+        addr += 2 * ofs16;
+        if (s->big_endian) {
+            address_space_stl_be(&s->as, addr, val, attrs, NULL);
          } else {
-            s->data[offset] = cpu_to_be16(val);
+            address_space_stl_le(&s->as, addr, val, attrs, NULL);
          }
      } else {
-        if (width == 2) {
-            s->data[offset * 2] = cpu_to_le16(val);
-            s->data[offset * 2 + 1] = 0;
+        addr += 1 * ofs16;
+        if (s->big_endian) {
+            address_space_stw_be(&s->as, addr, val, attrs, NULL);
          } else {
-            s->data[offset] = cpu_to_le16(val);
+            address_space_stw_le(&s->as, addr, val, attrs, NULL);
          }
      }
  }
@@ -313,14 +331,12 @@ static void dp8393x_do_read_rra(dp8393xState *s)
      /* Read memory */
      width = (s->regs[SONIC_DCR] & SONIC_DCR_DW) ? 2 : 1;
      size = sizeof(uint16_t) * 4 * width;
-    address_space_read(&s->as, dp8393x_rrp(s),
-                       MEMTXATTRS_UNSPECIFIED, s->data, size);

      /* Update SONIC registers */
-    s->regs[SONIC_CRBA0] = dp8393x_get(s, width, 0);
-    s->regs[SONIC_CRBA1] = dp8393x_get(s, width, 1);
-    s->regs[SONIC_RBWC0] = dp8393x_get(s, width, 2);
-    s->regs[SONIC_RBWC1] = dp8393x_get(s, width, 3);
+    s->regs[SONIC_CRBA0] = dp8393x_get(s, dp8393x_rrp(s), 0);
+    s->regs[SONIC_CRBA1] = dp8393x_get(s, dp8393x_rrp(s), 1);
+    s->regs[SONIC_RBWC0] = dp8393x_get(s, dp8393x_rrp(s), 2);
+    s->regs[SONIC_RBWC1] = dp8393x_get(s, dp8393x_rrp(s), 3);
      trace_dp8393x_read_rra_regs(s->regs[SONIC_CRBA0], s->regs[SONIC_CRBA1],
                                  s->regs[SONIC_RBWC0],
s->regs[SONIC_RBWC1]);

@@ -416,28 +432,22 @@ static void
dp8393x_do_receiver_disable(dp8393xState *s)
  static void dp8393x_do_transmit_packets(dp8393xState *s)
  {
      NetClientState *nc = qemu_get_queue(s->nic);
-    int width, size;
      int tx_len, len;
      uint16_t i;

-    width = (s->regs[SONIC_DCR] & SONIC_DCR_DW) ? 2 : 1;
-
      while (1) {
          /* Read memory */
-        size = sizeof(uint16_t) * 6 * width;
          s->regs[SONIC_TTDA] = s->regs[SONIC_CTDA];
          trace_dp8393x_transmit_packet(dp8393x_ttda(s));
-        address_space_read(&s->as, dp8393x_ttda(s) + sizeof(uint16_t) *
width,
-                           MEMTXATTRS_UNSPECIFIED, s->data, size);
          tx_len = 0;

          /* Update registers */
-        s->regs[SONIC_TCR] = dp8393x_get(s, width, 0) & 0xf000;
-        s->regs[SONIC_TPS] = dp8393x_get(s, width, 1);
-        s->regs[SONIC_TFC] = dp8393x_get(s, width, 2);
-        s->regs[SONIC_TSA0] = dp8393x_get(s, width, 3);
-        s->regs[SONIC_TSA1] = dp8393x_get(s, width, 4);
-        s->regs[SONIC_TFS] = dp8393x_get(s, width, 5);
+        s->regs[SONIC_TCR] = dp8393x_get(s, dp8393x_ttda(s), 0) & 0xf000;
+        s->regs[SONIC_TPS] = dp8393x_get(s, dp8393x_ttda(s), 1);
+        s->regs[SONIC_TFC] = dp8393x_get(s, dp8393x_ttda(s), 2);
+        s->regs[SONIC_TSA0] = dp8393x_get(s, dp8393x_ttda(s), 3);
+        s->regs[SONIC_TSA1] = dp8393x_get(s, dp8393x_ttda(s), 4);
+        s->regs[SONIC_TFS] = dp8393x_get(s, dp8393x_ttda(s), 5);

          /* Handle programmable interrupt */
          if (s->regs[SONIC_TCR] & SONIC_TCR_PINT) {
@@ -459,15 +469,9 @@ static void
dp8393x_do_transmit_packets(dp8393xState *s)
              i++;
              if (i != s->regs[SONIC_TFC]) {
                  /* Read next fragment details */
-                size = sizeof(uint16_t) * 3 * width;
-                address_space_read(&s->as,
-                                   dp8393x_ttda(s)
-                                   + sizeof(uint16_t) * width * (4 + 3
* i),
-                                   MEMTXATTRS_UNSPECIFIED, s->data,
-                                   size);
-                s->regs[SONIC_TSA0] = dp8393x_get(s, width, 0);
-                s->regs[SONIC_TSA1] = dp8393x_get(s, width, 1);
-                s->regs[SONIC_TFS] = dp8393x_get(s, width, 2);
+                s->regs[SONIC_TSA0] = dp8393x_get(s, dp8393x_ttda(s), 0);
+                s->regs[SONIC_TSA1] = dp8393x_get(s, dp8393x_ttda(s), 1);
+                s->regs[SONIC_TFS] = dp8393x_get(s, dp8393x_ttda(s), 2);
              }
          }

@@ -500,22 +504,12 @@ static void
dp8393x_do_transmit_packets(dp8393xState *s)
          s->regs[SONIC_TCR] |= SONIC_TCR_PTX;

          /* Write status */
-        dp8393x_put(s, width, 0,
-                    s->regs[SONIC_TCR] & 0x0fff); /* status */
-        size = sizeof(uint16_t) * width;
-        address_space_write(&s->as, dp8393x_ttda(s),
-                            MEMTXATTRS_UNSPECIFIED, s->data, size);
+        dp8393x_put(s, dp8393x_ttda(s), 0, s->regs[SONIC_TCR] & 0x0fff);

          if (!(s->regs[SONIC_CR] & SONIC_CR_HTX)) {
              /* Read footer of packet */
-            size = sizeof(uint16_t) * width;
-            address_space_read(&s->as,
-                               dp8393x_ttda(s)
-                               + sizeof(uint16_t) * width
-                                 * (4 + 3 * s->regs[SONIC_TFC]),
-                               MEMTXATTRS_UNSPECIFIED, s->data,
-                               size);
-            s->regs[SONIC_CTDA] = dp8393x_get(s, width, 0);
+            s->regs[SONIC_CTDA] = dp8393x_get(s, dp8393x_ttda(s),
+                                               s->regs[SONIC_TFC]);
              if (s->regs[SONIC_CTDA] & SONIC_DESC_EOL) {
                  /* EOL detected */
                  break;
@@ -764,7 +759,7 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,
      dp8393xState *s = qemu_get_nic_opaque(nc);
      int packet_type;
      uint32_t available, address;
-    int width, rx_len, padded_len;
+    int rx_len, padded_len;
      uint32_t checksum;
      int size;

@@ -777,10 +772,8 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,

      rx_len = pkt_size + sizeof(checksum);
      if (s->regs[SONIC_DCR] & SONIC_DCR_DW) {
-        width = 2;
          padded_len = ((rx_len - 1) | 3) + 1;
      } else {
-        width = 1;
          padded_len = ((rx_len - 1) | 1) + 1;
      }

@@ -801,11 +794,7 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,
      /* Check for EOL */
      if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) {
          /* Are we still in resource exhaustion? */
-        size = sizeof(uint16_t) * 1 * width;
-        address = dp8393x_crda(s) + sizeof(uint16_t) * 5 * width;
-        address_space_read(&s->as, address, MEMTXATTRS_UNSPECIFIED,
-                           s->data, size);
-        s->regs[SONIC_LLFA] = dp8393x_get(s, width, 0);
+        s->regs[SONIC_LLFA] = dp8393x_get(s, dp8393x_crda(s), 5);
          if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) {
              /* Still EOL ; stop reception */
              return -1;
@@ -813,11 +802,7 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,
          /* Link has been updated by host */

          /* Clear in_use */
-        size = sizeof(uint16_t) * width;
-        address = dp8393x_crda(s) + sizeof(uint16_t) * 6 * width;
-        dp8393x_put(s, width, 0, 0);
-        address_space_rw(&s->as, address, MEMTXATTRS_UNSPECIFIED,
-                         (uint8_t *)s->data, size, 1);
+        dp8393x_put(s, dp8393x_crda(s), 6, 0x0000);

          /* Move to next descriptor */
          s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA];
@@ -846,8 +831,8 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,
      /* Pad short packets to keep pointers aligned */
      if (rx_len < padded_len) {
          size = padded_len - rx_len;
-        address_space_rw(&s->as, address, MEMTXATTRS_UNSPECIFIED,
-            (uint8_t *)"\xFF\xFF\xFF", size, 1);
+        address_space_write(&s->as, address, MEMTXATTRS_UNSPECIFIED,
+                            (uint8_t *)"\xFF\xFF\xFF", size);
          address += size;
      }

@@ -871,32 +856,20 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,

      /* Write status to memory */
      trace_dp8393x_receive_write_status(dp8393x_crda(s));
-    dp8393x_put(s, width, 0, s->regs[SONIC_RCR]); /* status */
-    dp8393x_put(s, width, 1, rx_len); /* byte count */
-    dp8393x_put(s, width, 2, s->regs[SONIC_TRBA0]); /* pkt_ptr0 */
-    dp8393x_put(s, width, 3, s->regs[SONIC_TRBA1]); /* pkt_ptr1 */
-    dp8393x_put(s, width, 4, s->regs[SONIC_RSC]); /* seq_no */
-    size = sizeof(uint16_t) * 5 * width;
-    address_space_write(&s->as, dp8393x_crda(s),
-                        MEMTXATTRS_UNSPECIFIED,
-                        s->data, size);
+    dp8393x_put(s, dp8393x_crda(s), 0, s->regs[SONIC_RCR]); /* status */
+    dp8393x_put(s, dp8393x_crda(s), 1, rx_len); /* byte count */
+    dp8393x_put(s, dp8393x_crda(s), 2, s->regs[SONIC_TRBA0]); /*
pkt_ptr0 */
+    dp8393x_put(s, dp8393x_crda(s), 3, s->regs[SONIC_TRBA1]); /*
pkt_ptr1 */
+    dp8393x_put(s, dp8393x_crda(s), 4, s->regs[SONIC_RSC]); /* seq_no */

      /* Check link field */
-    size = sizeof(uint16_t) * width;
-    address_space_read(&s->as,
-                       dp8393x_crda(s) + sizeof(uint16_t) * 5 * width,
-                       MEMTXATTRS_UNSPECIFIED, s->data, size);
-    s->regs[SONIC_LLFA] = dp8393x_get(s, width, 0);
+    s->regs[SONIC_LLFA] = dp8393x_get(s, dp8393x_crda(s), 5);
      if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) {
          /* EOL detected */
          s->regs[SONIC_ISR] |= SONIC_ISR_RDE;
      } else {
          /* Clear in_use */
-        size = sizeof(uint16_t) * width;
-        address = dp8393x_crda(s) + sizeof(uint16_t) * 6 * width;
-        dp8393x_put(s, width, 0, 0);
-        address_space_write(&s->as, address, MEMTXATTRS_UNSPECIFIED,
-                            s->data, size);
+        dp8393x_put(s, dp8393x_crda(s), 6, 0x0000);

          /* Move to next descriptor */
          s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA];
---

Now I'll look at the CAM then the cksum.

I see, so you're also looking to consolidate all the address space accesses via the dp8393x_get() and dp8393x_put() functions instead of using s->data as an intermediatery - that makes sense to me. Once you have a patch that works for MIPS let me know and I can give a test on Linux/m68k and MacOS :)


ATB,

Mark.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]