[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 08/10] dp8393x: don't force 32-bit register access
From: |
Philippe Mathieu-Daudé |
Subject: |
Re: [PATCH v2 08/10] dp8393x: don't force 32-bit register access |
Date: |
Sat, 3 Jul 2021 15:10:46 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 |
On 7/3/21 2:04 PM, Mark Cave-Ayland wrote:
> On 03/07/2021 09:52, Philippe Mathieu-Daudé wrote:
>> On 7/3/21 8:21 AM, Mark Cave-Ayland wrote:
>>> On 02/07/2021 05:36, Finn Thain wrote:
>>>
>>>>> On 6/25/21 8:53 AM, Mark Cave-Ayland wrote:
>>>>>> Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that
>>>>>> all accesses to the registers were 32-bit
>>>>
>>>> No, that assumption was not made there. Just take a look at my
>>>> commits in
>>>> Linux that make 16-bit accesses. If commit 3fe9a838ec worked by
>>>> accident,
>>>> it probably just reflects my inadequate knowledge of QEMU internals.
>>>>
>>>>>> but this is actually not the case. The access size is determined by
>>>>>> the CPU instruction used and not the number of physical address
>>>>>> lines.
>>>>>>
>>>>
>>>> I think that's an over-simplification (in the context of commit
>>>> 3fe9a838ec).
>>>
>>> Let me try and clarify this a bit more: there are 2 different changes
>>> incorporated into 3fe9a838ec. The first (as you mention below and also
>>> detailed in the commit messge), is related to handling writes to the
>>> upper 16-bits of a word from the device and ensuring that 32-bit
>>> accesses are handled correctly. This part seems correct to me based upon
>>> reading the datasheet and the commit message:
>>>
>>> @@ -246,9 +246,19 @@ static void dp8393x_put(dp8393xState *s, int width,
>>> int offset,
>>> uint16_t val)
>>> {
>>> if (s->big_endian) {
>>> - s->data[offset * width + width - 1] = cpu_to_be16(val);
>>> + if (width == 2) {
>>> + s->data[offset * 2] = 0;
>>> + s->data[offset * 2 + 1] = cpu_to_be16(val);
>>> + } else {
>>> + s->data[offset] = cpu_to_be16(val);
>>> + }
>>> } else {
>>> - s->data[offset * width] = cpu_to_le16(val);
>>> + if (width == 2) {
>>> + s->data[offset * 2] = cpu_to_le16(val);
>>> + s->data[offset * 2 + 1] = 0;
>>> + } else {
>>> + s->data[offset] = cpu_to_le16(val);
>>> + }
>>> }
>>> }
>>>
>>> The second change incorporated into 3fe9a838ec (and the one this patch
>>> fixes) is a similar change made to the MMIO *register* accesses:
>>>
>>> @@ -590,7 +600,7 @@ static uint64_t dp8393x_read(void *opaque, hwaddr
>>> addr, unsigned int size)
>>>
>>> DPRINTF("read 0x%04x from reg %s\n", val, reg_names[reg]);
>>>
>>> - return val;
>>> + return s->big_endian ? val << 16 : val;
>>> }
>>>
>>> and:
>>>
>>> @@ -598,13 +608,14 @@ static void dp8393x_write(void *opaque, hwaddr
>>> addr, uint64_t data,
>>> {
>>> dp8393xState *s = opaque;
>>> int reg = addr >> s->it_shift;
>>> + uint32_t val = s->big_endian ? data >> 16 : data;
>>>
>>> This is not correct since the QEMU memory API handles any access size
>>> and endian conversion before the MMIO access reaches the device. It is
>>> this change which breaks the 32-bit accesses used by MacOS to read/write
>>> the dp8393x registers because it applies an additional endian swap on
>>> top of that already done by the memory API.
>>>
>>>>>> The big_endian workaround applied to the register read/writes was
>>>>>> actually caused by forcing the access size to 32-bit when the
>>>>>> guest OS
>>>>>> was using a 16-bit access. Since the registers are 16-bit then we can
>>>>>> simply set .impl.min_access to 2 and then the memory API will
>>>>>> automatically do the right thing for both 16-bit accesses used by
>>>>>> Linux and 32-bit accesses used by the MacOS toolbox ROM.
>>>>>
>>>>> Hmm I'm not sure. This sounds to me like the "QEMU doesn't model
>>>>> busses
>>>>> so we end using kludge to hide bugs" pattern. Can you provide a QTest
>>>>> (ideally) or a "-trace memory_region_ops_\*" log of your firmware
>>>>> accessing the dp8393x please?
>>>>>
>>>> The DP83932 chip is highly configurable, so I'm not sure that the
>>>> behaviour of any given firmware would resolve the question.
>>>
>>> Indeed, I don't think that will help much here. Phil, if you would still
>>> like me to send you some traces after reading the explanation above then
>>> do let me know.
>>
>> I read it and still would have few traces. We can hand-write them too.
>>
>> I'd like to add qtests for some read/write,16/32(A)==B.
>
> Sigh. I was just about to attempt some traces when I realised looking at
> the patch that I made a mistake, and that in order for the memory API to
> automatically handle the 4 byte accesses as 2 x 2 byte accesses then
> both .impl.min_access_size and .impl.max_access_size need to be set to 2
> :( The remainder of the patch is the same but dp8393x_ops now looks
> like this:
>
> static const MemoryRegionOps dp8393x_ops = {
> .read = dp8393x_read,
> .write = dp8393x_write,
> .impl.min_access_size = 2,
> .impl.max_access_size = 2,
> .endianness = DEVICE_NATIVE_ENDIAN,
> };
Yes :) This is closer to what I wrote during my review:
-- >8 --
diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c
index bff359ed13f..6061268dc49 100644
--- a/hw/net/dp8393x.c
+++ b/hw/net/dp8393x.c
@@ -697,7 +697,9 @@ static const MemoryRegionOps dp8393x_ops = {
.read = dp8393x_read,
.write = dp8393x_write,
.impl.min_access_size = 2,
- .impl.max_access_size = 4,
+ .impl.max_access_size = 2,
+ .valid.min_access_size = 2,
+ .valid.max_access_size = 4,
.endianness = DEVICE_NATIVE_ENDIAN,
};
---
>
> I've tested this under Linux/m68k, NetBSD/arc and MacOS and networking
> seems fine after a quick test in each OS. The slight curiosity is that
> the 4 byte accesses used by MacOS are split into 2 x 2 byte accesses,
> but since MacOS only uses the bottom 16-bit of the result and ignores
> the top 16-bits then despite there being 2 accesses, everything still
> works as expected (see below).
>
>
> READ:
>
> dp8393x_read reg=0x28 [SR] val=0x0004 size=2
> memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a0 value 0x4
> size 2
> dp8393x_read reg=0x28 [SR] val=0x0004 size=2
> memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a2 value 0x4
> size 2
> memory_region_ops_read cpu 0 mr 0x55f210c9c330 addr 0x50f0a0a0 value
> 0x40004 size 4
>
> WRITE:
>
> memory_region_ops_write cpu 0 mr 0x55f210c9c330 addr 0x50f0a01c value
> 0x248fe8 size 4
> memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01c value
> 0x24 size 2
> dp8393x_write reg=0x7 [CTDA] val=0x0024 size=2
> memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01e value
> 0x8fe8 size 2
> dp8393x_write reg=0x7 [CTDA] val=0x8fe8 size=2
>
>
> If you're happy with this, I'll resubmit a revised version of the patch
> but with an updated commit message: the Fixes: tag is still the same,
> but the updated fix is to ensure that .impl.min_access_size and
> .impl.max_access_size match the real-life 16-bit register size.
Hold on a bit more, I'm still reviewing the datasheet ;)
My code note so far (untested) are:
-- >8 --
@@ -219,34 +225,48 @@ static uint32_t dp8393x_wt(dp8393xState *s)
return s->regs[SONIC_WT1] << 16 | s->regs[SONIC_WT0];
}
-static uint16_t dp8393x_get(dp8393xState *s, int width, int offset)
+static uint16_t dp8393x_get(dp8393xState *s, hwaddr addr, unsigned ofs16)
{
+ const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED;
uint16_t val;
- if (s->big_endian) {
- val = be16_to_cpu(s->data[offset * width + width - 1]);
+ if (s->regs[SONIC_DCR] & SONIC_DCR_DW) {
+ addr += 2 * ofs16;
+ if (s->big_endian) {
+ val = address_space_ldl_be(&s->as, addr, attrs, NULL);
+ } else {
+ val = address_space_ldl_le(&s->as, addr, attrs, NULL);
+ }
} else {
- val = le16_to_cpu(s->data[offset * width]);
+ addr += 1 * ofs16;
+ if (s->big_endian) {
+ val = address_space_lduw_be(&s->as, addr, attrs, NULL);
+ } else {
+ val = address_space_lduw_le(&s->as, addr, attrs, NULL);
+ }
}
+
return val;
}
-static void dp8393x_put(dp8393xState *s, int width, int offset,
- uint16_t val)
+static void dp8393x_put(dp8393xState *s,
+ hwaddr addr, unsigned ofs16, uint16_t val)
{
- if (s->big_endian) {
- if (width == 2) {
- s->data[offset * 2] = 0;
- s->data[offset * 2 + 1] = cpu_to_be16(val);
+ const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED;
+
+ if (s->regs[SONIC_DCR] & SONIC_DCR_DW) {
+ addr += 2 * ofs16;
+ if (s->big_endian) {
+ address_space_stl_be(&s->as, addr, val, attrs, NULL);
} else {
- s->data[offset] = cpu_to_be16(val);
+ address_space_stl_le(&s->as, addr, val, attrs, NULL);
}
} else {
- if (width == 2) {
- s->data[offset * 2] = cpu_to_le16(val);
- s->data[offset * 2 + 1] = 0;
+ addr += 1 * ofs16;
+ if (s->big_endian) {
+ address_space_stw_be(&s->as, addr, val, attrs, NULL);
} else {
- s->data[offset] = cpu_to_le16(val);
+ address_space_stw_le(&s->as, addr, val, attrs, NULL);
}
}
}
@@ -313,14 +331,12 @@ static void dp8393x_do_read_rra(dp8393xState *s)
/* Read memory */
width = (s->regs[SONIC_DCR] & SONIC_DCR_DW) ? 2 : 1;
size = sizeof(uint16_t) * 4 * width;
- address_space_read(&s->as, dp8393x_rrp(s),
- MEMTXATTRS_UNSPECIFIED, s->data, size);
/* Update SONIC registers */
- s->regs[SONIC_CRBA0] = dp8393x_get(s, width, 0);
- s->regs[SONIC_CRBA1] = dp8393x_get(s, width, 1);
- s->regs[SONIC_RBWC0] = dp8393x_get(s, width, 2);
- s->regs[SONIC_RBWC1] = dp8393x_get(s, width, 3);
+ s->regs[SONIC_CRBA0] = dp8393x_get(s, dp8393x_rrp(s), 0);
+ s->regs[SONIC_CRBA1] = dp8393x_get(s, dp8393x_rrp(s), 1);
+ s->regs[SONIC_RBWC0] = dp8393x_get(s, dp8393x_rrp(s), 2);
+ s->regs[SONIC_RBWC1] = dp8393x_get(s, dp8393x_rrp(s), 3);
trace_dp8393x_read_rra_regs(s->regs[SONIC_CRBA0], s->regs[SONIC_CRBA1],
s->regs[SONIC_RBWC0],
s->regs[SONIC_RBWC1]);
@@ -416,28 +432,22 @@ static void
dp8393x_do_receiver_disable(dp8393xState *s)
static void dp8393x_do_transmit_packets(dp8393xState *s)
{
NetClientState *nc = qemu_get_queue(s->nic);
- int width, size;
int tx_len, len;
uint16_t i;
- width = (s->regs[SONIC_DCR] & SONIC_DCR_DW) ? 2 : 1;
-
while (1) {
/* Read memory */
- size = sizeof(uint16_t) * 6 * width;
s->regs[SONIC_TTDA] = s->regs[SONIC_CTDA];
trace_dp8393x_transmit_packet(dp8393x_ttda(s));
- address_space_read(&s->as, dp8393x_ttda(s) + sizeof(uint16_t) *
width,
- MEMTXATTRS_UNSPECIFIED, s->data, size);
tx_len = 0;
/* Update registers */
- s->regs[SONIC_TCR] = dp8393x_get(s, width, 0) & 0xf000;
- s->regs[SONIC_TPS] = dp8393x_get(s, width, 1);
- s->regs[SONIC_TFC] = dp8393x_get(s, width, 2);
- s->regs[SONIC_TSA0] = dp8393x_get(s, width, 3);
- s->regs[SONIC_TSA1] = dp8393x_get(s, width, 4);
- s->regs[SONIC_TFS] = dp8393x_get(s, width, 5);
+ s->regs[SONIC_TCR] = dp8393x_get(s, dp8393x_ttda(s), 0) & 0xf000;
+ s->regs[SONIC_TPS] = dp8393x_get(s, dp8393x_ttda(s), 1);
+ s->regs[SONIC_TFC] = dp8393x_get(s, dp8393x_ttda(s), 2);
+ s->regs[SONIC_TSA0] = dp8393x_get(s, dp8393x_ttda(s), 3);
+ s->regs[SONIC_TSA1] = dp8393x_get(s, dp8393x_ttda(s), 4);
+ s->regs[SONIC_TFS] = dp8393x_get(s, dp8393x_ttda(s), 5);
/* Handle programmable interrupt */
if (s->regs[SONIC_TCR] & SONIC_TCR_PINT) {
@@ -459,15 +469,9 @@ static void
dp8393x_do_transmit_packets(dp8393xState *s)
i++;
if (i != s->regs[SONIC_TFC]) {
/* Read next fragment details */
- size = sizeof(uint16_t) * 3 * width;
- address_space_read(&s->as,
- dp8393x_ttda(s)
- + sizeof(uint16_t) * width * (4 + 3
* i),
- MEMTXATTRS_UNSPECIFIED, s->data,
- size);
- s->regs[SONIC_TSA0] = dp8393x_get(s, width, 0);
- s->regs[SONIC_TSA1] = dp8393x_get(s, width, 1);
- s->regs[SONIC_TFS] = dp8393x_get(s, width, 2);
+ s->regs[SONIC_TSA0] = dp8393x_get(s, dp8393x_ttda(s), 0);
+ s->regs[SONIC_TSA1] = dp8393x_get(s, dp8393x_ttda(s), 1);
+ s->regs[SONIC_TFS] = dp8393x_get(s, dp8393x_ttda(s), 2);
}
}
@@ -500,22 +504,12 @@ static void
dp8393x_do_transmit_packets(dp8393xState *s)
s->regs[SONIC_TCR] |= SONIC_TCR_PTX;
/* Write status */
- dp8393x_put(s, width, 0,
- s->regs[SONIC_TCR] & 0x0fff); /* status */
- size = sizeof(uint16_t) * width;
- address_space_write(&s->as, dp8393x_ttda(s),
- MEMTXATTRS_UNSPECIFIED, s->data, size);
+ dp8393x_put(s, dp8393x_ttda(s), 0, s->regs[SONIC_TCR] & 0x0fff);
if (!(s->regs[SONIC_CR] & SONIC_CR_HTX)) {
/* Read footer of packet */
- size = sizeof(uint16_t) * width;
- address_space_read(&s->as,
- dp8393x_ttda(s)
- + sizeof(uint16_t) * width
- * (4 + 3 * s->regs[SONIC_TFC]),
- MEMTXATTRS_UNSPECIFIED, s->data,
- size);
- s->regs[SONIC_CTDA] = dp8393x_get(s, width, 0);
+ s->regs[SONIC_CTDA] = dp8393x_get(s, dp8393x_ttda(s),
+ s->regs[SONIC_TFC]);
if (s->regs[SONIC_CTDA] & SONIC_DESC_EOL) {
/* EOL detected */
break;
@@ -764,7 +759,7 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,
dp8393xState *s = qemu_get_nic_opaque(nc);
int packet_type;
uint32_t available, address;
- int width, rx_len, padded_len;
+ int rx_len, padded_len;
uint32_t checksum;
int size;
@@ -777,10 +772,8 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,
rx_len = pkt_size + sizeof(checksum);
if (s->regs[SONIC_DCR] & SONIC_DCR_DW) {
- width = 2;
padded_len = ((rx_len - 1) | 3) + 1;
} else {
- width = 1;
padded_len = ((rx_len - 1) | 1) + 1;
}
@@ -801,11 +794,7 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,
/* Check for EOL */
if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) {
/* Are we still in resource exhaustion? */
- size = sizeof(uint16_t) * 1 * width;
- address = dp8393x_crda(s) + sizeof(uint16_t) * 5 * width;
- address_space_read(&s->as, address, MEMTXATTRS_UNSPECIFIED,
- s->data, size);
- s->regs[SONIC_LLFA] = dp8393x_get(s, width, 0);
+ s->regs[SONIC_LLFA] = dp8393x_get(s, dp8393x_crda(s), 5);
if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) {
/* Still EOL ; stop reception */
return -1;
@@ -813,11 +802,7 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,
/* Link has been updated by host */
/* Clear in_use */
- size = sizeof(uint16_t) * width;
- address = dp8393x_crda(s) + sizeof(uint16_t) * 6 * width;
- dp8393x_put(s, width, 0, 0);
- address_space_rw(&s->as, address, MEMTXATTRS_UNSPECIFIED,
- (uint8_t *)s->data, size, 1);
+ dp8393x_put(s, dp8393x_crda(s), 6, 0x0000);
/* Move to next descriptor */
s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA];
@@ -846,8 +831,8 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,
/* Pad short packets to keep pointers aligned */
if (rx_len < padded_len) {
size = padded_len - rx_len;
- address_space_rw(&s->as, address, MEMTXATTRS_UNSPECIFIED,
- (uint8_t *)"\xFF\xFF\xFF", size, 1);
+ address_space_write(&s->as, address, MEMTXATTRS_UNSPECIFIED,
+ (uint8_t *)"\xFF\xFF\xFF", size);
address += size;
}
@@ -871,32 +856,20 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,
/* Write status to memory */
trace_dp8393x_receive_write_status(dp8393x_crda(s));
- dp8393x_put(s, width, 0, s->regs[SONIC_RCR]); /* status */
- dp8393x_put(s, width, 1, rx_len); /* byte count */
- dp8393x_put(s, width, 2, s->regs[SONIC_TRBA0]); /* pkt_ptr0 */
- dp8393x_put(s, width, 3, s->regs[SONIC_TRBA1]); /* pkt_ptr1 */
- dp8393x_put(s, width, 4, s->regs[SONIC_RSC]); /* seq_no */
- size = sizeof(uint16_t) * 5 * width;
- address_space_write(&s->as, dp8393x_crda(s),
- MEMTXATTRS_UNSPECIFIED,
- s->data, size);
+ dp8393x_put(s, dp8393x_crda(s), 0, s->regs[SONIC_RCR]); /* status */
+ dp8393x_put(s, dp8393x_crda(s), 1, rx_len); /* byte count */
+ dp8393x_put(s, dp8393x_crda(s), 2, s->regs[SONIC_TRBA0]); /*
pkt_ptr0 */
+ dp8393x_put(s, dp8393x_crda(s), 3, s->regs[SONIC_TRBA1]); /*
pkt_ptr1 */
+ dp8393x_put(s, dp8393x_crda(s), 4, s->regs[SONIC_RSC]); /* seq_no */
/* Check link field */
- size = sizeof(uint16_t) * width;
- address_space_read(&s->as,
- dp8393x_crda(s) + sizeof(uint16_t) * 5 * width,
- MEMTXATTRS_UNSPECIFIED, s->data, size);
- s->regs[SONIC_LLFA] = dp8393x_get(s, width, 0);
+ s->regs[SONIC_LLFA] = dp8393x_get(s, dp8393x_crda(s), 5);
if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) {
/* EOL detected */
s->regs[SONIC_ISR] |= SONIC_ISR_RDE;
} else {
/* Clear in_use */
- size = sizeof(uint16_t) * width;
- address = dp8393x_crda(s) + sizeof(uint16_t) * 6 * width;
- dp8393x_put(s, width, 0, 0);
- address_space_write(&s->as, address, MEMTXATTRS_UNSPECIFIED,
- s->data, size);
+ dp8393x_put(s, dp8393x_crda(s), 6, 0x0000);
/* Move to next descriptor */
s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA];
---
Now I'll look at the CAM then the cksum.
Regards,
Phil.