Cord Amfmgm <dmamfmgm@gmail.com> writes:
> On Tue, May 28, 2024 at 11:32 AM Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Tue, 28 May 2024 at 16:37, Cord Amfmgm <dmamfmgm@gmail.com> wrote:
> >
> > On Tue, May 28, 2024 at 9:03 AM Peter Maydell <peter.maydell@linaro.org> wrote:
> >>
> >> On Mon, 20 May 2024 at 23:24, Cord Amfmgm <dmamfmgm@gmail.com> wrote:
> >> > On Mon, May 20, 2024 at 12:05 PM Peter Maydell <peter.maydell@linaro.org> wrote:
<snip>
> >> > And here's an example buffer of length 0 -- you probably already know what I'm going to do here:
> >> >
> >> > char buf[0];
> >> > char * CurrentBufferPointer = &buf[0];
> >> > char * BufferEnd = &buf[-1]; // "address of the last byte in the buffer"
> >> > // The OHCI Host Controller than advances CurrentBufferPointer like this: CurrentBufferPointer += 0
> >> > // After the transfer:
> >> > // CurrentBufferPointer = &buf[0];
> >> > // BufferEnd = &buf[-1];
> >>
> >> Right, but why do you think this is valid, rather than
> >> being a guest software bug? My reading of the spec is that it's
> >> pretty clear about how to say "zero length buffer", and this
> >> isn't it.
> >>
> >> Is there some real-world guest OS that programs the OHCI
> >> controller this way that we're trying to accommodate?
> >
> >
> > qemu versions 4.2 and before allowed this behavior.
>
> So? That might just mean we had a bug and we fixed it.
> 4.2 is a very old version of QEMU and nobody seems to have
> complained in the four years since we released 5.0 about this,
> which suggests that generally guest OS drivers don't try
> to send zero-length buffers in this way.
>
> > I don't think it's valid to ask for a *popular* guest OS as a proof-of-concept because I'm not an expert on those.
>
> I didn't ask for "popular"; I asked for "real-world".
> What is the actual guest code you're running that falls over
> because of the behaviour change?
>
> More generally, why do you want this behaviour to be
> changed? Reasonable reasons might include:
> * we're out of spec based on reading the documentation
> * you're trying to run some old Windows VM/QNX/etc image,
> and it doesn't work any more
> * all the real hardware we tested behaves this way
>
> But don't necessarily include:
> * something somebody wrote and only tested on QEMU happens to
> assume the old behaviour rather than following the hw spec
>
> QEMU occasionally works around guest OS bugs, but only as
> when we really have to. It's usually better to fix the
> bug in the guest.
>
> It's not, and I've already demonstrated that real hardware is consistent with the fix in this patch.
>
> Please check your tone.
I don't think that is a particularly helpful comment for someone who is
taking the time to review your patches. Reading through the thread I
didn't see anything that said this is how real HW behaves but I may well
have missed it. However you have a number of review comments to address
so I suggest you spin a v2 of the series to address them and outline the
reason to accept an out of spec transaction.
I did a rework of the patch -- see my email from May 20, quoted below -- and I was under the impression it addressed all the review comments. Did I miss something? I apologize if I did.
> index acd6016980..71b54914d3 100644
> --- a/hw/usb/hcd-ohci.c
> +++ b/hw/usb/hcd-ohci.c
> @@ -941,8 +941,8 @@ static int ohci_service_td(OHCIState *ohci, struct ohci_ed *ed)
> if ((td.cbp & 0xfffff000) != (
td.be & 0xfffff000)) {
> len = (
td.be & 0xfff) + 0x1001 - (td.cbp & 0xfff);
> } else {
> - if (td.cbp >
td.be) {
> - trace_usb_ohci_iso_td_bad_cc_overrun(td.cbp,
td.be);
> + if (td.cbp - 1 >
td.be) { /* rely on td.cbp != 0 */
> Reading through the thread I didn't see anything that said this is how real HW behaves but I may well have missed it.
This is what I wrote regarding real HW:
Results are:
qemu 4.2 | qemu HEAD | actual HW
------------+------------+------------
works fine | ohci_die() | works fine
Would additional verification of the actual HW be useful?
Peter posted the following which is more specific than "qemu 4.2" -- I agree this is most likely the qemu commit where this thread is focused:
> Almost certainly this was commit 1328fe0c32d54 ("hw: usb: hcd-ohci:
> check len and frame_number variables"), which added these bounds
> checks. Prior to that we did no bounds checking at all, which
> meant that we permitted cbp=be+1 to mean a zero length, but also
> that we permitted the guest to overrun host-side buffers by
> specifying completely bogus cbp and be values. The timeframe is
> more or less right (2020), at least.
>
> -- PMM
Where does the conversation go from here? I'm under the impression I have provided objective answers to all the questions and resolved all review comments on the code. I receive the feedback that I missed something - please restate the question?