qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 0/9] hw/block: m25p80: Fix the mess of dummy bytes needed for


From: Francisco Iglesias
Subject: Re: [PATCH 0/9] hw/block: m25p80: Fix the mess of dummy bytes needed for fast read commands
Date: Tue, 27 Apr 2021 10:54:33 +0200
User-agent: Mutt/1.10.1 (2018-07-13)

On [2021 Apr 27] Tue 15:56:10, Alistair Francis wrote:
> On Fri, Apr 23, 2021 at 4:46 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> >
> > On Mon, Feb 8, 2021 at 10:41 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > >
> > > On Thu, Jan 21, 2021 at 10:18 PM Francisco Iglesias
> > > <frasse.iglesias@gmail.com> wrote:
> > > >
> > > > Hi Bin,
> > > >
> > > > On [2021 Jan 21] Thu 16:59:51, Bin Meng wrote:
> > > > > Hi Francisco,
> > > > >
> > > > > On Thu, Jan 21, 2021 at 4:50 PM Francisco Iglesias
> > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > >
> > > > > > Dear Bin,
> > > > > >
> > > > > > On [2021 Jan 20] Wed 22:20:25, Bin Meng wrote:
> > > > > > > Hi Francisco,
> > > > > > >
> > > > > > > On Tue, Jan 19, 2021 at 9:01 PM Francisco Iglesias
> > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Hi Bin,
> > > > > > > >
> > > > > > > > On [2021 Jan 18] Mon 20:32:19, Bin Meng wrote:
> > > > > > > > > Hi Francisco,
> > > > > > > > >
> > > > > > > > > On Mon, Jan 18, 2021 at 6:06 PM Francisco Iglesias
> > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi Bin,
> > > > > > > > > >
> > > > > > > > > > On [2021 Jan 15] Fri 22:38:18, Bin Meng wrote:
> > > > > > > > > > > Hi Francisco,
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Jan 15, 2021 at 8:26 PM Francisco Iglesias
> > > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi Bin,
> > > > > > > > > > > >
> > > > > > > > > > > > On [2021 Jan 15] Fri 10:07:52, Bin Meng wrote:
> > > > > > > > > > > > > Hi Francisco,
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Jan 15, 2021 at 2:13 AM Francisco Iglesias
> > > > > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Bin,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On [2021 Jan 14] Thu 23:08:53, Bin Meng wrote:
> > > > > > > > > > > > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The m25p80 model uses s->needed_bytes to indicate 
> > > > > > > > > > > > > > > how many follow-up
> > > > > > > > > > > > > > > bytes are expected to be received after it 
> > > > > > > > > > > > > > > receives a command. For
> > > > > > > > > > > > > > > example, depending on the address mode, either 
> > > > > > > > > > > > > > > 3-byte address or
> > > > > > > > > > > > > > > 4-byte address is needed.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > For fast read family commands, some dummy cycles 
> > > > > > > > > > > > > > > are required after
> > > > > > > > > > > > > > > sending the address bytes, and the dummy cycles 
> > > > > > > > > > > > > > > need to be counted
> > > > > > > > > > > > > > > in s->needed_bytes. This is where the mess began.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > As the variable name (needed_bytes) indicates, 
> > > > > > > > > > > > > > > the unit is in byte.
> > > > > > > > > > > > > > > It is not in bit, or cycle. However for some 
> > > > > > > > > > > > > > > reason the model has
> > > > > > > > > > > > > > > been using the number of dummy cycles for 
> > > > > > > > > > > > > > > s->needed_bytes. The right
> > > > > > > > > > > > > > > approach is to convert the number of dummy cycles 
> > > > > > > > > > > > > > > to bytes based on
> > > > > > > > > > > > > > > the SPI protocol, for example, 6 dummy cycles for 
> > > > > > > > > > > > > > > the Fast Read Quad
> > > > > > > > > > > > > > > I/O (EBh) should be converted to 3 bytes per the 
> > > > > > > > > > > > > > > formula (6 * 4 / 8).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > While not being the original implementor I must 
> > > > > > > > > > > > > > assume that above solution was
> > > > > > > > > > > > > > considered but not chosen by the developers due to 
> > > > > > > > > > > > > > it is inaccuracy (it
> > > > > > > > > > > > > > wouldn't be possible to model exacly 6 dummy 
> > > > > > > > > > > > > > cycles, only a multiple of 8,
> > > > > > > > > > > > > > meaning that if the controller is wrongly 
> > > > > > > > > > > > > > programmed to generate 7 the error
> > > > > > > > > > > > > > wouldn't be caught and the controller will still be 
> > > > > > > > > > > > > > considered "correct"). Now
> > > > > > > > > > > > > > that we have this detail in the implementation I'm 
> > > > > > > > > > > > > > in favor of keeping it, this
> > > > > > > > > > > > > > also because the detail is already in use for 
> > > > > > > > > > > > > > catching exactly above error.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > I found no clue from the commit message that my 
> > > > > > > > > > > > > proposed solution here
> > > > > > > > > > > > > was ever considered, otherwise all SPI controller 
> > > > > > > > > > > > > models supporting
> > > > > > > > > > > > > software generation should have been found out 
> > > > > > > > > > > > > seriously broken long
> > > > > > > > > > > > > time ago!
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > The controllers you are referring to might lack support 
> > > > > > > > > > > > for commands requiring
> > > > > > > > > > > > dummy clock cycles but I really hope they work with the 
> > > > > > > > > > > > other commands? If so I
> > > > > > > > > > >
> > > > > > > > > > > I am not sure why you view dummy clock cycles as 
> > > > > > > > > > > something special
> > > > > > > > > > > that needs some special support from the SPI controller. 
> > > > > > > > > > > For the case
> > > > > > > > > > > 1 controller, it's nothing special from the controller 
> > > > > > > > > > > perspective,
> > > > > > > > > > > just like sending out a command, or address bytes, or 
> > > > > > > > > > > data. The
> > > > > > > > > > > controller just shifts data bit by bit from its tx fifo 
> > > > > > > > > > > and that's it.
> > > > > > > > > > > In the Xilinx GQSPI controller case, the dummy cycles can 
> > > > > > > > > > > either be
> > > > > > > > > > > sent via a regular data (the case 1 controller) in the tx 
> > > > > > > > > > > fifo, or
> > > > > > > > > > > automatically generated (case 2 controller) by the 
> > > > > > > > > > > hardware.
> > > > > > > > > >
> > > > > > > > > > Ok, I'll try to explain my view point a little differently. 
> > > > > > > > > > For that we also
> > > > > > > > > > need to keep in mind that QEMU models HW, and any binary 
> > > > > > > > > > that runs on a HW
> > > > > > > > > > board supported in QEMU should ideally run on that board 
> > > > > > > > > > inside QEMU aswell
> > > > > > > > > > (this can be a bare metal application equaly well as a 
> > > > > > > > > > modified u-boot/Linux
> > > > > > > > > > using SPI commands with a non multiple of 8 number of dummy 
> > > > > > > > > > clock cycles).
> > > > > > > > > >
> > > > > > > > > > Once functionality has been introduced into QEMU it is not 
> > > > > > > > > > easy to know which
> > > > > > > > > > intentional or untentional features provided by the 
> > > > > > > > > > functionality are being
> > > > > > > > > > used by users. One of the (perhaps not well known) features 
> > > > > > > > > > I'm aware of that
> > > > > > > > > > is in use and is provided by the accurate dummy clock cycle 
> > > > > > > > > > modeling inside
> > > > > > > > > > m25p80 is the be ability to test drivers accurately 
> > > > > > > > > > regarding the dummy clock
> > > > > > > > > > cycles (even when using commands with a non-multiple of 8 
> > > > > > > > > > number of dummy clock
> > > > > > > > > > cycles), but there might be others aswell. So by removing 
> > > > > > > > > > this functionality
> > > > > > > > > > above use case will brake, this since those test will not 
> > > > > > > > > > be reliable.
> > > > > > > > > > Furthermore, since users tend to be creative it is not 
> > > > > > > > > > possible to know if
> > > > > > > > > > there are other use cases that will be affected. This means 
> > > > > > > > > > that in case [1]
> > > > > > > > > > needs to be followed the safe path is to add functionality 
> > > > > > > > > > instead of removing.
> > > > > > > > > > Luckily it also easier in this case, see below.
> > > > > > > > >
> > > > > > > > > I understand there might be users other than U-Boot/Linux 
> > > > > > > > > that use an
> > > > > > > > > odd number of dummy bits (not multiple of 8). If your concern 
> > > > > > > > > was
> > > > > > > > > about model behavior changes, sure I can update
> > > > > > > > > qemu/docs/system/deprecated.rst to mention that some flashes 
> > > > > > > > > in the
> > > > > > > > > m25p80 model now implement dummy cycles as bytes.
> > > > > > > >
> > > > > > > > Yes, something like that. My concern is that since this 
> > > > > > > > functionality has been
> > > > > > > > in tree for while, users have found known or unknown features 
> > > > > > > > that got
> > > > > > > > introduced by it. By removing the functionality (and the 
> > > > > > > > known/uknown features)
> > > > > > > > we are riscing to brake our user's use cases (currently I'm 
> > > > > > > > aware of one
> > > > > > > > feature/use case but it is not unlikely that there are more). 
> > > > > > > > [1] states that
> > > > > > > > "In general features are intended to be supported indefinitely 
> > > > > > > > once introduced
> > > > > > > > into QEMU", to me that makes very much sense because the 
> > > > > > > > opposite would mean
> > > > > > > > that we were not reliable. So in case [1] needs to be honored 
> > > > > > > > it looks to be
> > > > > > > > safer to add functionality instead of removing (and riscing the 
> > > > > > > > removal of use
> > > > > > > > cases/features). Luckily I still believe in this case that it 
> > > > > > > > will be easier to
> > > > > > > > go forward (even if I also agree on what you are saying below 
> > > > > > > > about what I
> > > > > > > > proposed).
> > > > > > > >
> > > > > > >
> > > > > > > Even if the implementation is buggy and we need to keep the buggy
> > > > > > > implementation forever? I think that's why
> > > > > > > qemu/docs/system/deprecated.rst was created for deprecating such
> > > > > > > feature.
> > > > > >
> > > > > > With the RFC I posted all commands in m25p80 are working for both 
> > > > > > the case 1
> > > > > > controller (using a txfifo) and the case 2 controller (no txfifo, 
> > > > > > as GQSPI).
> > > > > > Because of this, I, with all respect, will have to disagree that 
> > > > > > this is buggy.
> > > > >
> > > > > Well, the existing m25p80 implementation that uses dummy cycle
> > > > > accuracy for those flashes prevents all SPI controllers that use tx
> > > > > fifo to work with those flashes. Hence it is buggy.
> > > > >
> > > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > don't think it is fair to call them 'seriously broken' 
> > > > > > > > > > > > (and else we should
> > > > > > > > > > > > probably let the maintainers know about it). Most 
> > > > > > > > > > > > likely the lack of support
> > > > > > > > > > >
> > > > > > > > > > > I called it "seriously broken" because current 
> > > > > > > > > > > implementation only
> > > > > > > > > > > considered one type of SPI controllers while completely 
> > > > > > > > > > > ignoring the
> > > > > > > > > > > other type.
> > > > > > > > > >
> > > > > > > > > > If we change view and see this from the perspective of 
> > > > > > > > > > m25p80, it models the
> > > > > > > > > > commands a certain way and provides an API that the SPI 
> > > > > > > > > > controllers need to
> > > > > > > > > > implement for interacting with it. It is true that there 
> > > > > > > > > > are SPI controllers
> > > > > > > > > > referred to above that do not support the portion of that 
> > > > > > > > > > API that corresponds
> > > > > > > > > > to commands with dummy clock cycles, but I don't think it 
> > > > > > > > > > is true that this is
> > > > > > > > > > broken since there is also one SPI controller that has a 
> > > > > > > > > > working implementation
> > > > > > > > > > of m25p80's full API also when transfering through a tx 
> > > > > > > > > > fifo (use case 1). But
> > > > > > > > > > as mentioned above, by doing a minor extension and 
> > > > > > > > > > improvement to m25p80's API
> > > > > > > > > > and allow for toggling the accuracy from dummy clock cycles 
> > > > > > > > > > to dummy bytes [1]
> > > > > > > > > > will still be honored as in the same time making it 
> > > > > > > > > > possible to have full
> > > > > > > > > > support for the API in the SPI controllers that currently 
> > > > > > > > > > do not (please reread
> > > > > > > > > > the proposal in my previous reply that attempts to do 
> > > > > > > > > > this). I myself see this
> > > > > > > > > > as win/win situation, also because no controller should 
> > > > > > > > > > need modifications.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I am afraid your proposal does not work. Your proposed new 
> > > > > > > > > device
> > > > > > > > > property 'model_dummy_bytes' to select to convert the 
> > > > > > > > > accurate dummy
> > > > > > > > > clock cycle count to dummy bytes inside m25p80, is hard to 
> > > > > > > > > justify as
> > > > > > > > > a property to the flash itself, as the behavior is tightly 
> > > > > > > > > coupled to
> > > > > > > > > how the SPI controller works.
> > > > > > > >
> > > > > > > > I agree on above. I decided though that instead of posting 
> > > > > > > > sample code in here
> > > > > > > > I'll post an RFC with hopefully an improved proposal. I'll cc 
> > > > > > > > you. About below,
> > > > > > > > Xilinx ZynqMP GQSPI should not need any modication in a first 
> > > > > > > > step.
> > > > > > > >
> > > > > > >
> > > > > > > Wait, (see below)
> > > > > > >
> > > > > > > > >
> > > > > > > > > Please take a look at the Xilinx GQSPI controller, which 
> > > > > > > > > supports both
> > > > > > > > > use cases, that the dummy cycles can be transferred via tx 
> > > > > > > > > fifo, or
> > > > > > > > > generated by the controller automatically. Please read the 
> > > > > > > > > example
> > > > > > > > > given in:
> > > > > > > > >
> > > > > > > > >     table 24‐22, an example of Generic FIFO Contents for Quad 
> > > > > > > > > I/O Read
> > > > > > > > > Command (EBh)
> > > > > > > > >
> > > > > > > > > in 
> > > > > > > > > https://www.xilinx.com/support/documentation/user_guides/ug1085-zynq-ultrascale-trm.pdf
> > > > > > > > >
> > > > > > > > > If you choose to set the m25p80 device property 
> > > > > > > > > 'model_dummy_bytes' to
> > > > > > > > > true when working with the Xilinx GQSPI controller, you are 
> > > > > > > > > bound to
> > > > > > > > > only allow guest software to use tx fifo to transfer the 
> > > > > > > > > dummy cycles,
> > > > > > > > > and this is wrong.
> > > > > > > > >
> > > > > > >
> > > > > > > You missed this part. I looked at your RFC, and as I mentioned 
> > > > > > > above
> > > > > > > your proposal cannot support the complicated controller like 
> > > > > > > Xilinx
> > > > > > > GQSPI. Please read the example of table 24-22. With your RFC, you
> > > > > > > mandate guest software's GQSPI driver to only use hardware dummy 
> > > > > > > cycle
> > > > > > > generation, which is wrong.
> > > > > > >
> > > > > >
> > > > > > First, thank you very much for looking into the RFC series, very 
> > > > > > much
> > > > > > appreciated. Secondly, about above, the GQSPI model in QEMU 
> > > > > > transfers from 2
> > > > > > locations in the file, in 1 location the transfer referred to above 
> > > > > > is done, in
> > > > > > another location the transfer through the txfifo is done. The 
> > > > > > location where
> > > > > > transfer referred to above is done will not need any modifications 
> > > > > > (and will
> > > > > > thus work equally well as it does currently).
> > > > >
> > > > > Please explain this a little bit. How does your RFC series handle
> > > > > cases as described in table 24-22, where the 6 dummy cycles are split
> > > > > into 2 transfers, with one transfer using tx fifo, and the other one
> > > > > using hardware dummy cycle generation?
> > > >
> > > > Sorry, I missunderstod. You are right, that won't work.
> > >
> > > +Edgar E. Iglesias
> > >
> > > So it looks by far the only way to implement dummy cycles correctly to
> > > work with all SPI controller models is what I proposed here in this
> > > patch series.
> > >
> > > Maintainers are quite silent, so I would like to hear your thoughts.
> > >
> > > @Alistair Francis @Philippe Mathieu-Daudé @Peter Maydell would you
> > > please share your thoughts since you are the one who reviewed the
> > > existing dummy implementation (based on commits history)
> 
> I agree with Edgar, in that Francisco and Bin know this better than me
> and that modelling things in cycles is a pain.

Hi Alistair,

> 
> As Bin points out it seems like currently we should be modelling bytes
> (from the variable name) so it makes sense to keep it in bytes. I
> would be in favour of this series in that case. Do we know what use
> cases this will break? I know it's hard to answer but I don't think
> there are too many SSI users in QEMU so it might not be too hard to
> test most of the possible use cases.

The use case I'm aware of is regression testing of drivers. Ex: if a
driver is using 10 dummy clock cycles with the commands and a patch
accidentaly changes the driver to use 11 dummy clock cycles QEMU currently
finds the problem, that won't be possible with this series. It's difficult
to say but it is not impossible there are other use cases also.

More importantly IMO though is that the current use cases can be keept
while still providing support for commands with dummy clock cycles into
the QEMU SPI controllers lacking at the moment.

(If I recall correctly this series might also have another issue regarding
the GQSPI SPI mode configuration, with that it is possible transmit 8
dummy clock cycles as 1 data byte, 2 data bytes or 4 data bytes, so I
think some form of calculation might be needed inside m25p80).

Best regards,
Francisco


> 
> Alistair
> 
> >
> > Hello maintainers,
> >
> > We apparently missed the 6.0 window to address this mess of the m25p80
> > model. Please provide your inputs on this before I start working on
> > the v2.
> >
> > Regards,
> > Bin
> >



reply via email to

[Prev in Thread] Current Thread [Next in Thread]