qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] qtest: add read/write accessors with a specific


From: David Gibson
Subject: Re: [Qemu-devel] [PATCH] qtest: add read/write accessors with a specific endianness
Date: Fri, 7 Oct 2016 10:31:01 +1100
User-agent: Mutt/1.7.0 (2016-08-17)

On Thu, Oct 06, 2016 at 12:03:34PM +0100, Peter Maydell wrote:
> On 6 October 2016 at 04:38, David Gibson <address@hidden> wrote:
> > On Wed, Oct 05, 2016 at 05:31:07AM -0700, Peter Maydell wrote:
> >> On 4 October 2016 at 16:43, David Gibson <address@hidden> wrote:
> >> > On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote:
> >> >> The difficulty with this patch is that it's hard to tell whether
> >> >> it's really required, or if this is just adding an extra layer
> >> >> of byteswapping that should really be done in some other location
> >> >
> >> > Actually, it's neither.  It's not essential for anything, but it
> >> > *removes* an extra layer of byteswapping that really never should have
> >> > been done in the first place.
> >>
> >> The patch is very clearly adding calls to swapping functions.
> >> It looks like it's mostly convenience functions for not doing
> >> those swaps explicitly in the test cases.
> >
> > It's adding 1 swap on top of the memread/memwrite path - that's the
> > path which had no existing swaps (intended primarily for bag-o'-bytes
> > block access AFAICT).
> 
> Yeah, I hadn't noticed it was using the memread/memwrite path.
> I disagree with using that code path for what ought to be
> register read/writes (among other things, it's not clear to me
> that it guarantees that a 4-byte access by the test code is
> always a 4-byte access on the device, etc).

I was concerned about that when I first saw it as well.

But, I traced the path and both the memread() path and the readw/readl
path backend onto cpu_physical_memory_read() with the appropriate
width.  So if the proposed ops are broken in this way, so are
readw/readl.

If we're concerned that that won't always be the case we can still
implement the same operations, but instead of having them be just
wrappers on memread/memwrite, they'd be new primitives passed over the
pipe to the qtest accelerator.

> 
> >> >> in the stack. What's the actual test case here?
> >> >
> >> > The current readw, readl, etc. all work in "guest endianness".  But
> >> > guest endianness is not well defined - there are a number of targets
> >> > which can support either.
> >>
> >> It's guest bus endianness, and it's pretty well defined I think.
> >> (ARM for instance is LE bus even if the CPU is doing BE writes.)
> >
> > I don't see that guest bus endianness is any better defined, or any
> > more useful than "guest endianness".  It might have a vague meaning
> > for ARM (or embedded Power) in the sense that the on-SoC devices will
> > use that endianness.  But since the SoC devices are generally unique
> > to the architecture anyway, you still know their endianness
> > independent of any notion of guest endianness.
> 
> It means "the endianness that you would see if you could snoop
> the data bus at the output of the CPU".

Hrm.  I see two cases here, neither of them makes this clear:

1) The bus spec defines which data lines are MSB, and which are LSB.
In this case, the endianness depends on how those are mapped to byte
addresses when you get out to RAM - that could involve several
intermediate bridges.

2) The bus spec defines which data lines are byte0, 1, 2, etc (in byte
address order).  In this case it really is the CPU which determines
the endianness of its accesses - and that in turn could depend on
modes, TLB entries or instruction formss within the CPU.  So the
question is: the endianness you see if you snoop the bus when the CPU
does... what.. exactly?

> It is definitely well
> defined for every QEMU target because that's what TARGET_WORDS_BIGENDIAN
> tells you. It's not the same as whatever the device thinks it
> might interpret values as (though obviously people don't
> often design systems where they differ).

Honestly I think TAGET_WORDS_BIGENDIAN is just a hangover from when
nearly all CPUs worked exclusively in one endianness.  It's never
terribly well defined.

> >> >  And it's doubly meaningless since it's a
> >> > property of the guest cpu, which we're essentially replacing with the
> >> > qtest stub anyway.
> >>
> >> The stub sits on the same bus the guest cpu would.
> >>
> >> > Furthermore "guest endianness" isn't useful.  With a tiny handful of
> >> > exceptions, all peripherals have their own endianness which is known
> >> > independent of the target.  It makes more sense for test cases to
> >> > explicitly do their accesses in the correct endianness for the device,
> >> > without having to compensate for the fact that it'll be swapped into
> >> > the essentially arbitrary "guest endianness" along the way.
> >>
> >> Here I definitely disagree. I think it makes much more sense
> >> for writes to be "what the guest CPU would write", because that's
> >> where we're injecting them. If we had a test framework where the
> >> test was talking directly to the device, then maybe, but we don't.
> >
> > When I say that guest endianness is not well defined, what I mean is
> > precisely that "what the guest CPU would write" is not well defined.
> > For example on Power, the endianness of a given access will depend on:
> >     - For server chips, whether the CPU is in LE or BE mode
> >     - For embedded chips, the endianness flag in the TLB
> >     - For all chips, whether the plain or byte-reversed load/store
> >       instructions are used
> >
> > [There might even be variants with both a global mode flag and
> > per-page flags in the TLB, I'm not sure]
> 
> AFAIK, these things all take effect inside the guest CPU, and affect
> the value the guest eventually writes to the CPU bus.
> 
> > The model proposed here is that our testcases, on every access specify
> > a specific final-result endianness.  The alternative is that every
> > write from qtest has to do:
> >     1) Start with a value (host endian)
> >     2) Conditional swap for host endian vs. device endian difference
> >     3) Conditional swap for host endian vs. "guest endian" difference
> >        simply to compensate for..
> >     4) [Existing tswap() within writew/writel]
> >        Conditional swap for host endian vs. "guest endian difference
> >
> >> > Basically, the existing byteswaps, instead of removing the need for
> >> > them in the testcase code, instead mean that target-conditional
> >> > byteswaps will be *required* in nearly every testcase.  It's a recipe
> >> > for endianness-unsafe testcases.
> >>
> >> I prefer the swaps to be explicit in the test cases. If your
> >
> > The are explicit in the testcases, in the sense that on every read or
> > write you say this is an LE access or a BE access.  Potentially we
> > could have a "guest native" access option as well, but I think that's
> > useful
> >
> >> test case running on the real CPU would have had to do
> >> "swap to LE and then write this word", so does the test
> >> case in our test framework. If your test case just does
> >> "write the word", then so does the test.
> >>
> >> Most devices IME do not require byteswaps by guest code,
> >
> > What!?  So speaks the person working on a historically LE platform.
> 
> Well, platforms where the devices and the CPU agree about
> endianness.

Which means historically LE platforms, since nearly all commodity,
cross platform hardware is LE.

> > In the kernel, if a driver isn't littered with cpu_to_leXX() it's
> > almost certainly broken on a BE platform.
> >
> > The only devices I can think of which don't have a fixed endianness
> > regardless of platform are:
> >    * legacy virtio: this was an insane design choice, which is why it
> >      went away in virtio-1.0
> >    * graphics card framebuffers, which can usually be configured
> >      either way (or have both BE and LE views of the framebuffer)
> >    * a handful of simple devices, usually from a while back, where
> >      some idiot thought a BE version of a previously LE device (or vice
> >      versa) was a good idea
> >
> >> and I think these functions are confusing -- if you try
> >> to use them to write tests for ARM devices, for instance,
> >> the _le versions of these functions will introduce an
> >> incorrect byteswap on a BE host, even though ARM CPUs and
> >> devices are typically all LE.
> >
> > Huh?  How so?
> 
> This was the result of my mis-reading of the code: I thought
> they were doing a byteswap and then calling the readl/readw/etc
> functions, not the readmem etc functions.

Ah, ok, yes.  That would certainly have been wrong.

> > If you're accessing an LE device (like PCI) you say "write_le" and
> > there are no byteswaps at all on an LE host, and a single byteswap on
> > a BE host, which is just what you want.  For a PCI device "write_le"
> > is the correct option regardless of both host and guest platform.
> >
> > That makes testcases straightforward to write.  As a bonus, it forces
> > the test author to think momentarily about what the right endianness
> > is for each access, which is a good habit to be in (not thinking about
> > this is why we've so often had endianness-broken drivers).
> >
> > Switching to an arbitrary "guest endianness" and back again makes the
> > test more complicated, and accomplishes no extra coverage (because all
> > the swaps under consideration are in the test code anyway).  It gains
> > nothing but confusion.
> 
> I think I need to think about this a bit...
> 
> thanks
> -- PMM
> 

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]