qemu-ppc

Re: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC


From: BALATON Zoltan
Subject: Re: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC
Date: Fri, 21 Feb 2020 20:52:35 +0100 (CET)
User-agent: Alpine 2.22 (BSF 395 2020-01-19)

On Fri, 21 Feb 2020, Peter Maydell wrote:
> On Fri, 21 Feb 2020 at 18:04, BALATON Zoltan <address@hidden> wrote:
>> On Fri, 21 Feb 2020, Peter Maydell wrote:
>>> I think that is the wrong approach. Enabling use of the host
>>> FPU should not affect the accuracy of the emulation, which
>>> should remain bitwise-correct. We should only be using the
>>> host FPU to the extent that we can do that without discarding
>>> accuracy. As far as I'm aware that's how the hardfloat support
>>> for other guest CPUs that use it works.
>>
>> I don't know of a better approach. Please see section 4.2.2 Floating-Point
>> Status and Control Register on page 124 in this document:
>>
>> https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0
>>
>> especially the definition of the FR and FI bits and tell me how can we
>> emulate these accurately and use host FPU.
>
> I don't know much about PPC, but if you can't emulate the
> guest architecture accurately with the host FPU, then
> don't use the host FPU. We used to have a kind of 'hardfloat'

I don't know whether it's possible to emulate these accurately while using the host FPU, but nobody has done it for QEMU so far. If someone knows a way, please speak up and we can try to implement it. Unfortunately, this would require more detailed knowledge of different FPU implementations (at least x86_64, ARM and PPC, which are the most commonly used platforms) than I have or am willing to spend time learning.

> support that was fast but inaccurate, but it was a mess
> because it meant that most guest code sort of worked but
> some guest code would confusingly misbehave. Deliberately
> not correctly emulating the guest CPU/FPU behaviour is not
> something I want us to return to.
>
> You're right that sometimes you can't get both speed
> and accuracy; other emulators (and especially ones
> which are trying to emulate games consoles) may choose
> to prefer speed over accuracy. For QEMU we prefer to
> choose accuracy over speed in this area.

OK, then how about keeping the default accurate but allowing users to opt in to using the host FPU, even if it's known to break some bits, for workloads where they need speed over accuracy and are happy to live with the limitation? Note that I've found that just removing the define that disables hardfloat for the PPC target, without any other changes, makes VMX vector instructions faster while the normal FPU gets a little slower. So disabling hardfloat already limits performance for guests using VMX, even when the host FPU is not used in the cases where it would cause inaccuracy. If you say we want accuracy and don't care about speed, then just don't disable hardfloat, since it helps at least VMX; we can then add an option that lets users say hardfloat may be used even where it's inaccurate, so they can test their workload and decide for themselves.
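
To make the FR/FI problem concrete, here is a rough sketch in C of what recovering both bits from a host double-precision addition could look like. This is only an illustration, not QEMU code: the names add_result and add_with_fi_fr are made up, it covers only addition of finite values in round-to-nearest mode, and it assumes the compiler evaluates double arithmetic strictly (no -ffast-math, no x87 excess precision). The host's inexact flag alone could supply FI, but FR (whether rounding incremented the fraction) is not reported by the host, so the sketch recovers the rounding error with Knuth's TwoSum and checks its sign.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical result type; these names do not come from QEMU. */
typedef struct {
    double value; /* rounded sum as computed by the host FPU */
    bool fi;      /* FI: the rounded result differs from the exact sum */
    bool fr;      /* FR: rounding increased the magnitude of the result */
} add_result;

/*
 * Assumes round-to-nearest and finite inputs that do not overflow.
 * TwoSum yields err such that (a + b) exactly equals s + err.
 */
static add_result add_with_fi_fr(double a, double b)
{
    /* volatile discourages the compiler from folding the steps away */
    volatile double s = a + b;
    volatile double bv = s - a;
    volatile double av = s - bv;
    volatile double err = (a - av) + (b - bv);

    add_result r = { s, false, false };
    if (isfinite(s) && err != 0.0) {
        r.fi = true;
        /* err has the opposite sign to s when |s| exceeds the exact sum */
        r.fr = (s > 0.0 && err < 0.0) || (s < 0.0 && err > 0.0);
    }
    return r;
}
```

For example, add_with_fi_fr(1.0, ldexp(1.0, -53)) rounds down to 1.0, so FI is set and FR is clear, while add_with_fi_fr(1.0, ldexp(3.0, -54)) rounds up, so both are set. The extra dependent operations per addition are the kind of overhead that would eat into any speed gained from the host FPU.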

Regards,
BALATON Zoltan


