Re: [Qemu-devel] [PATCH] x86_64: optimise muldiv64 for x86_64 architecture

From:	Frediano Ziglio
Subject:	Re: [Qemu-devel] [PATCH] x86_64: optimise muldiv64 for x86_64 architecture
Date:	Fri, 9 Jan 2015 11:04:18 +0000

2015-01-09 10:35 GMT+00:00 Paolo Bonzini <address@hidden>:
>
>
> On 09/01/2015 11:27, Frediano Ziglio wrote:
>>
>> Signed-off-by: Frediano Ziglio <address@hidden>
>> ---
>>  include/qemu-common.h | 13 +++++++++++++
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/include/qemu-common.h b/include/qemu-common.h
>> index f862214..5366220 100644
>> --- a/include/qemu-common.h
>> +++ b/include/qemu-common.h
>> @@ -370,6 +370,7 @@ static inline uint8_t from_bcd(uint8_t val)
>>  }
>>
>>  /* compute with 96 bit intermediate result: (a*b)/c */
>> +#ifndef __x86_64__
>>  static inline uint64_t muldiv64(uint64_t a, uint32_t b, uint32_t c)
>>  {
>>      union {
>> @@ -392,6 +393,18 @@ static inline uint64_t muldiv64(uint64_t a, uint32_t b, uint32_t c)
>>      res.l.low = (((rh % c) << 32) + (rl & 0xffffffff)) / c;
>>      return res.ll;
>>  }
>> +#else
>> +static inline uint64_t muldiv64(uint64_t a, uint32_t b, uint32_t c)
>> +{
>> +    uint64_t res;
>> +
>> +    asm ("mulq %2\n\tdivq %3"
>> +         : "=a"(res)
>> +         : "a"(a), "qm"((uint64_t) b), "qm"((uint64_t)c)
>> +         : "rdx", "cc");
>> +    return res;
>> +}
>> +#endif
>>
>
> Good idea.  However, if you have __int128, you can just do
>
>    return (__int128)a * b / c
>
> and the compiler should generate the right code.  Conveniently, there is
> already CONFIG_INT128 that you can use.
>
> Paolo

Well, it works, but in our case b <= c, so a * b / c is always <
2^64. This means the final division cannot overflow. However, the
compiler does not know this, so it performs the entire (a*b) / c
division, which mainly consists of two integer divisions instead of
one (not to mention that it is implemented using a helper
function).
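
For reference, here is a minimal sketch of the __int128 variant under
discussion, assuming a compiler that provides the GCC/Clang __int128
extension. The name muldiv64_int128, the unsigned 128-bit type and the
assert are illustrative choices and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (name is an assumption, not from the patch):
 * compute (a * b) / c using a 128-bit intermediate product.
 * Relies on the GCC/Clang __int128 extension.
 */
static inline uint64_t muldiv64_int128(uint64_t a, uint32_t b, uint32_t c)
{
    assert(c != 0);   /* division by zero is undefined */

    /* The product a * b needs at most 96 bits, so it fits in 128 bits. */
    unsigned __int128 product = (unsigned __int128)a * b;

    /*
     * When b <= c the quotient is at most a, so narrowing back to
     * 64 bits loses nothing. For b > c the result may be truncated.
     */
    return (uint64_t)(product / c);
}
```

The compiler cannot assume b <= c here, so it has to emit a general
128-by-64 division rather than the single divide that the bound would
allow.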

I think I'll write two patches: one implementing this with int128 as
you suggested (which is much easier to read than both the current code
and the assembly version), and another for the x86_64 optimization.

Frediano
