On 22-8-2009 9:54, Warp wrote:
> clipka <ano### [at] anonymousorg> wrote:
>> I would think that a good optimizing compiler would translate something
>> like "x = -x;" to equally efficient code...?
>
> Out of curiosity, I tested how gcc compiles this function with maximum
> optimizations (-O3 -ffast-math -march=native):
>
> void negate(double* values, unsigned amount)
> {
> for(unsigned i = 0; i < amount; ++i)
> values[i] = -values[i];
> }
>
> The relevant part of the asm output was:
>
> xorl %eax, %eax
> .L7:
> fldl (%edi,%eax,8)
> fchs
> fstpl (%edi,%eax,8)
> addl $1, %eax
> cmpl %eax, %esi
> ja .L7
>
> So it is loading the values into the FPU one by one, negating them there
> and then storing them back into RAM.
For comparison, how would you code the sign-bit-flipping routine (XORing
the sign bit of the raw IEEE-754 representation)?
Why would you expect that to be faster?
Would the answer differ on 32- and 64-bit architectures?
Would the result differ for the exceptional cases: NaN, inf, subnormal
numbers and 0? If so, which implementation is more 'correct'?