Warp wrote:
> Thorsten Froehlich <tho### [at] trf de> wrote:
>> You asked what a fast SSE trigonometry implementation would look like, not
>> what code your compiler generates when targeting a P4. So clearly you should
>> not be looking at the x87 implementation using the fsincos opcode when you
>> want to know what the SSE code would look like!?!
>
> It's obviously telling me that whatever the SSE implementation might be,
> it's *not* faster (nor even equally fast) than the fsincos opcode in my
> computer, which contradicts your claim that it could be done more efficiently
> in software. If it could be done more efficiently, wouldn't gcc do just
> that when I instruct it to use SSE?
Why do you argue with me about what Microsoft, Apple, Intel and AMD say? I
have no intention of discussing this any further, sorry. This is ridiculous! If
you don't know how to get the performance out of your compiled program that
Microsoft, Apple, Intel and AMD say is possible, then that is not my
problem. If you seriously believe Microsoft, Apple, Intel and AMD would
recommend ways to make software run slower on the latest x86 processors, then
believe it; I cannot change what you want to believe.
Thorsten