Thorsten Froehlich <tho### [at] trf de> wrote:
> You asked what a fast SSE trigonometry implementation would look like, not
> what code your compiler generates when targeting a P4. So clearly you should
> not be looking at the x87 implementation using the fsincos opcode when you
> want to know what the SSE code would look like!?!
It's obviously telling me that whatever the SSE implementation might be,
it's *not* faster than (nor even as fast as) the fsincos opcode on my
computer, which contradicts your claim that it could be done more
efficiently in software. If it could be done more efficiently, wouldn't gcc
do just that when I instruct it to use SSE?
Is SSE different in x86_64 than it is in x86_32?
--
- Warp