clipka wrote:
> Thorsten Froehlich <tho### [at] trf de> wrote:
>>> I haven't really looked at what the gcc sincos library call is doing,
>>> but it might well be that it just executes an fsincos opcode, and that
>>> the time difference is coming from the overhead of the function call.
>> You asked what a fast SSE trigonometry implementation would look like, not
>> what code your compiler generates when targeting a P4. So clearly you should
>> not be looking at the x87 implementation using the fsincos opcode when you
>> want to know what the SSE code would look like!?!
>
> Well, wasn't one of your points that doing trigonometrics in software would be
> more efficient than using dedicated hardware?
It was not my point; it is the point made by AMD and Intel, and also the
approach of pretty much all other CPU vendors. I am the messenger, so don't
shoot me if you don't like the message ;-) And I might add that testing
something - who knows what - on a single seven-year-old x86 processor and
then claiming Intel and AMD are not telling the truth is a bit "odd"...
Thorsten