Thorsten Froehlich <tho### [at] trf de> wrote:
> > I haven't really looked at what the gcc sincos library call is doing,
> > but it might well be that it just executes an fsincos opcode, and that
> > the time difference is coming from the overhead of the function call.
>
> You asked what a fast SSE trigonometry implementation would look like, not
> what code your compiler generates when targeting a P4. So clearly you should
> not be looking at the x87 implementation using the fsincos opcode when you
> want to know what the SSE code would look like!?!
Well, wasn't one of your points that doing trigonometry in software would be
more efficient than using dedicated hardware?