In article <383D4B2A.B376001E@tidax.se> , Goran Begicevic <gor### [at] tidaxse>
wrote:
> Hey, anybody considered using SIMD instructions embedded in
> new generations of processors??
> As far as i know, MMX is worthless, but there are some neat SIMD
> features in new P-III processors that could help.
This issue comes up every month or so. Search a bit back through the
newsgroups and you will find your question answered.
> Now, one of things that costs time to compute is dot-product. And dot
> product is something that is being used *a lot* in raytracing , to say
> at least.
I doubt that speeding up the dot product alone will make any noticeable
difference to overall render time.
> As far as i remember from peering into POV-Ray's source code, it's using
> "double" floating-point numbers. That's something like ~90 bits of
> precision.
By default, double uses 64 bits on x86, and there are good reasons to have
this precision.
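If you want to check the sizes on your own compiler, here is a small sketch in C. The function names are made up for illustration and are not part of POV-Ray. On a typical x86 compiler with IEEE 754 types, float reports 32 bits of storage with a 24-bit significand, and double reports 64 bits with a 53-bit significand.

```c
#include <float.h>
#include <limits.h>

/* Hypothetical helpers, not from the POV-Ray source. */

/* Storage width of each type in bits. */
int float_storage_bits(void)  { return (int)(sizeof(float)  * CHAR_BIT); }
int double_storage_bits(void) { return (int)(sizeof(double) * CHAR_BIT); }

/* Significand width in base-FLT_RADIX digits (bits when the radix
   is 2, as it is on x86), including the implicit leading bit. */
int float_significand_bits(void)  { return FLT_MANT_DIG; }
int double_significand_bits(void) { return DBL_MANT_DIG; }
```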
> As soon as i get some time , i'll try to convert POV-Ray dot-product
> algorithm to SIMD and take a look at the results.
This is taken from the matrix code in the AMD 3DNow SDK (so it uses AMD's
SIMD FPU extension, not Intel's), but for this purpose it will be enough:
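; eax and edx point to the two vectors (three single-precision floats each)
ALIGN 32
PUBLIC _a_dot_vect
_a_dot_vect PROC
    movq    mm0,[eax]       ; mm0 = (a.x, a.y)
    movq    mm3,[edx]       ; mm3 = (b.x, b.y)
    movd    mm1,[eax+8]     ; mm1 = (a.z, 0)
    movd    mm2,[edx+8]     ; mm2 = (b.z, 0)
    pfmul   mm0,mm3         ; mm0 = (a.x*b.x, a.y*b.y)
    pfmul   mm1,mm2         ; mm1 = (a.z*b.z, 0)
    pfacc   mm0,mm0         ; both halves of mm0 = a.x*b.x + a.y*b.y
    pfadd   mm0,mm1         ; low half of mm0 = full dot product
    ret
_a_dot_vect ENDP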
As you can see, making this change is rather trivial. The problem is that you
will need two versions of POV-Ray, one for AMD's extension and one for
Intel's. Besides that, in order to use single precision, you will likely have
to change the definition of DBL in the POV-Ray source from double to float.
Be aware that this is not as simple as it might seem...
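For comparison, here is roughly what the routine computes, written as a scalar sketch in C. The names dot3f and dot3d are made up for illustration and this is not POV-Ray's own code. The float version matches what the 3DNow code operates on. The double version is what you are left with if DBL stays double, and the assembly above cannot be used on it directly.

```c
/* Hypothetical helper: single-precision dot product of two
   3-component vectors, the same arithmetic as the 3DNow routine. */
float dot3f(const float a[3], const float b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* Hypothetical helper: the same dot product in double precision.
   The 3DNow instructions only handle packed single-precision
   values, so this version cannot use them as-is. */
double dot3d(const double a[3], const double b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
```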
> It's hard to say how
> it will look like , but my guess is that we don't need additional
> precision of "double" variable too often.
You do. Define DBL as float and watch POV-Ray "hang" in several functions
because of the missing precision.
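To give an idea of how a loop can stall with too little precision, here is a small sketch in C. It is not taken from the POV-Ray source, and the function names and the max_iter guard are assumptions made for the demonstration. Above 2^24 a float can no longer represent every integer, so adding 1.0f leaves the value unchanged and the loop never reaches its end condition.

```c
/* Hypothetical demo, not POV-Ray code: count the steps needed to
   walk from start to end, giving up after max_iter steps. */
long steps_float(float start, float end, float step, long max_iter)
{
    /* volatile forces each sum to be rounded to float, even on
       compilers that keep intermediates in wider x87 registers */
    volatile float t = start;
    long n = 0;
    while (t < end && n < max_iter) {
        t = t + step;
        n++;
    }
    return n;
}

/* The same loop in double precision. */
long steps_double(double start, double end, double step, long max_iter)
{
    volatile double t = start;
    long n = 0;
    while (t < end && n < max_iter) {
        t = t + step;
        n++;
    }
    return n;
}
```

Calling steps_float(16777216.0f, 16777220.0f, 1.0f, 1000) runs into the 1000-step guard because t never changes, while steps_double with the same arguments finishes in 4 steps. Without the guard the float loop would spin forever.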
Thorsten