  Re: SIMD implementation of dot-product in POV-Ray???  
From: Thorsten Froehlich
Date: 27 Nov 1999 11:24:50
Message: <384005d2@news.povray.org>
In article <383FCB6A.B6ABEC10@tidax.se> , Goran Begicevic <gor### [at] tidaxse>
wrote:

>>
>> This issue comes up every month or so; search back a bit through the
>> newsgroups and you will find your question answered.
>

> the conclusion on this issue in older threads?

In short, that the precision is not good enough. In addition, improving
high-level algorithms usually gives a more significant speedup without having
to use assembler.
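
As a rough illustration of the precision problem (a sketch, not POV-Ray code;
the function names bisect_float and bisect_double are made up for this
example), consider a root-finding loop whose tolerance was chosen with double
in mind. In float, the interval stops shrinking once its ends are adjacent
representable values, so the loop only ends because of the iteration cap.

```c
/* Hypothetical example, not taken from POV-Ray.
   Both functions bisect for sqrt(2) on [1, 2] and return the number of
   iterations used, or max_iter if the interval never got narrower than eps. */

int bisect_double(double eps, int max_iter)
{
    double lo = 1.0, hi = 2.0;
    int i = 0;
    while (hi - lo > eps && i < max_iter) {
        double mid = 0.5 * (lo + hi);
        if (mid * mid < 2.0)
            lo = mid;
        else
            hi = mid;
        i++;
    }
    return i;
}

int bisect_float(float eps, int max_iter)
{
    float lo = 1.0f, hi = 2.0f;
    int i = 0;
    /* Between 1 and 2 neighbouring floats are about 1.2e-7 apart, so
       hi - lo can never drop below a much smaller eps. */
    while (hi - lo > eps && i < max_iter) {
        float mid = 0.5f * (lo + hi);
        if (mid * mid < 2.0f)
            lo = mid;
        else
            hi = mid;
        i++;
    }
    return i;
}
```

With eps = 1e-10 the double version stops after a few dozen iterations, while
the float version runs until max_iter. Without the cap it would never return,
which is the kind of "hang" you get when code tuned for double is run in float.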

>> I doubt that just improving the dot product will speed things up in any
>> noticeable range at all.
>
> Well, run POV in a profiler and take a look where it's spending most of
> its time.

Hmm, did you ever do that?  A profiler will show you in which functions the
time is spent, but all vector operations in POV-Ray are macros.
Whenever I profiled, I found that POV-Ray spends a lot of time doing memory
allocations...
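
To make the macro point concrete, here is a minimal sketch. The names DBL,
VECTOR and VDot follow POV-Ray's naming style, but the definitions below are
written for illustration and are assumptions, not copied from the POV-Ray
source; shade_cosine is a made-up caller.

```c
/* Illustrative definitions only; assumed, not taken from POV-Ray. */
typedef double DBL;
typedef DBL VECTOR[3];
enum { X = 0, Y = 1, Z = 2 };

/* Dot product as a macro: the result is stored in a. */
#define VDot(a, b, c) \
    do { (a) = (b)[X] * (c)[X] + (b)[Y] * (c)[Y] + (b)[Z] * (c)[Z]; } while (0)

/* Hypothetical caller. The macro expands inline here, so a function-level
   profiler charges the multiplies and adds to shade_cosine(); there is no
   separate VDot entry in the profile. */
DBL shade_cosine(const VECTOR normal, const VECTOR light_dir)
{
    DBL cos_angle;
    VDot(cos_angle, normal, light_dir);
    return cos_angle < 0.0 ? 0.0 : cos_angle;
}
```

Because the dot product never shows up as its own function, a profile cannot
tell you directly how much time it costs; you only see the callers.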

> Now, i'm not so assembler-skilled. How wide is mm0,1,2,3 register? Is
> this done on 32-bit 'float' variables?

Yes, all the SIMD FPU instructions operate on 32-bit floats; there are no
64-bit float SIMD instructions.

> As far as i heard, Intels implementation of dot-product is even more
> 'automated' so you don't need to multiply registers 'by hand'. It's all
> being done in one command.

I am not very familiar with x86 assembler.

>> You do.  Define DBL as float and watch POV-Ray "hang" in several functions
>> because of the missing precision.
>
> Note that this is not my idea of how this should be done. I would keep
> all calculations as they are, and just rewrite dot-product function.
>
> 'double' would be converted into float prior to calculations and then
> converted back.

I am not sure if you can easily move data from the SISD FPU to the SIMD FPU
registers, that might take up more time than the actual SISD calculation.
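
The convert, compute, convert back idea can be sketched in plain C, without
any SIMD instructions, just to see what the rounding to float does to the
result. This is only an illustration under that assumption; dot_double and
dot_float are made-up names, not POV-Ray functions.

```c
/* Hypothetical helpers for illustration; not POV-Ray functions. */

/* Dot product computed entirely in double precision. */
double dot_double(const double a[3], const double b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* Same dot product, but every input is first rounded to a 32-bit float,
   as it would have to be before going into a single-precision SIMD
   register. The float result is widened back to double on return. */
double dot_float(const double a[3], const double b[3])
{
    float ax = (float)a[0], ay = (float)a[1], az = (float)a[2];
    float bx = (float)b[0], by = (float)b[1], bz = (float)b[2];
    float r = ax * bx + ay * by + az * bz;
    return (double)r;
}
```

For a = (1 + 1e-9, 1, 0) and b = (1, -1, 0) the double version returns a small
positive value of about 1e-9, while the float version returns exactly 0,
because 1 + 1e-9 rounds to 1.0f. Any test of the form "is the result greater
than zero" then gives a different answer, on top of whatever the conversions
themselves cost.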

> Well, we'll never know if we never try, right?

Well, of course there is nothing keeping you from trying it.  Just don't be
too disappointed if you don't see any speedup.


       Thorsten



Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.