Re: SSE2 optimization of Intersect_Triangle function
From: raven
Date: 26 Jan 2005 11:00:00
Message: <web.41f7bd6a4017f8d86e5b7ea20@news.povray.org>
We tested this function with one large triangle. Here is the POV-Ray code for it:
------------------------------------------------
#include "colors.inc"
camera {location <2.0 , 0.0 , 0.0>    // for X
//camera {location <2.0 , 0.0 , 0.0>  // for Y
//camera {location <2.0 , 0.0 , 0.0>  // for Z
        look_at  <0.0 , 0.0 , 0.0>}
light_source{<1,2,-2> color White}

triangle {
 <-11,-6,-8>,<-11,6,0>,<-11,-6,8>   // for X
 //<-8,-11,-6>,<0,-11,6>,<8,-11,-6> // for Y
 //<-8,-6,-11>,<0,6,-11>,<8,-6,-11> // for Z
 texture{
  pigment{color rgb<1,0.5,0>}
  finish{ambient 0.15 diffuse 0.85}
 }
}
------------------------------------------------

Rendering this triangle at the standard resolution (300x300) calls
Intersect_Triangle 109676 times, and the function returns true 32678 times.
We tested each case (X, Y, Z) separately because the code differs slightly
between them.
We ran the original gcc-compiled function and our optimized one 1000 times
each and counted the number of processor cycles. We then took the 100 best
results for each function and calculated the average number of cycles. Here
are our results:
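
The selection step described above (keep the fastest runs, then average them) can be sketched in C as follows. This is only an illustration of the statistic, not the harness used for these measurements: the name mean_of_best is hypothetical, and the cycle counts are assumed to have been collected already, for example from the processor's time-stamp counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Ascending-order comparator for qsort. */
static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: sorts the samples in place and returns the mean
   of the k smallest ones (the k fastest runs). */
static double mean_of_best(uint64_t *samples, size_t n, size_t k)
{
    assert(k > 0 && k <= n);
    qsort(samples, n, sizeof samples[0], cmp_u64);

    double sum = 0.0;
    for (size_t i = 0; i < k; ++i)
        sum += (double)samples[i];
    return sum / (double)k;
}
```

With the numbers from this post it would be called as mean_of_best(cycles, 1000, 100). Keeping only the fastest runs presumably filters out runs disturbed by interrupts or other system activity.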

gcc version for X: 74.1 million cycles
SSE2 version for X: 59.2 million cycles (20.10% better)
gcc version for Y: 76.3 million cycles
SSE2 version for Y: 61.0 million cycles (20.00% better)
gcc version for Z: 75.5 million cycles
SSE2 version for Z: 59.8 million cycles (20.79% better)
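
The percentages read as the share of cycles saved relative to the gcc version. A minimal sketch of that calculation, assuming this interpretation (the helper name percent_saved is hypothetical):

```c
#include <assert.h>

/* Hypothetical helper: percentage of cycles saved relative to the baseline. */
static double percent_saved(double baseline_cycles, double optimized_cycles)
{
    assert(baseline_cycles > 0.0);
    return 100.0 * (baseline_cycles - optimized_cycles) / baseline_cycles;
}
```

Applied to the rounded cycle counts listed above, this gives roughly 20.1, 20.1 and 20.8, which agrees with the quoted percentages to within the rounding of the inputs.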

The gain comes mainly from minimizing the number of reads from memory and,
more generally, the number of instructions.
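
The optimized routine itself is not included in this message. As a generic illustration of how SSE2 can reduce memory reads and instruction count on 3-component vectors, here is a sketch of a dot product, assuming double-precision components. The name dot3 is hypothetical; this is not our Intersect_Triangle code and not POV-Ray's own API, and it falls back to plain scalar code when SSE2 is not available.

```c
#include <assert.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: dot product of two 3-component double vectors. */
static double dot3(const double a[3], const double b[3])
{
#if defined(__SSE2__)
    /* One unaligned 128-bit load fetches x and y together. */
    __m128d axy = _mm_loadu_pd(a);
    __m128d bxy = _mm_loadu_pd(b);
    __m128d pxy = _mm_mul_pd(axy, bxy);          /* (ax*bx, ay*by) */

    /* z is multiplied in the low lane only. */
    __m128d pz = _mm_mul_sd(_mm_load_sd(a + 2), _mm_load_sd(b + 2));

    /* Bring ay*by down to the low lane, then sum the three products. */
    __m128d hi = _mm_unpackhi_pd(pxy, pxy);
    __m128d sum = _mm_add_sd(_mm_add_sd(pxy, hi), pz);
    return _mm_cvtsd_f64(sum);
#else
    /* Scalar fallback for targets without SSE2. */
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
#endif
}
```

In the SSE2 path each vector is read with two loads instead of three, and one packed multiply replaces two scalar ones. Whether this pays off in practice depends on the compiler and the surrounding code.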


