  | 
  | 
 
 | 
  | 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
From: Ryan Lamansky 
Subject: Alternative Box Intersection Algorithm 
Date: 14 Sep 2004 14:13:16 
Message: <414734bc@news.povray.org> 
 | 
 
 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
In my studies of POV-Ray's source, I've discovered an alternative box 
intersection algorithm.  This one is far less branchy and far less prone 
to floating-point floatiness problems.
The only C++ compiler I have is Microsoft's (VS.NET 2003), so I'm not 
able to compare the speed with the official version (which uses Intel's 
compiler).  Theoretically, it should be faster because it's so much 
simpler.  Some casual tests on image quality (like StackerDay.pov) 
indicate that this has at least the same quality as the old code.
This is a complete replacement for the body of the Intersect_Box 
function in boxes.cpp.
Thoughts and opinions are welcome :)
-Ryan
int smin = 0, smax = 0;    /* Side hit for min/max intersection. */
   DBL t, tmin, tmax;
   VECTOR P, D;
   /* Transform the point into the boxes space */
   if (Trans != NULL)
   {
     MInvTransPoint(P, Ray->Initial, Trans);
     MInvTransDirection(D, Ray->Direction, Trans);
   }
   else
   {
     Assign_Vector(P, Ray->Initial);
     Assign_Vector(D, Ray->Direction);
   }
   DBL tYMin, tYMax, tZMin, tZMax;
           //Based on http://www.cs.utah.edu/~rmorley/pubs/box.pdf
           //Credits go to Amy L. Williams, Steve Barrus,
           //R. Keith Morley, and Peter Shirly
           //for developing this technique.
           // Adapted for use with POV-Ray by Ryan Lamansky
           smin = SIDE_X_0;
           smax = SIDE_X_1;
           DBL //Provides a free way out of the negative zero problem.
                    divx = 1/D[X],
                    divy = 1/D[Y];
           if (divx >= 0)
           {
                    tmin = (Corner1[X] - P[X]) * divx;
                    tmax = (Corner2[X] - P[X]) * divx;
           }
           else
           {
                    tmin = (Corner2[X] - P[X]) * divx;
                    tmax = (Corner1[X] - P[X]) * divx;
           }
           if (divy >= 0)
           {
                    tYMin = (Corner1[Y] - P[Y]) * divy;
                    tYMax = (Corner2[Y] - P[Y]) * divy;
           }
           else
           {
                    tYMin = (Corner2[Y] - P[Y]) * divy;
                    tYMax = (Corner1[Y] - P[Y]) * divy;
           }
           if (tmin > tYMax || tYMin > tmax)
                    return false;
           if (tYMin > tmin)
           {
                    tmin = tYMin;
                     smin = SIDE_Y_0;
           }
           if (tYMax < tmax)
           {
                    tmax = tYMax;
                    smax = SIDE_Y_1;
           }
           DBL divz = 1/D[Z];
           if (divz >= 0)
           {
                    tZMin = (Corner1[Z] - P[Z]) * divz;
                    tZMax = (Corner2[Z] - P[Z]) * divz;
           }
           else
           {
                    tZMin = (Corner2[Z] - P[Z]) * divz;
                    tZMax = (Corner1[Z] - P[Z]) * divz;
           }
           if (tmin > tZMax || tZMin > tmax)
                    return false;
           if (tZMin > tmin)
           {
                    tmin = tZMin;
                    smin = SIDE_Z_0;
           }
           if (tZMax < tmax)
           {
                    tmax = tZMax;
                    smax = SIDE_Z_1;
           }
   *Depth1 = tmin;
   *Depth2 = tmax;
   *Side1 = smin;
   *Side2 = smax;
           return tmax > EPSILON; //Prevents same-point self-shadowing.
 Post a reply to this message 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
Ryan Lamansky wrote:
> In my studies of POV-Ray's source, I've discovered an alternative box 
> intersection algorithm.  This one is far less branchy and far less prone 
> to floating-point floatiness problems.
> 
> The only C++ compiler I have is Microsoft's (VS.NET 2003), so I'm not 
> able to compare the speed with the official version (which uses Intel's 
> compiler).
Let me get this straight - you can't compare the speed because you can 
only compile your patched version but not the unmodified code?
Concerning your code - you do:
   divx = 1/D[X]
without checking if D[X] is zero for example (the paper you cited 
clearly states that it relies on the IEEE floating point standard for this).
Christoph
-- 
POV-Ray tutorials, include files, Sim-POV,
HCR-Edit and more: http://www.tu-bs.de/~y0013390/
Last updated 06 Jul. 2004 _____./\/^>_*_<^\/\.______
 
 Post a reply to this message 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
Ryan Lamansky <Spa### [at] kardax com> wrote:
> Theoretically, it should be faster because it's so much simpler.
  Code simplicity is in no way a measurement of speed. (For example,
insertion sort is much simpler than quicksort, yet it's also much
slower.)
  I'm not saying your code is not faster, I'm just saying that you
should be careful on what you claim. :)
-- 
#macro M(A,N,D,L)plane{-z,-9pigment{mandel L*9translate N color_map{[0rgb x]
[1rgb 9]}scale<D,D*3D>*1e3}rotate y*A*8}#end M(-3<1.206434.28623>70,7)M(
-1<.7438.1795>1,20)M(1<.77595.13699>30,20)M(3<.75923.07145>80,99)// - Warp -
 
 Post a reply to this message 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
Christoph Hormann wrote:
> Let me get this straight - you can't compare the speed because you can 
> only compile your patched version but not the unmodified code?
I can compile the original and the modified version of POV-Ray with 
VS.NET 2003... but the results are inconclusive.  In my limited testing, 
both builds look exactly the same, perform exactly the same (even with 
the Vista Buffer off), and still lose to the Official version.
Despite much effort with compiler tweaking, I've never gotten a VS.NET 
2003 build of POV-Ray 3.6 that defeated an official release.
> Concerning your code - you do:
> 
>   divx = 1/D[X]
> 
> without checking if D[X] is zero for example (the paper you cited 
> clearly states that it relies on the IEEE floating point standard for 
> this).
Yes.  The result of a 1.0/0.0 in IEEE floating point is 
PositiveInfinity.  The math related to infinite values is interesting, 
but doesn't break the algorithm in any way.
-Ryan
 
 Post a reply to this message 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
Ryan Lamansky wrote:
> 
> 
> Yes.  The result of a 1.0/0.0 in IEEE floating point is 
> PositiveInfinity.  The math related to infinite values is interesting, 
> but doesn't break the algorithm in any way.
But it does not work on non-IEEE floating point systems.   The C 
language standards do not guarantee this to work like you expect it.
Christoph
-- 
POV-Ray tutorials, include files, Sim-POV,
HCR-Edit and more: http://www.tu-bs.de/~y0013390/
Last updated 06 Jul. 2004 _____./\/^>_*_<^\/\.______
 
 Post a reply to this message 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
>   Code simplicity is in no way a measurement of speed. (For example,
> insertion sort is much simpler than quicksort, yet it's also much
> slower.)
That's true, but in this case there simply isn't much work going on.  A 
few ifs, a few comparisons, and a few math operations.  There's really 
no way you can look at it and claim that it's slow :)
>   I'm not saying your code is not faster, I'm just saying that you
> should be careful on what you claim. :)
When I initially discovered this algorithm, I ran some tests on both the 
  alternative and the original in C#.  The alternative one was more than 
twice as fast in that environment.  Since the Intersect_Box function 
could be called millions of times in a render, I was hoping to see 
_some_ kind of improvement when converted to POV-Ray.
In spite of the mysterious absence of any performance difference, I 
still think it's "good" code because it's much simpler, and it's much 
more accurate since the EPSILON test is delayed until the very end.  A 
possible drawback is that IEEE floating point behavior is required--I 
don't have enough experience with the various POV-Ray-supported 
platforms to know if this is a problem.
-Ryan
 
 Post a reply to this message 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
> But it does not work on non-IEEE floating point systems.   The C 
> language standards do not guarantee this to work like you expect it.
How about C++?
(I'm not trying to be sarcastic--I really don't know)
-Ryan
 
 Post a reply to this message 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
Ryan Lamansky <Spa### [at] kardax com> wrote:
> > But it does not work on non-IEEE floating point systems.   The C 
> > language standards do not guarantee this to work like you expect it.
> How about C++?
  Most of the C standard holds for C++ as well.
-- 
plane{-x+y,-1pigment{bozo color_map{[0rgb x][1rgb x+y]}turbulence 1}}
sphere{0,2pigment{rgbt 1}interior{media{emission 1density{spherical
density_map{[0rgb 0][.5rgb<1,.5>][1rgb 1]}turbulence.9}}}scale
<1,1,3>hollow}text{ttf"timrom""Warp".1,0translate<-1,-.1,2>}//  - Warp -
 
 Post a reply to this message 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
From: Thorsten Froehlich 
Subject: Re: Alternative Box Intersection Algorithm 
Date: 15 Sep 2004 14:51:18 
Message: <41488f26@news.povray.org> 
 | 
 
 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
In article <41476f28$1@news.povray.org> , Ryan Lamansky <Spa### [at] kardax com>
wrote:
> In spite of the mysterious absence of any performance difference, I
> still think it's "good" code because it's much simpler, and it's much
> more accurate since the EPSILON test is delayed until the very end.  A
> possible drawback is that IEEE floating point behavior is required--I
> don't have enough experience with the various POV-Ray-supported
> platforms to know if this is a problem.
The algorithm you show is hardly any different for the miss case.  It
exchanges two divisions for one division and two multiplications in the best
case, but exchanges one divisions against two divisions and one
multiplication in the worst case for the first if/else block alone.
And all your code assumes division is so much slower that it can offset the
difference in the average case.  I seriously doubt it can by a serious
margin except for specially constructed cases because the average case is a
miss of a bounding box in a reasonably complex scene.
Your point about EPSILON is, well, pointless because you trade the division
by zero for the EPSILON comparison, which produces little if any gain in
precision.
However, I do agree that the complexity of the Z if/else with its case could
be better implemented.  This is something the algorithm you show has
optimized very well.  However, this is also the code most infrequently
executed.
I think the main reason you see no real difference is because there simply
is none to be measured.  The code parts that are better are not important
enough to show a difference in complex scenes, and simple scenes render fast
already :-)
If you are interested in optimizing POV-Ray, I would recommend to profile
first to find where most time is wasted.
    Thorsten
____________________________________________________
Thorsten Froehlich, Duisburg, Germany
e-mail: tho### [at] trf de
Visit POV-Ray on the web: http://mac.povray.org
 
 Post a reply to this message 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
From: Andrew C on Mozilla 
Subject: Re: Alternative Box Intersection Algorithm 
Date: 15 Sep 2004 15:24:10 
Message: <414896da$1@news.povray.org> 
 | 
 
 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
>   Most of the C standard holds for C++ as well.
Most?? :-|
OK, I'm scared...
Andrew @ home.
Oh... wait a sec... I don't *use* C or C++... *relief*
 
 Post a reply to this message 
 | 
  | 
 
 |   |  
 |   |  
 | 
  | 
 | 
  | 
 
 |   |  
 
 | 
  |