POV-Ray: Newsgroups: povray.off-topic: Bounding circle intersection: Re: Bounding circle intersection

POV-Ray : Newsgroups : povray.off-topic : Bounding circle intersection : Re: Bounding circle intersection		Server Time 19 Mar 2026 04:38:24 EDT (-0400)

From: Nicolas Alvarez
Date: 10 Dec 2009 20:49:35
Message: <4b21a52f@news.povray.org>

Kevin Wampler wrote:

> I wrote a small bit of code to test what the speed of the various
> approximation methods was.  The code was complied with g++ and the times
> given for each method are for 100000000 calls to the function.  First,
> the functions I tested:
> 
> // Naive method
> inline bool isect1(double aSq, double bSq, double dSq) {
>      return sqrt(dSq) < sqrt(aSq) + sqrt(bSq);
> }
> 
> // Slime's approximate method
> inline bool isect2(double aSq, double bSq, double dSq) {
>      return dSq < aSq + bSq + 2*std::max(aSq,bSq);
> }
> 
> // Suggested by bart, but with a bugfix.
> // Avoids a sqrt, but uses a few multiplies
> inline bool isect3(double aSq, double bSq, double dSq) {
>      double t1 = aSq+bSq;
>      double t2 = dSq - t1
>      return (dSq < t1) || (4*aSq*bSq > t2*t2);
> }
> 
> // Naive method using a fast approximate sqrt
> inline bool isect4(double aSq, double bSq, double dSq) {
>      return fastSqrt(dSq) < fastSqrt(aSq) + fastSqrt(bSq);
> }
> 
> and now the times under various optimization levels:
> 
> 
> ---------- O0
> time for isect1 = 9.66935
> time for isect2 = 2.03143
> time for isect3 = 2.18224
> time for isect4 = 7.03525
> 
> ---------- O1
> time for isect1 = 5.13777
> time for isect2 = 0.413757
> time for isect3 = 0.413756
> time for isect4 = 0.0376559
> 
> ---------- O2
> time for isect1 = 0.564358
> time for isect2 = 0.276049
> time for isect3 = 0.305847
> time for isect4 = 0.0381372
> 
> ---------- O3
> time for isect1 = 0.564529
> time for isect2 = 0.159936
> time for isect3 = 0.206905
> time for isect4 = 0.0376384
> 
> 
> It looks like my initial suggestion implemented in isect4 is the fastest
> when you have optimization turned on, by a margin which rather surprises
> me.  It takes about as long as simply computing the incorrect formula
> 'dSq < aSq + bSq', which seems almost impossibly fast.  Nevertheless
> I've checked the results and it seems to work pretty well despite the
> approximation in the sqrt call.  If you don't want to live with the
> approximation, however, bart's method seems to be the best choice.

How are you doing the "run 100000000 times" part?

The compiler may be optimizing knowing you're runing it on the same input 
values each time. For example, only calling sqrt() 3 times and not 
300000000.

Post a reply to this message