POV-Ray: Newsgroups: povray.programming: Improved intersection routine for CSG-Intersection objects



From: Thorsten Froehlich
Subject: Re: Improved intersection routine for CSG-Intersection objects
Date: 14 Dec 2003 17:40:30
Message: <3fdce6de@news.povray.org>

In article <3fdcd562@news.povray.org> , Warp <war### [at] tagpovrayorg>  wrote:

>   I don't really understand what smart pointers are useful for. In my
> opinion if you need to use a smart pointer, there's a flaw in your
> design.

You clearly haven't done a lot of GUI programming.  If you acquire resources
from the operating system, it is usually a good idea to keep them in a
specialised container that can release the resource if, for example, an
exception occurs.  The same applies when passing data to the operating
system.  This frequently requires low-level work on memory manually
allocated with new, and then auto_ptr can be really helpful! *
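A minimal sketch of that idea, under stated assumptions: Buffer and call_that_throws are hypothetical stand-ins for data handed to an operating system call, and std::unique_ptr is used in place of auto_ptr because auto_ptr was deprecated in C++11 and removed in C++17. The point it tries to show is only that the owning object releases the memory during stack unwinding.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for a buffer handed to an operating system call.
struct Buffer {
    static int live;            // number of Buffer objects currently alive
    Buffer()  { ++live; }
    ~Buffer() { --live; }
};
int Buffer::live = 0;

// Hypothetical call that fails halfway through.
void call_that_throws() { throw std::runtime_error("simulated failure"); }

// Returns the number of live buffers after the exception has been handled.
int live_buffers_after_failure() {
    try {
        std::unique_ptr<Buffer> buf(new Buffer);  // owns the manually allocated object
        assert(Buffer::live == 1);
        call_that_throws();                       // stack unwinding destroys buf
    } catch (const std::runtime_error&) {
        // nothing to clean up by hand
    }
    return Buffer::live;
}
```

With a raw pointer in place of buf, the catch block would have no way to reach the allocation, so the buffer would leak.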

    Thorsten

____________________________________________________
Thorsten Froehlich, Duisburg, Germany
e-mail: tho### [at] trfde

Visit POV-Ray on the web: http://mac.povray.org


From: Thorsten Froehlich
Subject: Re: Improved intersection routine for CSG-Intersection objects
Date: 14 Dec 2003 17:51:15
Message: <3fdce963@news.povray.org>

In article <3fdce685@news.povray.org> , Wolfgang Wieser <wwi### [at] gmxde>  
wrote:

> They _are_ less efficient. (Because even temporaries trigger allocation
> on the heap instead of the fast stack and because the con/destructors
> are called all the time increasing and decreasing reference counters,
> etc.)

Huh???  Nothing happens on the heap at all unless you actually need to copy
dynamically allocated memory.  And with modern compilers an assignment will
rarely cause a copy to be made.  Return value optimisation will take care of
almost all cases.  Anyway, this really isn't even necessary, as most STL
containers are designed to limit assignments and copies to cases where they
are really necessary.
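As a rough illustration of return value optimisation, the sketch below counts copy-constructor calls for a hypothetical Tracked type returned by value. Since C++17 the elision is required when a temporary is returned like this; older compilers were allowed to do it and commonly did, so the count is an expectation for typical compilers rather than a guarantee for every build setting.

```cpp
#include <cassert>

// Hypothetical type that counts how often it is copied.
struct Tracked {
    static int copies;
    int id;
    explicit Tracked(int i) : id(i) {}
    Tracked(const Tracked& other) : id(other.id) { ++copies; }
};
int Tracked::copies = 0;

// Returns a temporary by value; the compiler can construct it directly
// in the caller's storage instead of copying it.
Tracked make_tracked() { return Tracked(42); }

// Returns how many copies a single by-value return caused.
int copies_for_one_return() {
    Tracked::copies = 0;
    Tracked t = make_tracked();
    assert(t.id == 42);
    return Tracked::copies;
}
```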

> But _not_ for real simple types. For "I quickly need a vector", using
> double v[3] is still faster than anything using operator new or
> applying a fancy con/destructor.

Sorry, but if you need a local variable, who said you should use new???  Of
course if one uses new in a brain-dead manner it is easy to make a program
slow, but it is like using malloc for local variables in C - just something
nobody would do.

    Thorsten

____________________________________________________
Thorsten Froehlich, Duisburg, Germany
e-mail: tho### [at] trfde

Visit POV-Ray on the web: http://mac.povray.org


From: Wolfgang Wieser
Subject: Re: Improved intersection routine for CSG-Intersection objects
Date: 14 Dec 2003 17:52:57
Message: <3fdce9c8@news.povray.org>

Thorsten Froehlich wrote:

> In article <3fdcbe02@news.povray.org> , Wolfgang Wieser <wwi### [at] gmxde>
> wrote:
> 
>> Pointers are the most useful thing in C and you cannot get around
>> them in C++ if you want to keep fast performance.
>> All that guided stuff I've heard of so far is both slow & fat.
> 
> I don't know which implementation you are referring to, but neither the
> two leading compilers for Windows nor the two leading compilers for Mac OS
> have such a problem.  In fact, it is really hard to make auto_ptr use slow
> anywhere if the compiler supports at least the most trivial of
> optimisations...
> 
First of all, the copy constructor and assignment operators will 
introduce overhead:

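// simplified sketch of auto_ptr assignment (ownership transfer)
template<class T>
 auto_ptr<T>& auto_ptr<T>::operator=(auto_ptr<T>& rhs) 
 { 
   if (this != &rhs)   // guard against self-assignment
   { 
     delete pointer;   // <-- if(pointer) { destruct & free }
     pointer = rhs.pointer;   // take over ownership
     rhs.pointer = 0;         // rhs no longer owns the object
   } 
   return *this; 
 } 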

(compared to a simple "copy an integer" which is done at normal 
pointer assignment)

And then, auto_ptr may have some limits in usability because of the 
strict ownership design. Using smart_ptr instead will introduce 
some more overhead. 

Wolfgang


From: Wolfgang Wieser
Subject: Re: Improved intersection routine for CSG-Intersection objects
Date: 14 Dec 2003 18:01:29
Message: <3fdcebc9@news.povray.org>

Thorsten Froehlich wrote:

> In article <3fdce685@news.povray.org> , Wolfgang Wieser <wwi### [at] gmxde>
> wrote:
> 
>> They _are_ less efficient. (Because even temporaries trigger allocation
>> on the heap instead of the fast stack and because the con/destructors
>> are called all the time increasing and decreasing reference counters,
>> etc.)
> 
> Huh???  Nothing happens on the heap at all unless you actually need to
> copy dynamically allocated memory.  
>
The point was... (see below)

> And on modern compilers an assignment will rarely cause a copy to be
> made.  Return value optimisation will take care of almost all cases in
> modern compilers.  Anyway, this really isn't even necessary as most STL
> containers are designed to minimise assignments and copies to cases where
> it is really necessary.
> 
For inline functions this is true. Not so for functions with external linkage. 

The result is lots of inline code, which results in larger executables 
and possibly even in slower code. 

>> But _not_ for real simple types. For "I quickly need a vector", using
>> double v[3] is still faster than anything using operator new or
>> applying a fancy con/destructor.
> 
> Sorry, but if you need a local variable, who said you should use new??? 
>
So, if you need a local variable of just such a type we're talking about, 
i.e. a class which simply contains a pointer to an internal class, 
then the class will be allocated on the stack (okay, fast) and the 
internal class will be allocated using operator new on the heap. 

All that happens when you call the constructor and initialize your 
class. For the normal use, this is just what you want because when 
you pass the class to a function, it is quite fast (just increase 
reference counter). But if the object is simply a local var you want 
on the stack, then the overhead is considerable. 
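A small sketch of the kind of class meant here, with assumptions stated up front: SharedVector and VectorData are hypothetical names, and std::shared_ptr stands in for a hand-written reference counter. It counts how many internal objects get created for one local variable and one by-value call.

```cpp
#include <cassert>
#include <memory>

// Hypothetical internal representation; counts how many were ever created.
struct VectorData {
    static int allocations;
    double v[3];
    VectorData() : v{0.0, 0.0, 0.0} { ++allocations; }
};
int VectorData::allocations = 0;

// Handle class: the handle lives on the stack, its data lives on the heap.
class SharedVector {
public:
    SharedVector() : data_(std::make_shared<VectorData>()) {}
    long owners() const { return data_.use_count(); }
private:
    std::shared_ptr<VectorData> data_;
};

// Takes the handle by value: copying only bumps the reference count.
long owners_inside_call(SharedVector copy) { return copy.owners(); }

// Returns the number of VectorData objects created by one local variable
// plus one by-value call.
int allocations_for_local_and_call() {
    VectorData::allocations = 0;
    SharedVector local;                       // one heap allocation happens here
    assert(owners_inside_call(local) == 2);   // the copy shares the same data
    return VectorData::allocations;
}
```

The by-value call is cheap, but the local variable itself already cost one heap allocation, which a plain double v[3] would not.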

> Of course if one uses new in a brain-dead manner it is easy to make a
> program slow, but it is like using malloc for local variables in C - just
> something nobody would do.
> 
Correct. But if your class behaves that way because you're doing reference 
counting (see above), then there is little you can do about that. 

Wolfgang


From: Wolfgang Wieser
Subject: Re: Improved intersection routine for CSG-Intersection objects
Date: 14 Dec 2003 19:05:43
Message: <3fdcfad6@news.povray.org>

Thorsten Froehlich wrote:

> In article <3fdcbcc2@news.povray.org> , Wolfgang Wieser <wwi### [at] gmxde>
> wrote:
> 
>> I just have to emphasize the statement "if and only if used properly".
>> Using RTTI and exceptions, it is easy to get a really fat (in terms of
>> binary size) and slow program much faster than you think.
> 
> No, not at all!  RTTI is something you get for no runtime cost at all and
> the data size it adds is in the range of a few dozen bytes per class.
> Whether you can get exceptions for free depends a tiny bit on the
> architecture used, but in general the answer is yes as well.  In
> particular, if you don't throw any exceptions inside a function,
> exceptions will not cost anything.
> 
> The frequently found warning that RTTI and exceptions make programs slow
> is due to very early C++ compilers (we are talking about ten years old or
> more), which did not implement these features efficiently.  This is not
> the case for *any* recent compiler released in the past few years.
> 
I said "slow & fat". 

If exceptions are slower than a normal return, that won't hurt much 
because they are meant to be used as "exceptions". So, for exceptions 
my criticism was the size overhead. 

And for RTTI: I cannot imagine that a dynamic_cast has really little 
overhead, but I honestly hope that it does; I will run a test 
in the next few days. 

Okay, my observations quoted above were based on my experiences on the 
gcc-2.7.3.2 -> gcc-2.8 transition and may very well be outdated. 

So, I tried again using
gcc (GCC) 3.3.3 20031129 (prerelease)

"Normal compile" means that I compiled with -Os which is -O2 plus 
some minor size optimisations. For the other compile, I used 
-Os -fno-rtti -fno-exceptions. 

I tested RendView, POVRay-3.50c (with PRT patch) and another 
program which uses lots of dynamic alloc and ref counting (AniVision). 
None of these programs uses RTTI or exceptions: 

This table compares stripped binary size. 
                             RendView  POVRay  AniVision
Normal compile:               525780   960092   826900
Exceptions and RTTI disabled: 421764   898492   593476
Overhead:                     ca 20%   ca 10%   ca 40%

So, RTTI and exceptions still introduce considerable overhead even if 
not used. I'm not sure which of the two features is to blame for the 
increase and I won't test now because it's too late at night...
(While I would tolerate 10%, I don't want to live with the 40% 
unnecessary code for AniVision.) 
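The rounded percentages in the table can be recomputed from the listed sizes. The sketch below assumes the overhead is measured relative to the build with exceptions and RTTI disabled; measured that way the sizes give roughly 25%, 7% and 39%. Measuring relative to the normal build instead gives lower figures, so the choice of base matters when comparing.

```cpp
#include <cassert>

// Size increase of the normal build relative to the build compiled with
// -fno-rtti -fno-exceptions, in percent. Both sizes use the same unit.
double overhead_percent(double normal_size, double reduced_size) {
    assert(reduced_size > 0.0);
    return (normal_size - reduced_size) / reduced_size * 100.0;
}
```

For example, overhead_percent(826900, 593476) is the AniVision figure.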

These are just my observations on the impact of disabling exceptions 
and RTTI _when_not_being_used_ in the program. The impact when using 
them is more interesting, however. I'll do some tests later. 

Wolfgang


From: Wolfgang Wieser
Subject: Re: Improved intersection routine for CSG-Intersection objects
Date: 14 Dec 2003 19:08:37
Message: <3fdcfb84@news.povray.org>

Wolfgang Wieser wrote:

> This table compares stripped binary size.
>                              RendView  POVRay  AniVision
> Normal compile:               525780   960092   826900
> Exceptions and RTTI disabled: 421764   898492   593476
> Overhead:                     ca 20%   ca 10%   ca 40%
> 
I forgot to mention that the speed penalty is negligible: 
AniVision is 1% slower at "normal compile" -- probably due to 
caching issues and the like (because of larger binary size). 

Wolfgang


From: Thorsten Froehlich
Subject: Re: Improved intersection routine for CSG-Intersection objects
Date: 14 Dec 2003 19:27:49
Message: <3fdd0005$1@news.povray.org>

In article <3fdce9c8@news.povray.org> , Wolfgang Wieser <wwi### [at] gmxde>  
wrote:

> First of all, the copy constructor and assignment operators will
> introduce overhead:
>
> template<class T>
>  auto_ptr<T>& auto_ptr<T>::operator=(auto_ptr<T>& rhs)
>  {
>    if (this != &rhs)
>    {
>      delete pointer;   // <-- if(pointer) { destruct & free }
>      pointer = rhs.pointer;
>      rhs.pointer = 0;
>    }
>    return *this;
>  }

No, what you show introduces no overhead when used.  If you did the same
without an auto_ptr, you would need to write exactly the same code.
Consequently, there is no overhead - the code is just created for you rather
than you having to write it.
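To make the comparison concrete, here is a sketch that performs the same ownership transfer twice: once by hand with raw pointers and once with a smart pointer. Node and transfer are hypothetical names, and std::unique_ptr is used as the modern replacement for auto_ptr, which is an assumption of this example rather than something from the thread.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Node {
    int value;
    explicit Node(int v) : value(v) {}
};

// Hand-written ownership transfer between two raw pointers:
// the same steps the quoted operator= performs.
void transfer(Node*& dst, Node*& src) {
    if (&dst != &src) {
        delete dst;      // release what dst owned (deleting null is a no-op)
        dst = src;       // take over the pointer
        src = nullptr;   // src no longer owns anything
    }
}

// Runs both variants and returns the sum of the values that ended up
// in the two destinations.
int transfer_both_ways() {
    Node* a = new Node(1);
    Node* b = new Node(2);
    transfer(a, b);                       // a now owns Node(2), b is null
    assert(b == nullptr);
    int sum = a->value;
    delete a;                             // manual cleanup is still required

    std::unique_ptr<Node> p(new Node(3));
    std::unique_ptr<Node> q(new Node(4));
    p = std::move(q);                     // same steps, written by the library
    assert(!q);
    return sum + p->value;                // 2 + 4
}
```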

    Thorsten

____________________________________________________
Thorsten Froehlich, Duisburg, Germany
e-mail: tho### [at] trfde

Visit POV-Ray on the web: http://mac.povray.org


From: Thorsten Froehlich
Subject: Re: Improved intersection routine for CSG-Intersection objects
Date: 14 Dec 2003 19:40:44
Message: <3fdd030c@news.povray.org>

In article <3fdcebc9@news.povray.org> , Wolfgang Wieser <wwi### [at] gmxde>  
wrote:

>> And on modern compilers an assignment will rarely cause a copy to be
>> made.  Return value optimisation will take care of almost all cases in
>> modern compilers.  Anyway, this really isn't even necessary as most STL
>> containers are designed to minimise assignments and copies to cases where
>> it is really necessary.
>>
> For inline functions this is true. Not so for extern linkage.

You cannot have external linkage for template functions.

> The result is lots of inline code which results in larger executables
> and eventually even in slower code.

Only if you use containers of many different types!  Modern linkers will
eliminate multiple identical template function instances, thus you only see
the code bloat in the object files, but not in the final linked program.

>>> But _not_ for real simple types. For "I quickly need a vector", using
>>> double v[3] is still faster than anything using operator new or
>>> applying a fancy con/destructor.
>>
>> Sorry, but if you need a local variable, who said you should use new???
>>
> So, if you need a local variable of just such a type we're talking about,
> i.e. a class which simply contains a pointer to an internal class,
> then the class will be allocated on the stack (okay, fast) and the
> internal class will be allocated using operator new on the heap.

But only if there is data.  If properly designed, if there is no data, no
memory should be allocated.  Unless you use fixed-size data in C, you will
have exactly the same amount of work.  And there is nothing that keeps you
from using fixed-size data in C++, except that it is bad style, in both
languages.

> All that happens when you call the constructor and initialize your
> class. For the normal use, this is just what you want because when
> you pass the class to a function, it is quite fast (just increase
> reference counter). But if the object is simply a local var you want
> on the stack, then the overhead is considerable.

Why would you reference count the internal data of most classes?  You seem
to be thinking about some special case, but even then, how would a reference
counting C++ implementation be slower than a similar C implementation?

>> Of course if one uses new in a brain-dead manner it is easy to make a
>> program slow, but it is like using malloc for local variables in C - just
>> something nobody would do.
>>
> Correct. But if your class behaves that way because you're doing reference
> counting (see above), then there is little you can do against that.

Well, you should really not be doing reference counting inside a class in
the first place.  If you share data among several objects of the same class,
most of the time you have a serious design problem.  And again, the
reference counting would not work differently in C, so there is again no
difference in overhead!

    Thorsten

____________________________________________________
Thorsten Froehlich, Duisburg, Germany
e-mail: tho### [at] trfde

Visit POV-Ray on the web: http://mac.povray.org


From: Thorsten Froehlich
Subject: Re: Improved intersection routine for CSG-Intersection objects
Date: 14 Dec 2003 19:52:37
Message: <3fdd05d5@news.povray.org>

In article <3fdcfad6@news.povray.org> , Wolfgang Wieser <wwi### [at] gmxde>  
wrote:

> If exceptions are slower than normal return, that won't hurt much
> because they are meant to be used as "exceptions". So, for exceptions
> my criticism was the size overhead.
>
> And for RTTI: I cannot imagine that a dynamic_cast has really little
> overhead but I honestly hope that it is the case until I run a test
> in the next days.

Effectively you only need it when using multiple inheritance.  Of course, if
you depend on dynamic_cast heavily, you probably have some serious problem
understanding C++.
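One case where dynamic_cast is hard to avoid is a cast between two unrelated base classes of the same object. The sketch below uses hypothetical Shape, Textured, Sphere and TexturedBox types and assumes RTTI is enabled, which is the compiler default.

```cpp
#include <cassert>

struct Shape {
    virtual ~Shape() {}
};
struct Textured {
    virtual ~Textured() {}
    int texture_id = 7;
};
struct Sphere : Shape {};
struct TexturedBox : Shape, Textured {};

// Returns the texture id if the shape also derives from Textured, else -1.
int texture_of(Shape* s) {
    // Shape and Textured are unrelated bases, so static_cast cannot do this.
    Textured* t = dynamic_cast<Textured*>(s);
    return t ? t->texture_id : -1;
}
```

With single inheritance only, a virtual function on the base class usually does the same job without any cast.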

> Okay, my observations quoted above were based on my experiences on the
> gcc-2.7.3.2 -> gcc-2.8 transition and may very well be outdated.

One of the slowest known compilers around ... and outdated by half a decade.

> So, RTTI and exceptions still introduce considerable overhead even if
> not used.

Shitty compiler => shitty code!  You should really use professional
compilers to base your comparisons on.  Gcc's main feature is portability,
not performance.

    Thorsten

____________________________________________________
Thorsten Froehlich, Duisburg, Germany
e-mail: tho### [at] trfde

Visit POV-Ray on the web: http://mac.povray.org


From: Warp
Subject: Re: Improved intersection routine for CSG-Intersection objects
Date: 14 Dec 2003 21:27:26
Message: <3fdd1c0d@news.povray.org>

Wolfgang Wieser <wwi### [at] gmxde> wrote:
> They _are_ less efficient. (Because even temporaries trigger allocation 
> on the heap instead of the fast stack and because the con/destructors 
> are called all the time increasing and decreasing reference counters, 
> etc.) 

  How does using pointers directly alleviate the problem of temporaries
triggering allocation on the heap? I don't understand.

  And constructors and destructors are called when instances are copied.
The most common place to copy an instance is when a function takes such
a data container by value.
  Usually you don't make functions which take big data containers by value,
but by reference, so no constructors/destructors are called.
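  A quick sketch of the difference, using a hypothetical Samples wrapper that counts its own copies:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container wrapper that counts its own copies.
struct Samples {
    static int copies;
    std::vector<double> values;
    Samples() : values(1000, 1.0) {}
    Samples(const Samples& other) : values(other.values) { ++copies; }
};
int Samples::copies = 0;

std::size_t size_by_value(Samples s)            { return s.values.size(); }
std::size_t size_by_reference(const Samples& s) { return s.values.size(); }

// Returns the number of copies made by one call of the given kind.
int copies_for_call(bool by_value) {
    Samples data;
    assert(data.values.size() == 1000);
    Samples::copies = 0;
    if (by_value) size_by_value(data);      // copy constructor runs once
    else          size_by_reference(data);  // no constructor runs
    return Samples::copies;
}
```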

> But _not_ for real simple types. For "I quickly need a vector", using 
> double v[3] is still faster than anything using operator new or 
> applying a fancy con/destructor.

  But that's a *static* data type, not a dynamic one. Why would you
want to use a dynamic data container for a table of static size? That's
like shooting flies with a cannon.
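  For the static case, a class does not have to cost anything over the raw array. A minimal sketch, assuming a hypothetical Vec3 type; the size check holds on common platforms but is stated here as an expectation, not a language guarantee:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical fixed-size 3D vector: no operator new and no user-written
// constructor or destructor, just a wrapped double[3].
struct Vec3 {
    double v[3];
    double dot(const Vec3& o) const {
        return v[0] * o.v[0] + v[1] * o.v[1] + v[2] * o.v[2];
    }
};

static_assert(sizeof(Vec3) == 3 * sizeof(double), "no hidden storage");
static_assert(std::is_trivially_destructible<Vec3>::value, "no destructor work");

// Both vectors live entirely on the stack.
double dot_of_locals() {
    Vec3 a = {{1.0, 2.0, 3.0}};
    Vec3 b = {{4.0, 5.0, 6.0}};
    return a.dot(b);              // 4 + 10 + 18
}
```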

-- 
#macro N(D)#if(D>99)cylinder{M()#local D=div(D,104);M().5,2pigment{rgb M()}}
N(D)#end#end#macro M()<mod(D,13)-6mod(div(D,13)8)-3,10>#end blob{
N(11117333955)N(4254934330)N(3900569407)N(7382340)N(3358)N(970)}//  - Warp -

