POV-Ray : Newsgroups : povray.off-topic : A tale of two cities : Re: A tale of two cities Server Time
29 Jul 2024 06:16:04 EDT (-0400)
  Re: A tale of two cities  
From: Warp
Date: 13 Mar 2012 13:20:55
Message: <4f5f81f5@news.povray.org>
Invisible <voi### [at] devnull> wrote:
> > In the case of a huffman coder, I really
> > don't understand what do you need abstract classes for. (There *is* such
> > a thing as over-designing, when talking about OOP.)

> The most obvious way to implement this is to physically build a binary 
> tree. That requires either inheritance, or else doing something sly with 
> unions. (Or even being /really/ dirty and just having one node type, and 
> ignoring the unused fields.)

  I don't see why you need different types of objects for such a binary
tree. Each object can be identical to each other in structure. (If in
some nodes some of the fields are unused, so what? It will still be
significantly more efficient than dynamic binding.)

  You don't need to mess with unions unless you really, really need to
make the objects as small as physically possible. However, making them
that small is completely moot if you are going to allocate them
individually anyways (the allocation bookkeeping overhead will nullify
any space saving you could get from shaving off a few bytes using unions.)
It would only make a significant difference if you used arrays of
objects as your tree data structure, rather than allocating objects
individually, and there were tons and tons of elements.

> Now for the /encoding/ part itself, I don't actually need a physical 
> tree. I have a vague recollection that last time I did this, I ended up 
> cheating and just storing a vector of codewords, and manually munging 
> that. It's a lot less elegant, but avoids dynamic allocation. (Or 
> rather, no it doesn't, but it hides it in STL containers, so you don't 
> have to do it manually.)

  Using an array of objects (rather than individually allocated objects)
doesn't completely avoid memory allocation, but it minimizes their amount,
which is often very significant. (In other words, if your program makes
20 allocations instead of a million, it will have a significant impact
on the speed of the program.)

> This application is just supposed to be the "warm up" for my next 
> project, which will involve building complex expression trees. I really 
> don't see how you can get away with no dynamic binding when each node 
> type needs to be processed differently. Unless you go all non-OO and try 
> to implement node type dispatch manually...

  If you really must have nodes of differing types, then it will become
complicated from the point of view of memory management. Lack of garbage
collection can be a b**** sometimes. (Not that it's impossible; it just
requires a lot of care and strict coding practices to get it right. This
is something you usually want to avoid altogether, if possible.)

  Of course your nodes could simply be (smart) pointers of a base class
type, but then you will be making twice as many allocations (because for
each node you will be allocating the node itself, which is a smart pointer,
*and* the object that the node is pointing to), which is even worse from
an efficiency point of view. It becomes simpler like this, but don't expect
extreme speed. (Haskell probably would beat C++ in this situation.)

  It's sometimes quite a bummer, but when writing efficient C++ you have
to always think "how many memory allocations will this perform?" (even,
and especially, when using STL data containers). The less individual
memory allocations are done, the better. (But of course that's not the
only thing that affects speed, but it's the one major thing not related
to the algorithm itself that does.)

  I have to grant that languages like Java are better at this because
there you don't have to worry about how many allocations are performed
(as JVM's have a significantly faster memory allocation than the C/C++
runtime.)

> >    In many cases if you design your program such that you don't need
> > dynamic binding, your program will often become more efficient in terms
> > of both speed and memory usage.

> I guess to some extent it depends on whether your goals are efficiency 
> or maintainability.

  Avoiding dynamic binding does not automatically mean that your program
becomes more complicated and harder to maintain. In fact, it can sometimes
mean the opposite. Sometimes your programs actually become *simpler* and
much easier to understand. Full OOP (beyond simple encapsulation using
classes) is not always the universal panacea for clarity and maintainability.
(Of course this depends on each individual case.)

> Certainly the code I'm trying to write will probably 
> take milliseconds to execute, whichever way I implement it. If I was 
> writing something high-performance, I might care about such things.

  But even then you have to think which design is easiest in terms of
memory management. The less you have to manage memory manually, the
better. Your program becomes safer and simpler.

> The problem is, I want to copy some data, but since I don't know what 
> class it belongs to, I have no idea how big it will be.

  I didn't understand your problem from that description alone.

> >    Now, writing a class that properly manages its dynamically allocated
> > memory is a non-trivial task of its own.

> Is that not just a case of allocating the memory correctly in the 
> constructor, and freeing it properly in the destructor? (I mean, 
> assuming nobody tries to copy the object or free its dynamic memory or 
> anything like that...)

  The simplest design is to delete the allocated memory in the destructor
(it doesn't really matter where inside the class the memory is allocated
as long as the pointer is kept as 0 when there's nothing allocated) and
then disabling the copy constructor and assignment operators. (In C++98
this is done by declaring them private and not implementing them. In C++11
you can explicitly express this with a keyword.)

  Often this is enough. However, sometimes you would want to be able
to copy and assign objects of that class type around. That's where the
non-triviality kicks in.

  The first problem is: How should the object be copied? There are many
possibilities:

- Perform a deep copy. Every time the object is copied or assigned, all
of the data it's managing is copied. This is the easiest way of managing
copying, but it can be pretty inefficient with large amounts of data (and
because each time you make a copy you need an additional 'new').

- Make the class act as a reference to the managed data, in which case each
copy *shares* the managed data. In this case you need to use reference
counting to know when to delete the data. You also have to design your
program taking into account that the object is acting as a reference to
the data, not as the data itself. (This can work pretty well if your
data object is considered immutable.)

- Perform lazy copying. In other words, copying/assigning the object does
not yet deep-copy the data. Only when the data is modified it will be
copied if it was being shared between more than one object. This makes
copying and assigning more efficient, but accessors less.

- The data is *moved* from the original to the copy. This is very efficient
(because you don't need reference counting or anything else) but can be
quite limiting (because you are not actually copying anything, just moving
the data from one object to another).

  The second problem is that implementing these is not completely trivial
and requires attention to detail and knowing the traps.

-- 
                                                          - Warp


Post a reply to this message

Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.