  Re: One thing which puzzles me about unix/linux  
From: Darren New
Date: 8 Apr 2008 21:52:50
Message: <47fc2172$1@news.povray.org>
Warp wrote:

>   The problem with the default allocator is that it has to be prepared to
> allocate blocks of *any* size, and it should do it as memory-efficiently
> as possible (ie. wasting as little memory as possible).

I suspect this latter part is what makes it slower than it could be. Not 
that it's bad. It's just a trade-off.

>   This is even further complicated by the fact that the default memory
> allocator must be thread-safe.

Yeah. If you're willing to over-allocate, you can reduce the contention.

> Mutual exclusion locks cause considerable overhead. In single-threaded
> programs this overhead is useless, but there isn't an easy way around it.

Linking with a different allocator that doesn't try to lock anything? 
:-) Of course, you'd have to *tell* it you're thread-safe. And of 
course, threads aren't part of the language (are they?) while memory 
allocation is, so you're kind of hosed right there.

In Ada, for example, you could have a different heap for each thread, so 
you don't get that kind of lock contention, because Ada knows about 
threads. (And IIRC, the language rules forbid returning a pointer 
allocated in one thread and using it in another, which I guess is the 
bit that actually helps.)

>   Why would GC help this? Even GC'd memory must be prepared to allocate
> blocks of any size, and it must be thread-safe. I don't think GC would
> help these problems.

Allocations can be much faster, because you don't go looking for free 
space until you run out of space.  If you tend to allocate a bunch of 
stuff, then free almost all of it, assuming you don't have 
finalizers/destructors/whateveryoucallthem to worry about, you can clean 
up lots of memory without ever looking at it. (This is not uncommon in 
languages where everything is actually a heap-based object, like 
Smalltalk. You can measure down in the single-digit-percentages of 
objects that last more than a couple of seconds on such systems. Of 
course, in a hybrid system like C++ or Ada or something, one probably 
makes much less garbage because most of the temporary stuff *is* on the 
stack.)

So the primary sources of speed are that you never run thru free chains 
to find memory, and you never need to touch free memory when you free 
it. I.e., except for the GC itself, it's much simpler code than malloc 
and free.

Erlang, for example, has a bunch of different algorithms and a bunch of 
different allocators it switches between on the fly, depending on the 
behavior of your programs. And if I understand correctly, it'll use 
different allocators for different threads in the same process. Since 
threads don't share heaps (at least, not the fast-changing heaps) the 
threading problem isn't there. (They do apparently share heaps of large 
binary objects, as well as the "in-memory database" tables, which are 
also GCed.) And apparently it's too expensive to actually track which 
executable codes are pointed to from the stack, so you can only have two 
versions of the same code at the same time, and if you purge the old 
version, any process running the old version gets aborted first. But 
otherwise, you could just use the return addresses to keep the 
executable code live, except for the inefficiency of that.

So Erlang winds up with stuff like this,
http://erlang.org/doc/man/erts_alloc.html
because it can tell what type it's allocating for. :-) Which is why I 
thought it's a shame the C++ standard says it's essentially not allowed 
to take advantage of type information in "new". Of course, Erlang is 
instrumented out the wazoo, because it's designed for systems you never 
ever turn off, so you really do want to get messages if some program 
starts using too much memory.

>   If I'm not completely mistaken, the major reasons why my allocator is
> so much faster than the default allocator 

Please understand that I wasn't commenting specifically on your 
allocator. Earlier you asked some stuff about GC, so I thought I'd just 
mention some generic GC wisdom here. Nothing in this thread should 
specifically be about any particular implementation of GC or manual 
allocators.

-- 
   Darren New / San Diego, CA, USA (PST)
     "That's pretty. Where's that?"
          "It's the Age of Channelwood."
     "We should go there on vacation some time."

