POV-Ray : Newsgroups : povray.off-topic : One thing which puzzles me about unix/linux
  One thing which puzzles me about unix/linux (Message 11 to 18 of 18)
From: Darren New
Subject: Re: One thing which puzzles me about unix/linux
Date: 7 Apr 2008 17:02:02
Message: <47fa8bca@news.povray.org>
Tom Austin wrote:
> I would check to see if there is a 'file' for the specific process that 
> reports the current memory usage.

The problem with that approach is that you're just polling. There's no 
good way to catch the maximum memory usage if the process only uses that 
much for 1/100th of a second.

-- 
   Darren New / San Diego, CA, USA (PST)
     "That's pretty. Where's that?"
          "It's the Age of Channelwood."
     "We should go there on vacation some time."



From: Warp
Subject: Re: One thing which puzzles me about unix/linux
Date: 7 Apr 2008 18:01:59
Message: <47fa99d7@news.povray.org>
Darren New <dne### [at] sanrrcom> wrote:
> Tom Austin wrote:
> > I would check to see if there is a 'file' for the specific process that 
> > reports the current memory usage.

> The problem with that approach is that you're just polling. There's no 
> good way to catch the maximum memory usage if the process only uses that 
> much for 1/100th of a second.

  This is a real bummer. I would have liked accurate memory-usage
statistics for the small project I have been working on, but it seems to
be impossible.

  If anyone happens to be interested, it's this:
http://warp.povusers.org/FSBAllocator/

  From all the tests I have made regarding memory usage, I'm pretty
confident that with the list-of-ints example my allocator uses half as
much memory as the default allocator of libc, but I can't tell for sure
because of all these unknowns, as well as some erratic behavior I noticed
during my tests. Thus I couldn't add this statistic to the page,
unfortunately.

  It's interesting, though. Sometimes C++ *can* actually be slower than
some other languages, and the reason may be the memory allocator used
by the compiler (or, more specifically, by the standard C library).
It's not really so much a fault of C++ per se as a fault of the
system library tasked with memory management.
  Using a more efficient memory allocator can cause amazing speedups, as
I wrote on that page.

-- 
                                                          - Warp



From: Darren New
Subject: Re: One thing which puzzles me about unix/linux
Date: 8 Apr 2008 13:47:48
Message: <47fbafc4$1@news.povray.org>
Warp wrote:
> It's not really so much a fault of C++ per se as a fault of the
> system library tasked with memory management.

Yah.  It *can* be a fault of the standard for a language, tho. For 
example, doesn't C++ require that "delete" and "free()" basically do 
the same things to the memory allocation? In the sense that "new" isn't 
allowed to look at the type and (say) over-allocate a block and do its 
own sub-block management?  I.e., in the sense that the standard 
library isn't allowed to do some of the stuff you might do 
yourself by overriding "new"?
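
For illustration, a class-scoped "new" of the kind being talked about
might look like this sketch (all names here are made up, and it assumes
C++17; a global allocator can't do this per-type carving, but user code
can):

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical example: a class whose class-level operator new hands out
// fixed-size slots carved from larger chunks -- type-aware sub-block
// management that the global allocator is not in a position to do.
class Node {
public:
    int value = 0;

    static void* operator new(std::size_t) {
        if (free_head == nullptr) refill();   // carve a new chunk when empty
        void* slot = free_head;
        free_head = free_head->next;
        return slot;
    }

    static void operator delete(void* p) noexcept {
        auto* s = static_cast<FreeSlot*>(p);  // push the slot back, O(1)
        s->next = free_head;
        free_head = s;
    }

private:
    struct FreeSlot { FreeSlot* next; };

    static void refill() {
        constexpr std::size_t kCount = 256;
        constexpr std::size_t kSlot =
            sizeof(Node) > sizeof(FreeSlot) ? sizeof(Node) : sizeof(FreeSlot);
        auto* raw = static_cast<unsigned char*>(::operator new(kSlot * kCount));
        chunks.push_back(raw);   // chunks are never freed in this sketch
        for (std::size_t i = 0; i < kCount; ++i) {
            // Thread every slot of the fresh chunk onto the free list.
            auto* s = reinterpret_cast<FreeSlot*>(raw + i * kSlot);
            s->next = free_head;
            free_head = s;
        }
    }

    static inline FreeSlot* free_head = nullptr;
    static inline std::vector<void*> chunks;
};
```

A freed slot goes straight back onto the free list, so the very next
"new Node" reuses it without any search.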

>   Using a more efficient memory allocator can cause amazing speedups, as
> I wrote on that page.

That's one of the ways GC can be faster than manual memory management. 
Many, many memory management libraries really suck bigtime, while a 
tremendous amount of time is spent making GC fast.

-- 
   Darren New / San Diego, CA, USA (PST)
     "That's pretty. Where's that?"
          "It's the Age of Channelwood."
     "We should go there on vacation some time."



From: Warp
Subject: Re: One thing which puzzles me about unix/linux
Date: 8 Apr 2008 14:55:49
Message: <47fbbfb5@news.povray.org>
Darren New <dne### [at] sanrrcom> wrote:
> >   Using a more efficient memory allocator can cause amazing speedups, as
> > I wrote on that page.

> That's one of the ways GC can be faster than manual memory management. 
> Many, many memory management libraries really suck bigtime, while a 
> tremendous amount of time is spent making GC fast.

  I don't see how GC would help that at all.

  The problem with the default allocator is that it has to be prepared to
allocate blocks of *any* size, and it should do it as memory-efficiently
as possible (ie. wasting as little memory as possible). This makes it
pretty complicated: it needs quite a lot of bookkeeping data, and keeping
track of free blocks of memory is nontrivial.
  This is even further complicated by the fact that the default memory
allocator must be thread-safe. This means that if two threads try to
allocate memory at the same time, the allocator must use mutual exclusion.
Mutual exclusion locks cause considerable overhead. In single-threaded
programs this overhead is useless, but there isn't an easy way around it.

  Why would GC help this? Even GC'd memory must be prepared to allocate
blocks of any size, and it must be thread-safe. I don't think GC would
help these problems.

  If I'm not completely mistaken, the major reasons why my allocator is
so much faster than the default allocator in libc are that it uses a
lot less bookkeeping data (because it only allocates blocks of a fixed
size), the bookkeeping is much simpler and a lot faster to manage, and
my allocator is not thread-safe (which can be a big minus in some cases,
of course).

-- 
                                                          - Warp



From: Darren New
Subject: Re: One thing which puzzles me about unix/linux
Date: 8 Apr 2008 21:52:50
Message: <47fc2172$1@news.povray.org>
Warp wrote:

>   The problem with the default allocator is that it has to be prepared to
> allocate blocks of *any* size, and it should do it as memory-efficiently
> as possible (ie. wasting as little memory as possible).

I suspect this latter part is what makes it slower than it could be. Not 
that it's bad. It's just a trade-off.

>   This is even further complicated by the fact that the default memory
> allocator must be thread-safe.

Yeah. If you're willing to over-allocate, you can reduce the contention.

> Mutual exclusion locks cause considerable overhead. In single-threaded
> programs this overhead is useless, but there isn't an easy way around it.

Linking with a different allocator that doesn't try to lock anything? 
:-) Of course, you'd have to *tell* it you're single-threaded. And of 
course, threads aren't part of the language (are they?) while memory 
allocation is, so you're kind of hosed right there.

In Ada, for example, you could have a different heap for each thread, so 
you don't get that kind of lock contention, because Ada knows about 
threads. (And IIRC, the language rules forbid returning a pointer 
allocated in one thread and using it in another, which I guess is the 
bit that actually helps.)
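
The per-thread-heap idea can be sketched in C++ with thread_local (all
names here are illustrative, not Ada or any real runtime): each thread
gets a private arena, so the allocation fast path takes no lock at all.
Cross-thread frees simply aren't supported, mirroring the
"don't hand pointers between threads" rule.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative per-thread heap: every thread calling per_thread_alloc()
// gets its own Arena, so no mutex is needed to allocate.
class Arena {
public:
    explicit Arena(std::size_t bytes = 1 << 16) : buf_(bytes) {}

    void* allocate(std::size_t n) {
        n = (n + 15) & ~std::size_t{15};          // keep 16-byte alignment
        if (top_ + n > buf_.size()) return nullptr;
        void* p = buf_.data() + top_;
        top_ += n;
        return p;
    }

private:
    std::vector<unsigned char> buf_;
    std::size_t top_ = 0;
};

void* per_thread_alloc(std::size_t n) {
    thread_local Arena arena;   // one private arena per thread, no locking
    return arena.allocate(n);
}
```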

>   Why would GC help this? Even GC'd memory must be prepared to allocate
> blocks of any size, and it must be thread-safe. I don't think GC would
> help these problems.

Allocations can be much faster, because you don't go looking for free 
space until you run out of space.  If you tend to allocate a bunch of 
stuff, then free almost all of it, assuming you don't have 
finalizers/destructors/whateveryoucallthem to worry about, you can clean 
up lots of memory without ever looking at it. (This is not uncommon in 
languages where everything is actually a heap-based object, like 
Smalltalk. You can measure down in the single-digit-percentages of 
objects that last more than a couple of seconds on such systems. Of 
course, in a hybrid system like C++ or Ada or something, one probably 
makes much less garbage because most of the temporary stuff *is* on the 
stack.)

So, no running thru free chains to find memory, and no need to touch 
free memory when you free it, are the primary sources of speed. I.e., 
except for the GC itself, it's much simpler code than malloc and free.
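
Those two fast paths (bump allocation, and reclaiming everything dead
without touching it) can be sketched like this; the names and sizes are
made up, and a real collector would evacuate the few survivors before
resetting:

```cpp
#include <cassert>
#include <cstddef>

// Sketch of a GC nursery's fast path: allocation just bumps a pointer
// (no free-chain search), and "freeing" every dead object at once is a
// single assignment (no per-object free).
struct Region {
    static constexpr std::size_t kSize = 1 << 16;
    unsigned char heap[kSize];
    std::size_t top = 0;

    void* allocate(std::size_t n) {
        n = (n + 15) & ~std::size_t{15};       // keep 16-byte alignment
        if (top + n > kSize) return nullptr;   // a collection would run here
        void* p = heap + top;
        top += n;
        return p;
    }

    void reset() { top = 0; }   // free everything without ever looking at it
};
```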

Erlang, for example, has a bunch of different algorithms and a bunch of 
different allocators it switches between on the fly, depending on the 
behavior of your programs. And if I understand correctly, it'll use 
different allocators for different threads in the same process. Since 
threads don't share heaps (at least, not the fast-changing heaps) the 
threading problem isn't there. (They do apparently share heaps of large 
binary objects, as well as the "in-memory database" tables, which are 
also GCed.) And apparently it's too expensive to actually track which 
executable code is pointed to from the stack, so you can only have two 
versions of the same code at the same time, and if you purge the old 
version, any process running the old version gets aborted first. But 
otherwise, you could just use the return addresses to keep the 
executable code live, except for the inefficiency of that.

So Erlang winds up with stuff like this,
http://erlang.org/doc/man/erts_alloc.html
because it can tell what type it's allocating for. :-) Which is why I 
thought it's a shame the C++ standard says it's essentially not allowed 
to take advantage of type information in "new". Of course, Erlang is 
instrumented out the wazoo, because it's designed for systems you never 
ever turn off, so you really do want to get messages if some program 
starts using too much memory.

>   If I'm not completely mistaken, the major reasons why my allocator is
> so much faster than the default allocator 

Please understand that I wasn't commenting specifically on your 
allocator. Earlier you asked some stuff about GC, so I thought I'd just 
mention some generic GC wisdom here. Nothing in this thread should 
specifically be about any particular implementation of GC or manual 
allocators.

-- 
   Darren New / San Diego, CA, USA (PST)
     "That's pretty. Where's that?"
          "It's the Age of Channelwood."
     "We should go there on vacation some time."



From: Warp
Subject: Re: One thing which puzzles me about unix/linux
Date: 9 Apr 2008 00:47:12
Message: <47fc4a50@news.povray.org>
Darren New <dne### [at] sanrrcom> wrote:
> >   Why would GC help this? Even GC'd memory must be prepared to allocate
> > blocks of any size, and it must be thread-safe. I don't think GC would
> > help these problems.

> Allocations can be much faster, because you don't go looking for free 
> space until you run out of space.

  So GC is basically a selfish and greedy memory allocation engine which
cares only about itself, not anything else in the same system which might
have some use for that wasted memory.
  (Not that this would be anything new to me.)

  Doesn't this also mean that the GC runs a lot more often than would be
strictly necessary? Isn't that kind of counter-productive? Not only is
memory wasted (and thus away from other processes), but part of the speed
benefits in doing so are lost by having to run GC sweeps more often.

-- 
                                                          - Warp



From: Darren New
Subject: Re: One thing which puzzles me about unix/linux
Date: 9 Apr 2008 12:37:59
Message: <47fcf0e7$1@news.povray.org>
Warp wrote:
>   So GC is basically a selfish and greedy memory allocation engine which
> cares only about itself, not anything else in the same system which might
> have some use for that wasted memory.

You could say it's a time-space trade-off.  You know, I almost never 
have a situation where I have two programs running and I'm paging 
because of it. :-)

>   Doesn't this also mean that the GC runs a lot more often than would be
> strictly necessary? 

Assuming you mean "allocating everything on the heap" when you say 
"this"....

> Isn't that kind of counter-productive? 

Only if your language has a stack and isn't functional and is 
single-threaded, I suppose. If your language definition is that the 
stack frames are stored on the heap (for, say, closure purposes), or you 
need to allocate a separate stack of some size for each thread anyway, 
or you tend to overwrite the same memory address with lots of different 
things at different times (unlike, say, LISP, where you tend to build 
new structures reusing parts of old ones instead of overwriting what you 
already have), then allocating lots of short-lived stuff on the heap 
makes sense.

Of course, systems like Smalltalk *do* really and truly have a stack 
(now), exactly for performance reasons. It just doesn't look that way, 
because it'll copy stack frames into the heap if you try to access them 
as "objects".  (Then again, Smalltalk didn't have conditional 
statements, either, in the semantics of the language. Just messages sent 
to booleans along with a couple blocks of text, and a method dispatch 
that evaluated the first block in the True class and the second block in 
the False class, which I thought was pretty cool. :-)
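
That boolean-dispatch scheme can be mimicked as a sketch in C++
(illustrative names, not real Smalltalk classes): the condition is an
object, and the "if" is a virtual call that runs one of two blocks.

```cpp
#include <cassert>
#include <functional>

// Smalltalk-style conditional as method dispatch: True evaluates the
// first block, False the second; there is no "if" statement anywhere.
struct Boolean {
    virtual ~Boolean() = default;
    virtual int ifTrueIfFalse(std::function<int()> t,
                              std::function<int()> f) const = 0;
};

struct True : Boolean {
    int ifTrueIfFalse(std::function<int()> t,
                      std::function<int()>) const override { return t(); }
};

struct False : Boolean {
    int ifTrueIfFalse(std::function<int()>,
                      std::function<int()> f) const override { return f(); }
};
```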

> Not only is memory wasted (and thus away from other processes),

Remember that the original GC stuff was done on systems (Lisp, 
Smalltalk, etc) where the interpreter *was* the OS, so "taking away from 
other processes" doesn't always make sense.

But yes, some memory is "wasted" for some meaning of the word "wasted". 
Not unlike stack space is "wasted" if you're not about to overflow the 
stack.

It's also "wasted" if you are willing to spend more time on expert 
programmers who never make a mistake and hence never corrupt memory, 
rather than spending more money to buy some RAM. And indeed, when RAM is 
expensive (like in embedded devices), one tends not to see too many 
GC'ed systems.

And with some OS help, you can actually get pretty decent performance 
during collections. The ability to mark pages "manually" to say which 
are more likely to be needed soon, the ability to trap writes to 
particular pages so you can mark them as "the GC will need to examine 
this on the next sweep", the ability to pre-fetch paged-out pages while 
you're processing this one (i.e., read-ahead for page faults), etc.

> but part of the speed
> benefits in doing so are lost by having to run GC sweeps more often.

Running GCs too often is indeed a performance-limiting problem. That's 
one of the reasons that people writing collectors spend so much time 
trying to make them fast, and why there are so many different types of 
collectors (compared to, say, only a couple different kinds of stacks).

-- 
   Darren New / San Diego, CA, USA (PST)
     "That's pretty. Where's that?"
          "It's the Age of Channelwood."
     "We should go there on vacation some time."



From: Darren New
Subject: Re: One thing which puzzles me about unix/linux
Date: 9 Apr 2008 12:47:27
Message: <47fcf31f$1@news.povray.org>
Darren New wrote:
>> Isn't that kind of counter-productive? 
> 
> Only if your language has a stack and isn't functional and is 
> single-threaded, I suppose.

And yes, I agree, Java in this respect is kind of suckful, 
efficiency-wise. I think that's at least as much because they're trying 
to be portable and to support dynamic loading.

-- 
   Darren New / San Diego, CA, USA (PST)
     "That's pretty. Where's that?"
          "It's the Age of Channelwood."
     "We should go there on vacation some time."




Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.