POV-Ray : Newsgroups : povray.off-topic : Question about garbage collection (not a flame) Server Time
4 Nov 2024 17:37:21 EST (-0500)
  Question about garbage collection (not a flame) (Message 1 to 10 of 24)  
Goto Latest 10 Messages Next 10 Messages >>>
From: Warp
Subject: Question about garbage collection (not a flame)
Date: 30 Mar 2008 08:51:43
Message: <47ef9aee@news.povray.org>
Suppose that we have a dynamically allocated object A, which has as
member a reference to a dynamically allocated object B (allocated by A),
which has a reference to a dynamically allocated object C (allocated by B),
and so on. That is, a long chain of objects, each allocating and owning the
next one.

  Now the reference to A in the actual program code dies, so A is not owned
by anything anymore. The GC engine kicks in at some point. Does it free all
those objects, or does it free A only (and when it's run again at some point
in the future, it frees B only, etc)?

  I assume that it frees all the objects, not just A, but how does it do
it efficiently? (I assume this because else in a circular ownership
situation it would never free any of the objects.)
  Assume that this chain of objects is very large, consisting of thousands
of objects.

-- 
                                                          - Warp


Post a reply to this message

From: Orchid XP v8
Subject: Re: Question about garbage collection (not a flame)
Date: 30 Mar 2008 10:03:00
Message: <47efaba4$1@news.povray.org>
Warp wrote:

>   I assume that it frees all the objects, not just A, but how does it do
> it efficiently? (I assume this because else in a circular ownership
> situation it would never free any of the objects.)
>   Assume that this chain of objects is very large, consisting of thousands
> of objects.

The simplest (non-broken) GC implementation is the mark/sweep type.

That is, you mark all objects as "dead". Then, starting from a set of 
"roots" (typically what's referenced from the execution stack, global 
variables, etc.), you mark each object you touch as "alive", and then 
recurse over all the objects it references. (And if any of those are 
already marked as live, you've obviously "been here" before, so ignore 
that path this time around.)

When you've followed every reference in the system and marked everything 
as live, all objects not marked as live must in fact be dead. These can 
now be deleted.

(This is the theory. In practice, it's common for a compacting GC to 
actually *move* each object as it finds it live, and then simply 
deallocate the space used by the remaining objects. Various other 
optimisations are possible too.)

Therefore, if nobody references A, then A will never be marked live, and 
none of the objects it links to will be marked live either. So a single 
pass of the GC will elide them all.

Trouble with mark/sweep is that you have to stop the program executing 
while you do your sweep, or some object X might be unlinked from one 
place and then relinked from somewhere else, and the GC might scan past 
just as this happens, resulting in an object that's actually live being 
marked as dead and deallocated.

Obviously, stopping your entire program (all parallel threads!) so you 
can run the GC isn't a fantastic idea. So various more sophisticated 
schemes have been invented (e.g., generational GC). In any case, for 
most designs of GC, the entire chain of objects will be reclaimed in a 
single pass of the GC. (Although it might be the next "major" rather 
than "minor" pass or something, depending on the exact design details.)

[Note that I do not consider "reference counting" to be a valid GC 
algorithm, since it doesn't actually work properly...]


Post a reply to this message

From: John VanSickle
Subject: Re: Question about garbage collection (not a flame)
Date: 30 Mar 2008 11:37:48
Message: <47efc1dc@news.povray.org>
Orchid XP v8 wrote:
> Obviously, stopping your entire program (all parallel threads!) so you 
> can run the GC isn't a fantastic idea. So various more sophisticated 
> schemes have been invented (e.g., generational GC). In any case, for 
> most designs of GC, the entire chain of objects will be reclaimed in a 
> single pass of the GC. (Although it might be the next "major" rather 
> than "minor" pass or something, depending on the exact design details.)

I could imagine that there is an optimization available when individual 
references go out of scope (although simply deallocating those objects 
should suffice), and in a language that checks these things, there may 
be opportunities whenever a reference is changed.

Not a full-blown check of every dynamic object (which would hurt 
performance), but a limited check.  But I'll leave the details to 
whoever implements it.

In a language that doesn't have built-in GC, properly encapsulating 
references, and properly building destructors, could do most of the GC 
work; but then the point of GC is to allow the programmer to forget 
these things.  But at least I have something to consider if I ever do a 
ground-upwards rewrite of my modeler (which is in C++).

Regards,
John


Post a reply to this message

From: Darren New
Subject: Re: Question about garbage collection (not a flame)
Date: 30 Mar 2008 14:56:28
Message: <47eff06c$1@news.povray.org>
Orchid XP v8 wrote:
> Trouble with mark/sweep is that you have to stop the program executing 
> while you do your sweep, 

The other "problem" is that the process is recursive just as you've run 
out of memory.  You have to be clever about walking through the data 
structures in a way that uses the pointers you're following as stack 
space.  Which, of course, means it's difficult to run the GC in parallel 
with the main processing. :-)

On the other hand, calling malloc() or free() from two separate threads 
in the same heap is likely problematic too, unless you're clever about it.

The way the original Smalltalk-80 worked was to have a 5-bit reference 
count in each object. Things got collected when the reference count went 
to zero. If the reference count went >30, it never got collected until 
you did a mark-sweep, which indeed stopped the whole program.

Note too that reference counting can lock up a program just as bad as 
mark-sweep does, if done foolishly. Ever have a program where exiting it 
made the disk thrash wildly?  Probably using a library that freed 
everything you allocated before exiting.  Think how long you'll thrash 
cleaning up (say) a level of a game with reference counting when you 
dispose the last pointer to a few hundred meg of AI, textures, geometry, 
etc.

> Obviously, stopping your entire program (all parallel threads!) so you 
> can run the GC isn't a fantastic idea.

Erlang has one heap per process, where a program not uncommonly has tens 
of thousands of "processes", not unlike a thread but much lower 
overhead.  (Managed co-routines, perhaps?)

> most designs of GC, the entire chain of objects will be reclaimed in a 
> single pass of the GC. (Although it might be the next "major" rather 
> than "minor" pass or something, depending on the exact design details.)

Yep.

A lot of the newer languages designed to run on massively-parallel 
machines (distributed or not) basically have no aliased values, so GC 
becomes a lot easier when you can only have one pointer. At that point, 
reference counting (or destroying when it goes out of scope) becomes a 
lot easier to design in.

-- 
   Darren New / San Diego, CA, USA (PST)
     "That's pretty. Where's that?"
          "It's the Age of Channelwood."
     "We should go there on vacation some time."


Post a reply to this message

From: Warp
Subject: Re: Question about garbage collection (not a flame)
Date: 31 Mar 2008 01:20:11
Message: <47f0829a@news.povray.org>
Orchid XP v8 <voi### [at] devnull> wrote:
> Warp wrote:

> >   I assume that it frees all the objects, not just A, but how does it do
> > it efficiently?

> The simplest (non-broken) GC implementation is the mark/sweep type.

  That really didn't answer my question. I didn't ask for the simplest
solution.

> [Note that I do not consider "reference counting" to be a valid GC 
> algorithm, since it doesn't actually work properly...]

  Yet many languages use it, such as for example PHP and Objective C.

  It just sounds to me that people object about reference counting because
in theory it's problematic, but some people just go ahead and use it anyways
without problems.

  (Yes, I know that in PHP you can actually get circular references which
get never freed, and that's a known problem, but it doesn't seem to cause
too many problems in practice nevertheless, seeing how popular PHP is.
Circular references seem to be a bit like goto: You can come up with all
kinds of situations where it's "necessary", but in practice it happens
rarely if at all.)

-- 
                                                          - Warp


Post a reply to this message

From: Darren New
Subject: Re: Question about garbage collection (not a flame)
Date: 31 Mar 2008 11:33:52
Message: <47f11270$1@news.povray.org>
Warp wrote:
>   That really didn't answer my question. I didn't ask for the simplest
> solution.

Well, you're asking about a very complex field that's been evolving for 
30 years. And the GC for each language is different.

>> [Note that I do not consider "reference counting" to be a valid GC 
>> algorithm, since it doesn't actually work properly...]
> 
>   Yet many languages use it, such as for example PHP and Objective C.

I'm not sure PHP lets you have a pointer to another object visible at 
the user layer. If you do
   $x['hello'] = $y
as far as the user is concerned, $x['hello'] and $y are two different 
values. So $y['hello'] = $y doesn't form a circular reference, as far as 
I can remember.  I'll try it out as soon as I boot over to Linux.

>   It just sounds to me that people object about reference counting because
> in theory it's problematic, but some people just go ahead and use it anyways
> without problems.

Not without problems. Firefox, for example, is redoing the entire 
plug-in memory management thing because it's too hard for extension 
writers to get right.

>   (Yes, I know that in PHP you can actually get circular references which
> get never freed, and that's a known problem, but it doesn't seem to cause
> too many problems in practice nevertheless, seeing how popular PHP is.

PHP tends to run for one web page, then exit. That's part of the 
solution. Indeed, I ran across this, trying to use PHP in a loop to 
insert a few million records from a file into a database.

Since PHP doesn't actually have pointers, it's possible the runtime is 
doing things to patch up as well.

But saying "it's good enough, who cares if one out of 20 programs leaks 
memory" is a very PHP approach, yah.

> Circular references seem to be a bit like goto: You can come up with all
> kinds of situations where it's "necessary", but in practice it happens
> rarely if at all.)

I disagree. Doubly-linked lists, trees with pointers up the nodes, 
widgets with child widgets that have parent pointers, etc. Pretty much 
everything "OO" is good at winds up generating circular references.

-- 
   Darren New / San Diego, CA, USA (PST)
     "That's pretty. Where's that?"
          "It's the Age of Channelwood."
     "We should go there on vacation some time."


Post a reply to this message

From: Nicolas Alvarez
Subject: Re: Question about garbage collection (not a flame)
Date: 31 Mar 2008 11:41:05
Message: <47f11421$1@news.povray.org>

> I'm not sure PHP lets you have a pointer to another object visible at 
> the user layer. If you do
>   $x['hello'] = $y
> as far as the user is concerned, $x['hello'] and $y are two different 
> values. So $y['hello'] = $y doesn't form a circular reference, as far as 
> I can remember.  I'll try it out as soon as I boot over to Linux.

PHP has "references":

$a =& $b;

This behaves like Unix hardlinks. "$a and $b are completely equal here, 
that's not $a pointing to $b or vice versa, that's $a and $b pointing to 
the same place"


Post a reply to this message

From: Darren New
Subject: Re: Question about garbage collection (not a flame)
Date: 31 Mar 2008 11:46:51
Message: <47f1157b$1@news.povray.org>
Nicolas Alvarez wrote:
> This behaves like Unix hardlinks. "$a and $b are completely equal here, 
> that's not $a pointing to $b or vice versa, that's $a and $b pointing to 
> the same place"

Right. I suspect either they don't get GCed if you wind up with a 
reference to $B inside $B, or there's a mark/sweep mechanism that kicks 
in. Given my experiences, I'd expect it just leaks. :-)

-- 
   Darren New / San Diego, CA, USA (PST)
     "That's pretty. Where's that?"
          "It's the Age of Channelwood."
     "We should go there on vacation some time."


Post a reply to this message

From: Warp
Subject: Re: Question about garbage collection (not a flame)
Date: 31 Mar 2008 12:06:52
Message: <47f11a2b@news.povray.org>
Darren New <dne### [at] sanrrcom> wrote:
> > Circular references seem to be a bit like goto: You can come up with all
> > kinds of situations where it's "necessary", but in practice it happens
> > rarely if at all.)

> I disagree. Doubly-linked lists, trees with pointers up the nodes, 
> widgets with child widgets that have parent pointers, etc. Pretty much 
> everything "OO" is good at winds up generating circular references.

  Yet, as I said, languages like PHP and Objective C seem to use
reference counting without too many problems.

-- 
                                                          - Warp


Post a reply to this message

From: Warp
Subject: Re: Question about garbage collection (not a flame)
Date: 31 Mar 2008 12:08:48
Message: <47f11aa0@news.povray.org>
Darren New <dne### [at] sanrrcom> wrote:
> Right. I suspect either they don't get GCed if you wind up with a 
> reference to $B inside $B, or there's a mark/sweep mechanism that kicks 
> in. Given my experiences, I'd expect it just leaks. :-)

http://www.derickrethans.nl/circular_references.php

  It seems they are working on something called "cyclic tracing" to
alleviate the problem, whatever that may mean.

-- 
                                                          - Warp


Post a reply to this message

Goto Latest 10 Messages Next 10 Messages >>>

Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.