POV-Ray: Newsgroups: povray.off-topic: A min-heap

POV-Ray : Newsgroups : povray.off-topic : A min-heap		Server Time 1 Jul 2025 11:29:17 EDT (-0400)

<<< Previous 10 Messages

Goto Initial 10 Messages

From: Invisible
Subject: Re: A min-heap
Date: 15 Oct 2010 10:14:58
Message: <4cb861e2$1@news.povray.org>

>> In Java, if you ask for an array of Pixel objects (i.e., "Pixel[]"),
>> what you get is an array of *pointers* to Pixel objects.
>
>    I think that in Java it's not completely correct to call them "pointers".

OK, fair enough. My point was simply that you don't get an array of 
Pixel structures, you can an array of some kind of indirect reference to 
Pixel structures. (As you rightly point out, they might be actual 
pointers, or object table indices, or something else entirely...) My 
point was merely that there's (at least one) level of indirection present.

Post a reply to this message

From: Darren New
Subject: Re: A min-heap
Date: 15 Oct 2010 12:24:58
Message: <4cb8805a$1@news.povray.org>

Warp wrote:
> Invisible <voi### [at] devnull> wrote:
>> In Java, if you ask for an array of Pixel objects (i.e., "Pixel[]"), 
>> what you get is an array of *pointers* to Pixel objects.
> 
>   I think that in Java it's not completely correct to call them "pointers".

What Warp said is indeed the standard terminology in textbooks that talk 
about the stuff.

Basically, an "address" is a location indicator the hardware understands. 
(Phrased that way to make "disk address" and "virtual address" and "mapped 
address" all fall under the name "address".)

A "pointer" is an address with a type.  "int*" is a pointer, not an address.

A "reference" is an opaque pointer to an allocated object that may or may 
not be an address.

>   Moreover, in C and C++ objects cannot be moved in memory behind the scenes
> because there may be pointers pointing to them, and hence moving an object
> would break the pointers (and thus the entire program). In Java, however,
> objects can be moved in memory without breaking any existing references to
> them, if the references are not raw memory addresses but something smarter

This however is not *quite* true. C# and (I think) Java both store 
references as pointers internally, without any additional per-reference 
overhead. What lets objects be moved and GCed is the fact that the compiler 
has generated information about where the pointers are in each type and that 
information is reliable (i.e., no unions of references and integers, for 
example). One doesn't need extra indirections. One simply needs to know 
whether any given 4-byte field is a pointer or an integer, which can be 
deduced from information left for the runtime by the compiler, or basically 
what C++ might call RTTI.

The savings that C++ and C save are not storing pointer information for each 
class (and in C there really isn't any corresponding bit of info), not 
tracking auto pointers to the heap, and so on. I.e., there's less 
compile-time space used, and it lets C++ classes with no vtable not need 
extra info somewhere saying what bits are pointers.

The original versions of Smalltalk, however, did indeed use the extra 
indirection. You could have 2^15 objects at a time, even if you had 
megabytes of memory, and there was an extra table somewhere that held the 
run-time pointer from the index to the object location. (The odd-numbered 
pointers were for small integers, hence 2^15 instead of 2^16.)

-- 
Darren New, San Diego CA, USA (PST)
   Serving Suggestion:
     "Don't serve this any more. It's awful."

Post a reply to this message

From: Invisible
Subject: Re: A min-heap
Date: 18 Oct 2010 04:33:25
Message: <4cbc0655@news.povray.org>

>> Moreover, in C and C++ objects cannot be moved in memory behind the
>> scenes
>> because there may be pointers pointing to them, and hence moving an
>> object
>> would break the pointers (and thus the entire program). In Java, however,
>> objects can be moved in memory without breaking any existing
>> references to
>> them, if the references are not raw memory addresses but something
>> smarter
>
> This however is not *quite* true. C# and (I think) Java both store
> references as pointers internally, without any additional per-reference
> overhead. What lets objects be moved and GCed is the fact that the
> compiler has generated information about where the pointers are in each
> type and that information is reliable (i.e., no unions of references and
> integers, for example). One doesn't need extra indirections. One simply
> needs to know whether any given 4-byte field is a pointer or an integer,
> which can be deduced from information left for the runtime by the
> compiler, or basically what C++ might call RTTI.
>
> The original versions of Smalltalk, however, did indeed use the extra
> indirection. You could have 2^15 objects at a time, even if you had
> megabytes of memory, and there was an extra table somewhere that held
> the run-time pointer from the index to the object location. (The
> odd-numbered pointers were for small integers, hence 2^15 instead of 2^16.)

As I understand it, Smalltalk did it by having an "object table". Each 
reference to an object is just an object ID - an index into the object 
table. The object table then stores the pointers to the actual objects. 
That means you can easily move the objects around so long as you update 
the object table. (And you can potentially store additional information 
in the object table too, for example to help the GC.) The down-side is 
an extra level of indirection.

I think before you can talk about what "Java" does, you'd have to 
specify which Java implementation you mean; presumably each JVM can 
implement this stuff any way they choose.

As far as I understand it, what GHC does is this: Each object reference 
is a machine address. When the GC moves everything, it updates all the 
references. (I think that's what Darren is trying to say that Java does.)

The slight disadvantage is that this means that the GC can only be run 
at certain moments. (E.g., when the mutator *isn't* in the middle of 
doing something with a particular reference.) And since GHC has the 
limitation that all Haskell threads must be paused before the GC can 
run, that can sometimes mean that one rogue thread can take forever to 
reach a "safe point" where it can be paused, causing all CPU cores 
except one to sit idle while we wait for a GC cycle.

As you can imagine, this is an efficient way to destroy parallel 
performance. The GHC team are working on a concurrent GC engine that 
doesn't require the mutator to be paused before it can run, which should 
fix this particular issue. (Don't hold your breath though...)

Post a reply to this message

From: Darren New
Subject: Re: A min-heap
Date: 18 Oct 2010 11:29:20
Message: <4cbc67d0$1@news.povray.org>

Invisible wrote:
> As I understand it, Smalltalk did it by having an "object table". Each 
> reference to an object is just an object ID - an index into the object 
> table. The object table then stores the pointers to the actual objects. 
> That means you can easily move the objects around so long as you update 
> the object table. (And you can potentially store additional information 
> in the object table too, for example to help the GC.) The down-side is 
> an extra level of indirection.

Yes. I'm pretty sure the GC info was in the objects, tho.  Having the object 
table also made "become:" quite inexpensive. ( "A become: B" would swap 
every reference to A to be one to B and vice versa. This was important in a 
system where you could change code while things were running and you wanted 
the new class declaration to replace the old one. It was also used to resize 
arrays, which therefore didn't need the extra overhead of a hidden pointer 
pointing to the "real" array.)

> I think before you can talk about what "Java" does, you'd have to 
> specify which Java implementation you mean; presumably each JVM can 
> implement this stuff any way they choose.

Yes, that's the problem. Altho they can't really add CoW simply because 
nobody would take advantage of it by making a bunch of duplicate references 
that they didn't expect to share state. Nor to make strings mutable.

> As far as I understand it, what GHC does is this: Each object reference 
> is a machine address. When the GC moves everything, it updates all the 
> references. (I think that's what Darren is trying to say that Java does.)

Yes. Or at least C#.

-- 
Darren New, San Diego CA, USA (PST)
   Serving Suggestion:
     "Don't serve this any more. It's awful."

Post a reply to this message

From: Orchid XP v8
Subject: Re: A min-heap
Date: 18 Oct 2010 16:44:08
Message: <4cbcb198@news.povray.org>

On 18/10/2010 04:29 PM, Darren New wrote:
> Invisible wrote:
>> As I understand it, Smalltalk did it by having an "object table". Each
>> reference to an object is just an object ID - an index into the object
>> table. The object table then stores the pointers to the actual
>> objects. That means you can easily move the objects around so long as
>> you update the object table. (And you can potentially store additional
>> information in the object table too, for example to help the GC.) The
>> down-side is an extra level of indirection.
>
> Yes. I'm pretty sure the GC info was in the objects, tho.

Yeah, you might be right about that. I didn't look too closely. (I did 
however spend an evening reading through all the GC documentation to see 
how it has different areas for young objects, old objects, large 
objects, persistent objects, object objects... It all seemed quite 
sophisticated.)

> Having the
> object table also made "become:" quite inexpensive.

The thing that occurs to me is that it could potentially make it easy to 
move objects to another address space, hell maybe even a different 
physical machine. I've always thought that would be quite cool. (Of 
course, as soon as you do this, you need to decide where the "best 
place" for each object actually is.)

-- 
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*

Post a reply to this message

From: Darren New
Subject: Re: A min-heap
Date: 18 Oct 2010 17:59:13
Message: <4cbcc331$1@news.povray.org>

Orchid XP v8 wrote:
> Yeah, you might be right about that. I didn't look too closely. (I did 
> however spend an evening reading through all the GC documentation to see 
> how it has different areas for young objects, old objects, large 
> objects, persistent objects, object objects... It all seemed quite 
> sophisticated.)

That was C#. I don't think Smalltalk-80 had any of that stuff. GC gets 
really complicated when you start throwing in destructors, and even more so 
when you start letting the user arbitrarily pin down bits of heap-allocated 
memory and passing it to languages that use non-collectable pointers.

> move objects to another address space, hell maybe even a different 
> physical machine. I've always thought that would be quite cool. 

Yes. There were versions of Smalltalk that used library code to implement 
demand-paging of objects, just catching the DoesNotUnderstand message to 
page back in the object and re-issue the request.  Paging out was creating a 
small proxy and then become: on the proxy.

Lots of cool stuff you could do that way. Change the class declaration while 
the program is running and suddenly all the instances have the new layout, etc.

-- 
Darren New, San Diego CA, USA (PST)
   Serving Suggestion:
     "Don't serve this any more. It's awful."

Post a reply to this message

<<< Previous 10 Messages

Goto Initial 10 Messages