Newsgroups: povray.off-topic
Subject: Re: My hypothesis
From: Invisible
Date: 6 Sep 2011 05:29:47
Message: <4e65e80b$1@news.povray.org>
On 05/09/2011 07:45 PM, Warp wrote:
> Orchid XP v8<voi### [at] devnull>  wrote:
>>     insert :: Ord x =>  x ->  MinHeap x ->  MinHeap x
>>     insert x' (Leaf        ) = Node x' Leaf Leaf
>>     insert x' (Node x h0 h1) = Node (min x x') (insert (max x x') h1) h0
>
>    This is a good example of where it becomes confusing. Even after
> studying it for a few minutes I can't really figure out what is it
> that it's doing.

The first line is a type signature. It says

   "For any type x which has ordering operations, this function takes an 
x and a MinHeap x and returns a MinHeap x."

The next two lines define the function. Each line conceptually means "if 
the function call looks like the thing on the left, the result looks 
like the thing on the right".

Therefore, if I were to write

   insert 5 Leaf

it should be just as if I had written

   Node 5 Leaf Leaf

The general rule of Haskell is that any name that begins with a *lower 
case* letter is a variable name. (This isn't merely a convention, it is 
formally part of the actual language syntax.) This is why "x" is a 
variable name, but "Leaf" is not.

(If you're paying attention, you'll notice that "insert" is a variable 
name by these rules. In Haskell, a named function is simply a named 
variable whose value just happens to be a function. Which is kinda 
weird, but consistent.)
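For completeness, here's the declaration all of this assumes. (I didn't show it above, but given the patterns "Leaf" and "Node x h0 h1", it has to look something like this.)

```haskell
-- A heap is either an empty Leaf, or a Node holding one datum
-- and two child heaps. Leaf and Node are the "constructors".
data MinHeap x = Leaf | Node x (MinHeap x) (MinHeap x)
  deriving (Show, Eq)
```

The capitalised names on the left of "=" in a function definition must be constructors declared like this; that's how the compiler knows "Leaf" is a pattern to match rather than a variable to bind.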

So, textually, insert 5 Leaf means Node 5 Leaf Leaf. Operationally, what 
does that actually /mean/? Well, in more typical syntax it would be 
something like

   MinHeap insert(x')
   {
     MinHeap out = new MinHeap_Node();
     out.datum = x';
     out.child0 = Leaf.singleton;
     out.child1 = Leaf.singleton;
     return out;
   }

(Since "Leaf" has no data fields, it is essentially a global constant.) 
In other words, we're constructing a new Node object and setting it to 
point to the new datum we're inserting, and setting both of its children 
to be the empty heap.

So how about that last line?

   insert x' (Node x h0 h1) = ...

So if you're inserting into a heap that /isn't/ empty, then

   x' is the thing you want to insert.
   x is the datum in the root node of the heap.
   h0 is the left child of the root node.
   h1 is the right child of the root node.

Given that, the right-hand side says

   ... = Node (min x x') (insert (max x x') h1) h0

That parses basically as

   ... = new Node(min(x, x'), insert(max(x, x'), h1), h0);

In other words, create a new heap node, where the datum is x or x', 
whichever is lowest. The left child is the result of running 
insert(max(x, x'), h1) - in other words, recursively insert x or x' 
(whichever is highest) into h1. The right child is h0.

So, we compare the existing root datum to the new datum, and keep the 
lowest one in the new root node. We then recursively insert the other 
datum into one of the child heaps.

Notice carefully how h0 and h1 get swapped around. This keeps the heap 
balanced. (If you stare at delete_top, you'll see it also swaps the 
children around - and deletes from the other side.) I haven't formally 
proven it, but I /think/ this arrangement keeps the tree balanced for 
all valid operation sequences. Certainly it does if you repeatedly 
insert items; I'm not 100% sure about interleaved inserts and deletes.
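For comparison, here's one standard way to do deletion in this style (this is a skew-heap-style merge, which is *not* necessarily the delete_top I'm referring to above, just the same general idea): throw away the root, then merge the two children. Note how merge also swaps children as it recurses, which is what keeps things balanced (amortised, at least).

```haskell
data MinHeap x = Leaf | Node x (MinHeap x) (MinHeap x)
  deriving Show

-- Merge two heaps: the smaller root wins, and we recurse down
-- one side while swapping the winner's children.
merge :: Ord x => MinHeap x -> MinHeap x -> MinHeap x
merge Leaf h = h
merge h Leaf = h
merge a@(Node x a0 a1) b@(Node y _ _)
  | x <= y    = Node x (merge a1 b) a0
  | otherwise = merge b a

-- Delete the minimum: drop the root, merge what's left.
deleteTop :: Ord x => MinHeap x -> MinHeap x
deleteTop Leaf           = Leaf
deleteTop (Node _ h0 h1) = merge h0 h1
```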



So really, there are two ways to interpret Haskell functions. You can view 
each one as a textual substitution. Given pencil and paper, you can 
expand out the work that each one does. For example:

   insert 1 (insert 2 (insert 3 (insert 4 empty)))
   insert 1 (insert 2 (insert 3 (insert 4 Leaf))) [Because empty = Leaf]
   insert 1 (insert 2 (insert 3 (Node 4 Leaf Leaf)))
   insert 1 (insert 2 (Node 3 (insert 4 Leaf) Leaf))
   insert 1 (insert 2 (Node 3 (Node 4 Leaf Leaf) Leaf))
   insert 1 (Node 2 (insert 3 Leaf) (Node 4 Leaf Leaf))
   insert 1 (Node 2 (Node 3 Leaf Leaf) (Node 4 Leaf Leaf))
   Node 1 (insert 2 (Node 4 Leaf Leaf)) (Node 3 Leaf Leaf)
   Node 1 (Node 2 (insert 4 Leaf) Leaf) (Node 3 Leaf Leaf)
   Node 1 (Node 2 (Node 4 Leaf Leaf) Leaf) (Node 3 Leaf Leaf)
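If you don't fancy doing that expansion by hand, the whole thing compiles and runs as-is:

```haskell
-- The complete program: the type, insert exactly as above, and the
-- starting expression from the pencil-and-paper expansion.
data MinHeap x = Leaf | Node x (MinHeap x) (MinHeap x)
  deriving (Show, Eq)

insert :: Ord x => x -> MinHeap x -> MinHeap x
insert x' Leaf           = Node x' Leaf Leaf
insert x' (Node x h0 h1) = Node (min x x') (insert (max x x') h1) h0

demo :: MinHeap Int
demo = insert 1 (insert 2 (insert 3 (insert 4 Leaf)))

main :: IO ()
main = print demo
-- prints: Node 1 (Node 2 (Node 4 Leaf Leaf) Leaf) (Node 3 Leaf Leaf)
```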

It's not hard to throw together a little program that shunts text around 
like this. (Although the brackets are fiddly if you want to do it right.)

If you draw a little graph, the final expression there looks like

                     Node 1
         Node 2                 Node 3
     Node 4   Leaf           Leaf    Leaf
   Leaf Leaf

Which is a balanced tree. (As balanced as a tree containing 4 items can 
be, anyway.)
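You can check the balance claim mechanically, too. (The depth and size helpers here are my own additions, purely for measuring; they're not part of the heap interface.)

```haskell
data MinHeap x = Leaf | Node x (MinHeap x) (MinHeap x)

insert :: Ord x => x -> MinHeap x -> MinHeap x
insert x' Leaf           = Node x' Leaf Leaf
insert x' (Node x h0 h1) = Node (min x x') (insert (max x x') h1) h0

-- Helpers for measuring the tree, not part of the heap itself.
depth, size :: MinHeap x -> Int
depth Leaf           = 0
depth (Node _ h0 h1) = 1 + max (depth h0) (depth h1)
size  Leaf           = 0
size  (Node _ h0 h1) = 1 + size h0 + size h1

-- Insert 1..n and report (size, depth); depth should stay
-- logarithmic in n if the swapping trick really balances things.
check :: Int -> (Int, Int)
check n = let h = foldr insert Leaf [1 .. n] in (size h, depth h)
```

For instance, check 7 gives (7, 3): seven items in a tree three levels deep, which is as good as it gets.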

So textually, that's how to run Haskell. Operationally, the thing on the 
left checks which "constructor" is present (i.e., whether a node is a 
Leaf or a Node), and then binds variables to things. It then constructs 
the value described on the right.

It's not shown in my example, but the notation is quite powerful. It 
lets you do things like inspect multiple levels of the tree:

   rotate :: MinHeap x -> MinHeap x
   rotate (Node x1 (Node x2 c1 c2) c3) = Node x2 c1 (Node x1 c2 c3)

That's a tree rotation [which obviously isn't a valid thing to do to a 
min-heap tree, but you get the idea]. Every time "Node" appears on the 
left, it means "this part has to be a Node object". Every time it 
appears on the right, it means "please construct a new Node object".
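One caveat: as written, rotate only covers trees that actually have that shape; feed it a Leaf and it crashes with a pattern-match failure. A total version just needs a catch-all equation (patterns are tried top to bottom):

```haskell
data MinHeap x = Leaf | Node x (MinHeap x) (MinHeap x)
  deriving (Show, Eq)

-- The rotation from above, plus a catch-all equation: anything that
-- doesn't match the first pattern is returned unchanged.
rotate :: MinHeap x -> MinHeap x
rotate (Node x1 (Node x2 c1 c2) c3) = Node x2 c1 (Node x1 c2 c3)
rotate h                            = h
```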

Once you get over the notation, it's actually a very clear and succinct 
way to work. Consider the alternative:

   function rotate(root)
   {
     if (root.type == "node")
     {
       x1 = root.datum;
       c3 = root.child1;
       child = root.child0;
       if (child.type == "node")
       {
         x2 = child.datum;
         c1 = child.child0;
         c2 = child.child1;
         return new Node(x2, c1, new Node(x1, c2, c3));
       }
     }
   }

All that code to do the same thing as the 1-liner above. Not only is the 
Haskell version shorter, it's clearer: It says "my input must look like 
THIS, and then I will change it to look like THAT". You don't have to 
mentally execute a series of one-at-a-time baby steps to see what's 
happening. It's a bit like

   temp1 = 1 + 2;
   temp2 = 3 + 4;
   temp3 = temp1 * temp2;
   temp4 = temp3 ^ 2;

versus

   temp4 = ((1+2) * (3+4))^2;

The latter is not only shorter, but infinitely clearer.



...again, maybe I'm just delusional...

