POV-Ray : Newsgroups : povray.off-topic : I'm in the mood for monads : Living in a box Server Time
29 Jul 2024 04:16:16 EDT (-0400)
  Living in a box  
From: Orchid Win7 v1
Date: 21 Apr 2012 17:37:48
Message: <4f9328ac@news.povray.org>
Just for giggles, what would monads look like in C++?

Well, let's pretend for a minute that C++ has a Function<X, Y> type, 
which represents a function that takes an X as input, and produces a Y 
as output. (Yes, I realise that isn't the real syntax C++ uses. Just 
pretend, for a minute.)

Further, let's pretend that you can create an anonymous function just by 
writing

   int function(int x) {return 2*x;}

Again, this obviously doesn't actually work in C++, but it makes the 
example code much easier to write.

(Actually, C++ is one of only a few languages where you probably /can/ 
implement monads "for real". But this made-up syntax makes it easier to 
quickly explain what's happening.)

Listing #1:
   Type1 x = foo();
   Type2 y = bar(x);
   Type3 z = baz(y);
   return z;

This code needs little explanation. You can write the same thing in 
Haskell as

Listing #2:
   let
     x = foo
     y = bar x
     z = baz y
   in z

Now consider this seemingly identical code:

Listing #3:
   do
     x <- foo
     y <- bar x
     z <- baz y
     return z

It seems like listing #2 and listing #3 should both be identical, right? 
Well, no. If I translate listing #3 into C++, you'll see it's 
drastically different:

Listing #4:
   Box<Type1> box_x = foo();
   return Bind(box_x,
     Box<Type3> function(Type1 x)
     {
       Box<Type2> box_y = bar(x);
       return Bind(box_y,
         Box<Type3> function(Type2 y)
         {
           Box<Type3> box_z = baz(y);
           return Bind(box_z,
             Box<Type3> function(Type3 z) {return Return(z);}
           );
         }
       );
     }
   );

Clearly, this /looks/ totally different. If you can actually /follow/ it 
all, you'll see that every statement is now followed by a call to this 
Bind() function, taking the rest of the code block as an argument. 
Important things to notice:

* The return type of foo() has changed from Type1 to Box<Type1>. This 
means that the implementation of foo() has to be changed.

* The first argument to Bind() is a Box<Type1>, but the input to its 
second argument is a Type1.

The Box<> class "forms a monad" if and only if it provides the following 
functions:

   template <typename X>
   Box<X> Return(X);

   template <typename X, typename Y>
   Box<Y> Bind(Box<X>, function<X, Box<Y>>);

(In the interests of good OO design, these should probably be /class 
methods/ rather than stand-alone functions. After all, their 
implementation is completely different for each class that behaves as a 
monad. But I will keep them as normal functions for now...)

Notice how what the Return() function does is NOTHING LIKE what a return 
statement usually means. (!) Arguably this is a rather bad naming 
choice. [You'll also notice how, in this example, Return() is completely 
redundant. Taking it out wouldn't change a thing. I only put it there 
for completeness.]

It's not hard to figure out how to implement this:

   template <typename X>
   class Box
   {
     private:
       X Datum;
     public:
       Box(X in) {Datum = in;}
       X GetDatum() {return Datum;}
   };

   template <typename X>
   Box<X> Return(X in) {return Box<X>(in);}

   template <typename X, typename Y>
   Box<Y> Bind(Box<X> in, Function<X, Box<Y>> next)
   {return next(in.GetDatum());}

(I'm guessing my syntax is wonky - even admitting that we're using 
pretend language extensions. It probably also contains at least a dozen 
incorrect or at least dubious characteristics. But the gist of it should 
be clear.)

This is a monad, but it doesn't "do" anything particularly special. With 
Box implemented this way, Listing #4 now does exactly the same thing as 
Listing #1, but 8x more complicated. Why bother?

This implementation makes Box the "identity monad". Like the identity 
function, the identity monad does... nothing. (And, like the identity 
function, this sounds like a very useless thing, but actually isn't.)

So let's try another implementation:

   template <typename X>
   class Box
   {
     private:
       std::vector<X> Data;
     public:
       Box(std::vector<X> in) {Data = in;}
       std::vector<X> GetData() {return Data;}
   };

   template <typename X>
   Box<X> Return(X in)
   {
     std::vector<X> out;
     out.push_back(in);
     return out;
   }

   template <typename X, typename Y>
   Box<Y> Bind(Box<X> box_in, Function<X, Box<Y>> next)
   {
     std::vector<X> in = box_in.GetData();
     std::vector<Y> out, temp;
     std::vector<X>::iterator iti;
     std::vector<Y>::iterator ito;
     for (iti = in.begin(); iti < in.end(); iti++)
     {
       Box<Y> box_out = next(*it);
       temp = box_out.GetData();
       for (ito = temp.begin(); ito < temp.end(); ito++)
       out.push_back(ito*);
     }
     return Box<Y>(out);
   }

Now Box<X> contains a std::vector<X> instead of just one X value. The 
Return() function now generates a 1-element vector. And the Bind() 
function now executes next() MULTIPLE TIMES, once for each element in 
the input. It then collects all the outputs from each call, and gathers 
them into a single giant vector.

(Incidentally, I couldn't figure out a direct way to just append the 
entire contents of one vector to another. Obviously there /is/ a way, 
and I'm just not looking hard enough...)

Notice how the code for foo(), bar() and baz() HAS NOT CHANGED, and the 
code that calls them HAS NOT CHANGED. And yet, now each function can 
potentially generate multiple results!

(Of course, for foo() to produce more than one output given a single 
input, then either foo() itself or some function it calls would have to 
be modified to actually make USE of this feature. But, for example, some 
function that foo() calls might change, while foo() itself remains 
unaffected, and foo() could still use this new feature of having 
multiple results.)

This is what Haskellers call the "list monad", basically. (With the 
added twist that Haskell lists are lazy, so it's possible for functions 
to produce "infinity results".)

It shouldn't be too hard to see that you could alter [the first] Box 
implementation such that it contains a success marker, and that Bind() 
only runs the next step in the chain if the previous one succeeded. You 
could also modify it to contain an error description in the case of 
failure. Then, if bar() decides to return an error-in-a-Box rather than 
a value-in-a-Box, baz() gets completely skipped. But baz() ITSELF 
doesn't need to do any explicit error checking. Bind() does that for you.

In this way, baz() returning an error-in-a-Box aborts the entire Box 
computation. A bit like... THROWING AN EXCEPTION. Except it's a library, 
not a language feature, so you can change the design of it at will. 
(E.g., make it so you have to throw three exceptions before the code 
aborts.) This is what Haskellers call the "failure monad", "error monad" 
or "exception monad".

You can also design Box to store various other auxiliary information as 
it goes along. You could have a table of data, for example, and provide 
commands to replace that table with a new one. This is how Haskell 
simulates destructive updates without ACTUALLY doing destructive updates.

(This is the difference between the State monad and the ST monad. The 
former allows you to simulate destructive updates, in the way described. 
The latter ACTUALLY DOES destructive updates, in the manner of C++ or 
Java or whatever.)

It's not hard to see how a Box could contain a string buffer and a 
character index, and you could write functions that take a Box and 
return a new Box after reading data from the string buffer. This is how 
you write parser monads.

I could continue at length, but by now you probably get the picture. 
User-defined code invisible inserted between the visible statements you 
actually wrote leads to all sorts of interesting possibilities.



Incidentally, since C++ REALLY DOES have a function type (or at least, 
function POINTER type) which can be passed as an argument, it's one of 
the few languages where it ought to be possible to implement a monad 
"for real". I might try it if I get the time.

The problem, of course, is that it's going to be so hellishly verbose 
that no sane person would ever, EVER write their C++ code like this...

Saying that, I hear C++ lets you override operator(). So perhaps the 
simplest way to describe a monad is as "override operator;". ;-)


Post a reply to this message

Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.