One Haskell VFAQ is "What is a monad?"
Well, that's kind of like asking "What is a number?" Imagine you're in
the steamy jungles of some distant land, when you meet a wild tribe of
natives, who you are inexplicably able to communicate with. Suppose
these people don't know anything at all about numbers, and one of them
asks you "what is a number?"
Now, we use numbers all the time, right? *Everybody* knows what a number
is. But if you meet somebody who has never encountered this concept,
where would you even *start*? How can you explain what a number is? How
can you even begin to explain why that's useful?
Now try explaining what a number is *briefly*.
Hard, isn't it? You either end up saying something so abstract that it's
utterly cryptic, or you end up writing a huge long complex-sounding
example. Either way, numbers sound like an amazingly complicated idea...
but they aren't.
A similar thing can be said about monads. A quick Google search will
reveal an entire cottage industry of people trying to explain them. If
anything, this makes it seem like they *are* hard to understand. But
really, they aren't. It's just that they're not "like" anything else
familiar that I can easily point at. They are a really abstract concept,
which makes them kind of hard to grasp.
The short version: A monad is one specific way of chaining operations
together.
When all the fuss and the hype and the endless tutorials are over,
that's all monads really come down to. They're just a particular way of
linking operations together.
(Some people think that a monad is for /sequencing/
operations. This is not correct. They /can/ be used for that, but that's
not what they're "for", as such. Other people think that monads are for
managing mutable state. Again, you /can/ use them to do that, but that's
not all they do. It isn't their fundamental purpose.)
The medium version:
A monad is something that has a "bind function" which links calculation
steps together. (In Haskell it's named ">>=", but I'm going to call it
"the bind function", because that's a damned sight easier to read.) All
the bind function actually does is to extract the result from the
previous step, and feed it into the next step.
Pretty simple, eh?
Now, if the bind function /just/ does that, the result isn't very
interesting. In fact, you could achieve exactly the same thing yourself
without bothering with all this monad malarkey. And that brings me to a
very important point:
Merely BEING A MONAD isn't especially useful.
Every USEFUL monad does something else IN ADDITION to just being a monad.
(A monad that "does nothing" is called the identity monad. Rather like
the identity function, this /sounds/ completely pointless, yet turns out
not to be...)
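To make "a monad that does nothing" concrete, here's a hand-rolled sketch of the identity monad (the real one lives in Data.Functor.Identity). Its bind function does nothing except unwrap the previous result and feed it to the next step:

```haskell
-- A hand-rolled identity monad; bind just unwraps and passes along.
newtype Identity a = Identity { runIdentity :: a }

instance Functor Identity where
  fmap f (Identity x) = Identity (f x)

instance Applicative Identity where
  pure = Identity
  Identity f <*> Identity x = Identity (f x)

instance Monad Identity where
  Identity x >>= f = f x

demo :: Identity Int
demo = do
  x <- Identity 1
  y <- Identity 2
  return (x + y)
-- runIdentity demo == 3
```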
In "normal code", you might call one function, pass its result to
another function, and then pass the result of that to a third function.
In "monadic code", instead of each function returning a "result", it
returns some sort of construct which /contains/ a result. And between
each pair of functions, you insert this bind function, which fetches the
result out of one construct and squirts it into the next function.
In Haskell, you can explicitly write out the bind function, such as
foo x >>= bar >>= baz
Or you can write a so-called "do block", such as
do
y <- foo x
z <- bar y
baz z
Here the compiler automatically inserts all the ">>=" calls for you
behind the scenes. So the two examples are actually identical. All a
do-block does is insert these extra function calls for you, so you don't
have to do it manually.
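To see the equivalence in a form you can actually run, here's a sketch with hypothetical one-step functions (the names "foo", "bar" and "baz" are purely illustrative), using the Maybe monad. The two definitions compile to the same thing:

```haskell
-- Hypothetical single-step functions, here in the Maybe monad
foo, bar, baz :: Int -> Maybe Int
foo x = Just (x + 1)
bar y = Just (y * 2)
baz z = Just (z - 3)

-- The do-block version...
sugared :: Int -> Maybe Int
sugared x = do
  y <- foo x
  z <- bar y
  baz z

-- ...and the explicit version the compiler turns it into:
desugared :: Int -> Maybe Int
desugared x = foo x >>= \y -> bar y >>= \z -> baz z
-- sugared 10 == Just 19, and desugared 10 == Just 19
```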
Like I said, if the bind function just passed the result along, nothing
especially interesting would happen. The *interesting* part is that you
can write a bind function so that it does hoopy processing between
statements. That's where the fun begins.
The long version:
As you know, in Haskell all variables are immutable. For the moment,
we'll pretend that all data structures are immutable too. Basically,
once you create something, it becomes read-only.
Suppose I'm writing a "solve" function, which takes an equation, and
returns a Bool indicating whether the equation can be solved or not.
solve :: Equation -> Bool
Suppose that I've got two equations, and I want to check whether /both/
of them are solvable. That's pretty easy:
(solve eq1) && (solve eq2)
Suppose that in addition to the equation to solve, you also need to pass
some kind of table containing the equation's context (e.g., the
definitions of any variables it mentions, or whatever). That's still
pretty easy:
solve :: State -> Equation -> Bool
(solve s eq1) && (solve s eq2)
[Assuming that "s" holds whatever state it is that the "solve" function
requires as input.]
So far, so good. Now assume that the "solve" function /changes/ the
state as it processes each equation... Ah. OK, now we've got a bit of a
problem.
Basically, the way to handle this is that the "solve" function needs to
return the new, updated state back to the caller, /as well as/ the
actual result it's trying to calculate.
solve :: State -> Equation -> (State, Bool)
The function where I'm trying to solve both equations also presumably
needs to return the latest version of the state to its own caller, so we
end up writing something like this:
let
(s1, b1) = solve s0 eq1
(s2, b2) = solve s1 eq2
in (s2, b1 && b2)
On the one hand, this is all simple, straightforward code, with nothing
particularly complicated happening.
On the other hand... yuck! Our simple 1-line expression has degenerated
into 4 lines of code. And imagine if I wanted to make a dozen calls to
"solve"; I'd end up with a dozen "s"-variables. And in each function
call, I need to make sure I use the correct "s"-variable. You just
/know/ that at some point I'm going to make a typo, or copy-paste
something from somewhere else and forget to tweak the variable numbers...
In short, this is a mess. (I can almost hear the Java programmers at the
back of the room muttering "told you so".) Surely we can do better.
Well, it turns out we can. We can write a monad where the bind function
feeds the /result/ from one function to the next, but /also/
automatically feeds the latest version of the state across too. So now
our functions don't need to /explicitly/ take the state as a parameter,
and don't need to /explicitly/ return it as a result. The monad hides
the plumbing. With that done, I can now write
do
b1 <- solve eq1
b2 <- solve eq2
return (b1 && b2)
It's still three lines instead of one. But at least I don't have lots of
little "s"-variables scattered all over the place now. I can't possibly
muddle them up and introduce strange bugs because my code is looking at
an out-of-date version of the state. The bind function takes care of
passing along the correct version of the state at each step.
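Here's a minimal sketch of how that bind function can work: a hand-rolled state monad (the real one is Control.Monad.State from the mtl package), plus a dummy "solve" that just bumps a counter so we can watch the state being threaded through. The types and names here are illustrative, not the real solver:

```haskell
-- A hand-rolled state monad; bind threads the state along automatically.
newtype St s a = St { runSt :: s -> (s, a) }

instance Functor (St s) where
  fmap f (St g) = St $ \s -> let (s', x) = g s in (s', f x)

instance Applicative (St s) where
  pure x = St $ \s -> (s, x)
  St mf <*> St mx = St $ \s ->
    let (s1, f) = mf s
        (s2, x) = mx s1
    in (s2, f x)

instance Monad (St s) where
  St g >>= f = St $ \s0 -> let (s1, x) = g s0 in runSt (f x) s1

-- A dummy "solve" that bumps a counter in the state (purely illustrative)
solve :: String -> St Int Bool
solve eq = St $ \n -> (n + 1, not (null eq))

both :: St Int Bool
both = do
  b1 <- solve "x = 1"
  b2 <- solve "y = 2"
  return (b1 && b2)
-- runSt both 0 == (2, True): the state went 0 -> 1 -> 2 behind the scenes
```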
It's still more text than the original. But notice, if "solve"
*modifies* the state, then it potentially *matters* which call happens
first - whether we solve eq1 first or eq2 first. The original
expression, where the state doesn't get altered, didn't specify an
order. Heck, you could solve both equations /in parallel/ if you want.
But if solving the second equation /depends on/ information gleaned from
solving the first one, then it is no longer logically possible to do
that. So the code above is longer because it specifies an execution order.
Right now, I can just hear the Java guys sneering at me.
"So let me get this straight. You had to modify your compiler to accept
a special new syntax which you then convert into a bunch of extra calls
to this complicated new function you wrote... just so you can have
mutable state?... The same mutable state that every REAL programming
language automatically supports out-of-the-box? HAHAHA! Haskell sucks, man!"
In Java of course, you can simply do
bool b1 = solve(s, eq1);
bool b2 = solve(s, eq2);
return (b1 && b2);
and be done with it. Since Java hands solve() a reference to the same
mutable object, solve() is quite capable of altering "s" in place. It
doesn't need any
special "monad" to do that. Heck, Java will even let you do
return solve(s, eq1) && solve(s, eq2);
although now, since "&&" short-circuits, the second call might not run at all.
But hey, that's a hell of a lot simpler and easier, right? You know,
using ACTUAL MUTABLE DATA to implement mutable data. :-P
Well, stick this in your pipe and smoke it: Suppose that "solve" can
return MULTIPLE SOLUTIONS. What now? Well, every time you call solve(),
you're going to have to loop over all of the solutions it returns,
right? So the Java code is now going to look something like
List<Solution> sols = solve(in, eq1);
List<Solution> out = new ArrayList<Solution>();
for (Solution s : sols)
out.addAll(solve(s, eq2));
return out;
Clear as mud, right?
And the Haskell version...? Well, you know what? If you put the
foreach-loop into the bind function itself, the Haskell code remains
EXACTLY THE SAME. All of the existing code still compiles, still runs,
AND DOES THE RIGHT THING. Boom, head shot.
If that doesn't sound like a big deal, try cascading a dozen solve()
calls in Java. Not funny any more, is it? Manually writing a brand new
loop for every individual call. Oh, you could probably do something with
the reflection API to allow you to write the code containing the
foreach-loop just once, and pass it the name of the function you're
trying to call next. But it's still a fair bit of work.
The Haskell code, on the other hand, remains unchanged. Because "going
from step A to step B" has been abstracted out into this "bind
function", we can change how it happens in just one place - the bind
function! So all our code still works the same, we just need to change
the implementation of the monad.
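For the list monad, the bind function literally *is* the foreach-loop: `xs >>= f` is just `concatMap f xs`. A sketch, with a toy "solve" (name and behaviour purely illustrative) that returns several candidate answers:

```haskell
-- The list monad's bind is the foreach-loop: xs >>= f = concatMap f xs.
-- A toy "solve" returning several candidate answers:
solve :: Int -> [Int]
solve x = [x + 1, x + 2]

chained :: [Int]
chained = do
  a <- solve 0
  b <- solve a   -- runs once for each solution of the previous step
  return b
-- chained == [2,3,3,4]
```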
Here's an interesting thing: We can now represent an unsolvable equation
by just RETURNING ZERO SOLUTIONS. You know what happens if you iterate
over an empty container? It means you run the loop body ZERO TIMES.
Running something zero times means NOT RUNNING IT.
In other words, if one of the solve() calls returns zero solutions, the
remaining calls get AUTOMATICALLY SKIPPED.
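A tiny sketch of the skipping in action:

```haskell
-- If any step yields zero solutions, everything after it runs zero times.
skipped :: [Int]
skipped = do
  x <- [1, 2, 3]
  y <- ([] :: [Int])  -- an "unsolvable" step: zero solutions
  return (x + y)      -- never runs
-- skipped == []
```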
We can actually extend this a bit. If we make it so the monad can hold a
failure description, we can have bind check the previous result, and run
the next step if and only if the previous one succeeded. In other words,
WE CAN IMPLEMENT EXCEPTION THROWING AS A LIBRARY.
We can also write a function which runs one block of code, checks
whether the result was a success or failure, and runs another block of
code if it was a failure. In other words, WE CAN IMPLEMENT EXCEPTION
CATCHING AS A LIBRARY.
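A sketch of both halves, using Either as the failure-carrying monad: bind runs the next step only if the previous one produced a Right, and "catching" is just an ordinary function that inspects the result. (The helper names here are my own; the standard library spells them differently.)

```haskell
-- "Throwing": a Left short-circuits the rest of the chain.
safeDiv :: Int -> Int -> Either String Int
safeDiv _ 0 = Left "division by zero"
safeDiv x y = Right (x `div` y)

calc :: Either String Int
calc = do
  a <- safeDiv 10 2
  b <- safeDiv a 0   -- fails here...
  safeDiv b 1        -- ...so this step is skipped automatically

-- "Catching": an ordinary function that runs a handler on failure.
catchE :: Either e a -> (e -> Either e a) -> Either e a
catchE (Left e) handler = handler e
catchE ok       _       = ok
-- calc == Left "division by zero"
-- calc `catchE` (\_ -> Right 0) == Right 0
```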
Now C++ and Java already /have/ exception handling. But you know what?
If /you/ implement the exception handling, then /you/ can decide how it
should work.
For example, if solve() throws an exception, does that abort /all/ the
solutions being computed, or just the one that threw an exception? Well,
YOU DECIDE. You can implement it the first way, or you can implement it
the second way. Heck, you can have DIFFERENT SORTS OF EXCEPTIONS and
implement each one differently.
You can also do things like make the first four exceptions non-fatal,
but the fifth exception aborts the thread and returns all five
exceptions to the caller.
You can write a function that "decorates" exceptions as they go past,
and in this way you can ADD A CALL STACK to your exceptions.
You can make it so that there is one monad that allows you to throw
exceptions. Or you can make it so that all the types of exceptions that
can be thrown show up in the type signature. In other words, you can
implement UNCHECKED EXCEPTIONS, or you can implement CHECKED EXCEPTIONS.
You can choose which style you fancy.
It's not just exceptions either. You can write code that emits warnings,
and gather these up and log them at the end. (Why not just write the
warnings directly to a log file? Well, because then if you decide you
wanted to do something else with them, you'd have to change your solve()
code. It's better to abstract that out.) Going the opposite way, there
are parsing monads which feed a stream of INput INto a computation,
rather than collecting output from it.
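The warning-gathering idea can be sketched with a hand-rolled writer-style monad (the real one is Control.Monad.Writer from the mtl package); bind concatenates the warnings from each step behind the scenes:

```haskell
-- A hand-rolled writer-style monad that collects warnings as it goes.
newtype Logged a = Logged { runLogged :: (a, [String]) }

instance Functor Logged where
  fmap f (Logged (x, w)) = Logged (f x, w)

instance Applicative Logged where
  pure x = Logged (x, [])
  Logged (f, w1) <*> Logged (x, w2) = Logged (f x, w1 ++ w2)

instance Monad Logged where
  Logged (x, w) >>= f = let Logged (y, w') = f x in Logged (y, w ++ w')

warn :: String -> Logged ()
warn msg = Logged ((), [msg])

step :: Int -> Logged Int
step x = do
  warn ("processing " ++ show x)
  return (x * 2)

run :: (Int, [String])
run = runLogged (step 1 >>= step)
-- run == (4, ["processing 1","processing 2"])
```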
Using a monad allows us to simulate destructive updates in Haskell
without actually *doing* destructive updates. People often sneer and say
"what's the point?" Well you know what? Because it doesn't /really/ do
destructive updates, it makes it trivial to implement features like...
reverting to an older version of the state, without explicitly storing
it and passing it around.
I can take my solve() code, tweak the monad a bit, and now I can insert
a "go back to an older state" command into one branch of my function,
without having to modify the whole damned thing to pass the old state
around. Try doing *that* in a language where destructive updates /really
are/ destructive updates. :-P
Writing some code that can solve an equation is one thing. Writing some
code where you can pause the program half way through and see what it's
doing is much, much more complicated.
But you know what? You can tweak the bind function to make a monad which
supports PAUSING AND RESUMING.
Literally. You can tweak the monad a little, and not alter the main
program at all, and now you can pause the main program at any point and
have a nose around inside to see what's happening. Or even alter its
internal data, and then resume it. And it's all in a library!
(Also: Holy ****, I just discovered coroutines! o_O Who knew it was this
useful?!)
Now, to be clear: Everything I have done in Haskell with my monads can
also be done in normal languages using other techniques. (They call it
"Turing completeness".) The difference, IMHO, is that monads make a lot
of this stuff DRASTICALLY EASIER TO DO than it would be in other languages.
If you have a large Java codebase and you suddenly decide that every
time you call a certain function you want to iterate over the multiple
results it returns... good luck with that. In Haskell, on the other
hand, it's pretty simple. I don't even need to alter my existing code,
because I've already factored out the sequencing.
There is one obvious potential problem: If monad X lets me throw
exceptions, and monad Y lets me produce multiple results, how the heck
do I combine those together? Well, the answer is that you can't. You can
write a NEW monad that has both of these features, but you can't combine
the two existing monads.
That sounds like a pretty epic failure of code reuse, right?
However, while you can't combine two *monads* together, what you *can*
do is write "monad transformers". A transformer takes an existing monad
and adds an extra feature to it. And this is where the identity monad
comes in: You can start with the identity monad, which does nothing. You
can implement everything else as transformers rather than monads. You
can then "stack" the monads one on top of the other to implement all the
features you want.
So, for example, you can take the identity monad, apply an exception
transformer to it, and then apply a list transformer to that. You can
now return multiple results, and throw exceptions.
Notice very carefully: If you throw an exception, it terminates ONE
THREAD OF COMPUTATION. That's because the list transformer is running
on top of the exception transformer. If you swap them around, then
throwing an exception terminates ALL THREADS OF COMPUTATION. So it
*matters* which order you stack the transformers in. Different orderings
produce different behaviour, and you need to check that you're stacking
them in the order that produces the feature set you actually want.
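Here's a sketch of one of the two orderings, assuming the "transformers" package. `ExceptT String [] a` unwraps to `[Either String a]`: the list sits on the outside, so throwing kills only the branch that threw. (The other ordering would unwrap to `Either String [a]`, where one throw kills everything.)

```haskell
-- Assumes the "transformers" package.
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)

branches :: ExceptT String [] Int
branches = do
  x <- lift [1, 2, 3]      -- three "threads" of computation
  if x == 2
    then throwE "boom"     -- kills this branch only
    else return (x * 10)
-- runExceptT branches == [Right 10, Left "boom", Right 30]
```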
There are a couple of problems with transformers. First, they require
unusual Haskell features. (If you want the code to be flexible enough to
be worth making into a reusable library, you need some fairly advanced
type system features.)
Second, if you want to access a command at level 3 in the transformer
stack, you have to "lift" it from level 1 to level 2, and then from
level 2 to level 3. You end up having to remember what level each
command is at, and therefore how many times you need to "lift" it. All
of which is very tedious, and breaks horribly as soon as you change the
transformer stack.
The solution is to define classes which do the lifting automatically.
This requires yet /more/ advanced type trickery. (It also doesn't work
if you've got the same transformer at two different levels in the stack
- which is sometimes useful.) It also means a combinatorial explosion of
class instances as you define more transformers. (!)
And finally... monad transformers just make people's brains explode. So
it's not really surprising that actually, most people seem to just end
up implementing every monad they need from scratch again. It's easier
that way.
Of course, sometimes you don't need a whole monad. Sometimes just an
"applicative" is sufficient. Or sometimes you can use an "arrow". Or...
Wait! Come back!! >_<