|
|
OK, so just for giggles, I tried to implement monads in C#.
The short version: It appears that I can create a class with forms a
monad, however it appears to be impossible to define the concept of
"being a monad" such that you can write code which polymorphically works
for any monad.
The long version:
It appears that C# generics is too weak to express the necessary type
relationships that describe what a monad is. I suspect C++ templates
probably can do this, but C# can't. This rules out the possibility of
writing an abstract class or an interface which describes all possible
monads, and then writing code against that interface.
On the other hand, it appears to be quite possible to write code that
works on one *specific* monad, decided at development-time.
OK, so let me recap what the heck a monad is. It's so abstract that it's
awkward to put into works, but it's not particularly complicated. If the
class Foo forms a monad, that basically means that an object of type
Foo<int> can somehow "supply" an integer. Now there are several possible
ways it might do that.
The simplest way is if Foo is some kind of data container. For example,
a List<int> can "supply" integers in that it *contains* integers. So
many sorts of container form a monad. Another way is if Foo represents
some kind of "process". For example, Foo<int> might be some kind of
parser object, which reads some text and parses an integer out of it.
Most of these "process monads" have some kind of a Run() method that
actually "executes" the action(s) in question. (The result might not be
a simple int, of course; for example, a parser could return a parse
error rather than an int, depending on the input supplied.)
Stepping back from that, if Foo<int> is simply an object that represents
a supple of integers, then the Foo class forms a monad if we can define
the following functions:
public static Foo<T1> Inject(T1);
public static Foo<T2> Map(Foo<T1>, Func<T1, T2>);
public static Foo<T1> Join(Foo<Foo<T1>>);
public static Foo<T2> Chain(Foo<T1>, Foo<T2>);
public static Foo<T2> Bind(Foo<T1>, Func<T1, Foo<T2>>);
[This isn't quite valid C# syntax, but the meaning should be clear.]
This set is redundant; several of these functions can be defined in
terms of each other. There are two possible minimal subsets:
- Inject() + Map() + Join()
- Inject() + Bind()
So what do all these functions do, then?
Inject() is simple enough. It takes a thing and constructs an object
which produces that thing. For example, if Foo is a parser, then
Inject(5) is a parser which consumes zero characters of input and just
returns 5. It *always* returns 5, regardless of what input you run the
parser on. (Since it ignores input anyway.) If Foo is a data container,
then Inject(5) is just a 1-element container with a 5 in it.
The Map() function is simple enough. It takes a function and applies it
to every element of a collection. Except that a monad isn't necessarily
a collection. For example, I might have some parser that returns a
single digit character, and a function that converts a digit character
into an actual integer. Using Map(), I end up with a parser that returns
integers not characters. (Notice that we still haven't *run* any
parsers, just constructed a more complex parser out of a simpler one.)
If you use Map() to apply a function which produces another Foo, then
clearly you're going to end up with a Foo<Foo<X>>. The Join() function
unwraps a layer of wrapping. How this works depends on what the monad is
(which is why it's definition is crucial). For a container, it's
obvious; think about flattening a list of lists into a flat list. For
parsers, it's taking a parser that produces a parser, and then running
the inner parser to return the result of that. (It sounds more
complicated than it is.)
The Bind() function actually takes a function that produces a Foo, in
the way just described. You can think of it as taking a parser and a
function which consults that parser's output and decides what parser to
run next, returning that parser so that Bind() can run it. From the
previous paragraph, it should be obvious that Bind() = Map() + Join().
It may be less obvious that Map() = Bind() + Inject() and Join() =
Bind() + the identity function. So either can be defined in terms of the
other.
The Chain() function takes two parsers, runs the first one, and then
runs the second one. It discards the output from the first and returns
the output from the second. This is the clue to how to implement it;
it's simply Bind(), given a function which ignores its input and always
returns the same output.
So anyway, we can define Foo<> and then write the above functions, and
then Foo<> can be used as a monad. But can we write an abstract
specification of "monad" and have multiple classes implement it? Sadly
not, as best as I can tell.
One might naively try this:
public static Monad<T1> Inject(T1);
public static Monad<T2> Bind(Monad<T1>, Func<T1, Monad<T2>>)
The first method is fine. But the second doesn't work at all! The
*intent* is that "Monad" should refer to *the same class* at all times.
But that isn't how object-oriented class inheritance works. The Bind()
method is implemented specially for each type of monad. It has to "look
into" the internal structure of both monad objects - which isn't
possible if they aren't actually both of the necessary class.
Trying to use Map() and Join() instead is no better:
public static Monad<T2> Map(Monad<T1>, Func<T1, T2>);
public static Monad<T1> Join(Monad<Monad<T1>>);
Map() isn't too bad - although since its implementation changes for each
sort of monad, this really ought to be an instance method rather than
static. But Join() is again broken. As written, it will accept
List<Parser<int>> and Parser<List<int>>, whereas we want it to accept
only List<List<int>> or Parser<Parser<int>>.
If we refer to a *specific* monad class, these problems go away. But if
we try to be abstract, there's no way to enforce the required
constraints. The only option is run-time dynamic casting - yuck!
Hmm, all those static methods though... that doesn't sound very OO, does
it? Well, Inject() looks very much like a "factory", so perhaps we can
have a FooMonad class that contains a (non-static) Inject(), and the
move Bind() et al to Foo itself?
public sealed class FooMonad
{
public Foo<T1> Inject<T1>(T1 datum) {...}
}
public sealed class Foo<T>
{
public Foo<T2> Map<T2>(Func<T, T2> f) {...}
public Foo<T2> Bind<T2>(Func<T, Foo<T2>> f) {...}
}
Even here, trying to write a type signature for Join() is problematic.
Basically Join() takes "this" and returns a Foo<T2> - but only if
T=Foo<T2>. How do you say that in a type signature? The only real way
around this is to artificially stuff Join() back into FooMonad.
Post a reply to this message
|
|
|
|
On 14/03/2013 08:24 PM, Warp wrote:
> Orchid Win7 v1<voi### [at] devnull> wrote:
>> Again, the problem is that we need to write a method which examines TWO
>> objects which must be of THE SAME class.
>
> I don't understand at all what that even means.
>
> Start by explaining (briefly) what "examines" means in this context.
Consider another example: Let's say you want to do something like
public abstract class Tree
{
...
public abstract Tree Union(Tree other);
...
}
So we can now write SplayTree, HeapTree, SearchTree, etc as subclasses
of Tree. But the thing is... you can only Union() two trees together if
they are THE SAME TYPE of tree. So you can take the union of two
SplayTrees, or the union of two HeapTrees, but NOT the union of a
SplayTree and a HeapTree.
But the type signature above does not express this constraint. It says
that you can union *any* two objects so long as they are both Trees -
which is wrong.
That's the basic problem with writing a monad interface in C# (or Java,
for that matter). You could do a run-time type check and throw an
exception if the types don't match, of course. But I haven't yet figured
out how to enforce the constraint at compile-time.
(Having asked some questions on stack overflow, it appears it might
actually be possible in C# though... I'll give it a go tomorrow.)
Post a reply to this message
|
|