|
|
|
|
|
|
| |
| |
|
|
|
|
| |
| |
|
|
OK, so I know nobody actually cares or anything, but I just "finished"
work on a program that takes a Haskell expression, plus the type
signatures for any free variables mentioned, and figures out what the
type of the expression and all its subexpressions is.
Sounds pretty easy, right? Well let me tell you, it's taken me the
entire week to get it to work properly!
The final program is not especially large or complex, but *getting* to
this point has been a long hard journey. Taking a nontrivial program and
rearranging it while you try out different designs is quite a lot of
work. If I had known in my head exactly how I was going to code this
before I started typing, I could have finished it significantly sooner!
The first major task is to write an expression parser. This is
*drastically* less trivial than it sounds! (Currently my parser
understands only a fraction of Haskell's full syntax. The bare minimum
to parse useful expressions.) And the expression trees are so verbose
that you really *need* a pretty-printer too. Ideally with propper
bracket handling. (You have *no idea* how damned long it took to get the
brackets to work correctly...)
Next, you quickly find that various program passes manfunction if
variable names are not unique. Like, if somebody writes "(\x -> x) (\x
-> x + 1)", those are two different, unrelated local variables which
both just happen to be called "x". So you need to somehow make all the
variable names unique, while correctly respecting the language scoping
rules. This is surprisingly easy to do wrong. (And maddeningly tricky to
track down once you notice it's wrong - if you notice at all!)
Then you're going to need a type unification engine. The basic algorithm
is simple, but there are a large number of minor practical details which
all have to be implemented correctly or you get gibberish. Good luck
testing it! Debugging is even more fun...
Next you need to take the expression and decompose it into a set of
subexpressions and note down their interrelationships. From these, you
can compute a set of type relations which the unification engine then
applies. If unification succeeds, you get a list of type signatures for
every subexpression. If unification fails, the expression is ill-typed.
Or your type checker is buggy!
Ah, but wait! The exact way you decompose the expression and apply the
unification process dictate what result you get. In the process of
crawling through this stuff, I discovered things I never even realised
about Haskell's type system. For example...
(foo 1, foo 'x')
If "foo" is a top-level variable, this expression might be well-typed.
("Might" depending on what type signature foo has, obviously.) However,
if "foo" is a local variable, then the above expression is *guaranteed*
to be ill-typed.
Stop and think about that for a moment: Where a variable is bound
determins whether this expression is well-typed or not. That should tell
you that something pretty subtle is going on here!
I never realised this, but look:
\ foo -> (foo 1, foo 'x')
There is no type which can be assigned to "foo" which will make this
expression acceptable. However, consider this:
foo :: x -> x
bar = (foo 1, foo 'x')
This *is* well-typed. It seems that for each invocation of "foo", the
"x" type variable is allowed to take on a different value. However, in
\ foo -> (foo 1)
the expression's type is going to be "(Int -> y) -> y". Basically here
"foo" is required to have one specific type, and it must be the same
each time it's used.
There's actually a non-standard extension that allows you to write
(\ foo -> (foo 1, foo 'x')) :: forall x. (x -> y) -> (y, y)
The above will now type-check. I think what the magical "forall" keyword
is actually doing is generating a new type variable inside the type
checker each time "foo" is mentioned...
Well, there we are. Now I'm off to go implement the rest of my
program... (Such as, say, making it understand more syntax!)
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Orchid XP v8 wrote:
> Ah, but wait! The exact way you decompose the expression and apply the
> unification process dictate what result you get. In the process of
> crawling through this stuff, I discovered things I never even realised
> about Haskell's type system. For example...
>
> (foo 1, foo 'x')
>
> If "foo" is a top-level variable, this expression might be well-typed.
> ("Might" depending on what type signature foo has, obviously.) However,
> if "foo" is a local variable, then the above expression is *guaranteed*
> to be ill-typed.
>
> Stop and think about that for a moment: Where a variable is bound
> determins whether this expression is well-typed or not. That should tell
> you that something pretty subtle is going on here!
>
> I never realised this, but look:
Awesome. On further investigation, it turns out my understanding of
Haskell's type system is actually incorrect. And, unsurprisingly, my
program is incorrect too; it rejects expressions that it should actually
accept.
Nice to know I understand my own favourit programming language, eh? :-}
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Orchid XP v8 wrote:
> variable names unique, while correctly respecting the language scoping
> rules.
Only surprising if you haven't written LISP code. :-)
Indeed, one of the big improvements in LISP was getting this right (i.e.,
lexical scoping rather than textual scoping, or some such terminology). And
it's still a killer in macros.
Still sounds easier to debug than crashing kernel code on a machine with
insufficient memory to host a debugger or compiler. :-)
--
Darren New, San Diego CA, USA (PST)
"We'd like you to back-port all the changes in 2.0
back to version 1.0."
"We've done that already. We call it 2.0."
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
>> variable names unique, while correctly respecting the language scoping
>> rules.
>
> Only surprising if you haven't written LISP code. :-)
...or a lambda calculus interpretter. :-P
> Still sounds easier to debug than crashing kernel code on a machine with
> insufficient memory to host a debugger or compiler. :-)
Probably, yes.
Haskell really is a PITA to parse. I mean, think about it:
- Whitespace is significant. (No context-free parsing here!)
- There are arbitrary user-defined operators, which arbitrary precedence
and associativity.
Mercifully, there are few language keywords. (There's a lot more than
you think, when you sit down and catelogue them, but it's still nothing
compared to, say, Pascal or C.)
Even so, when a user can define an operator like *&*#*|* and set it to
any precedence or associativity, it's not so easy to produce a parse tree...
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Orchid XP v8 wrote:
> Awesome. On further investigation, it turns out my understanding of
> Haskell's type system is actually incorrect. And, unsurprisingly, my
> program is incorrect too; it rejects expressions that it should actually
> accept.
Oh look, somebody has produced a helpful T-shirt explaining it all:
http://www.cafepress.com/skicalc.6225368
Isn't that nice?
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Orchid XP v8 wrote:
> The first major task is to write an expression parser. This is
> *drastically* less trivial than it sounds! (Currently my parser
> understands only a fraction of Haskell's full syntax. The bare minimum
> to parse useful expressions.)
Oh *man*... The more I read into this, the more I discover that Haskell
is way, way more flexible than I thought! o_O
For example, you're allowed to give fixity declarations for local
variables. I thought it was only permitted for top-level variables, but
no, apparently every let-expression can contain fixity declarations and
type signatures. So much for a let-binding just being a list of
variables and what they're bound to. :-P
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
I just implemented a program entitled "Test-Sanity". Arguably, the test
has failed. :-P
(I implemented an internal sanity check routine, and this program runs
it to see what output it produces. But maybe it's creator's sanity is
questionable...)
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Invisible wrote:
> I just implemented a program entitled "Test-Sanity". Arguably, the test
> has failed. :-P
>
> (I implemented an internal sanity check routine, and this program runs
> it to see what output it produces. But maybe it's creator's sanity is
> questionable...)
Ah, excellent. My sanity checker has a bug.
(Being a moron, I thought that one could determine whether a set of sets
are all disjoint merely by taking the intersection of all of them. DUH!)
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Invisible wrote:
> Haskell really is a PITA to parse. I mean, think about it:
> - Whitespace is significant. (No context-free parsing here!)
I don't think "context-free" means what you think it means. Whitespace being
significant isn't "context".
> - There are arbitrary user-defined operators, which arbitrary precedence
> and associativity.
But they have to be defined by the user before you use them, so that doesn't
seem hard. Most parsers with more than half a dozen operators (I'm looking
at you, C) have tables defining the precedence and associativity anyway.
> compared to, say, Pascal or C.)
Try Ada.
> Even so, when a user can define an operator like *&*#*|* and set it to
> any precedence or associativity, it's not so easy to produce a parse
> tree...
I think maybe you're doing it wrong.
--
Darren New, San Diego CA, USA (PST)
"We'd like you to back-port all the changes in 2.0
back to version 1.0."
"We've done that already. We call it 2.0."
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
>> Haskell really is a PITA to parse. I mean, think about it:
>> - Whitespace is significant. (No context-free parsing here!)
>
> I don't think "context-free" means what you think it means. Whitespace
> being significant isn't "context".
Indentation is significant. To parse the next line, you must know how
far the current line was indented. Hence, "context".
>> - There are arbitrary user-defined operators, which arbitrary
>> precedence and associativity.
>
> But they have to be defined by the user before you use them, so that
> doesn't seem hard.
No, they can be used *before* they're defined. And they can be defined
in a completely seperate source file too. Which makes me wonder how it's
possible to parse an individual file at all... you'd need to potentially
parse every module it imports to work out the operator precedences!
>> Even so, when a user can define an operator like *&*#*|* and set it to
>> any precedence or associativity, it's not so easy to produce a parse
>> tree...
>
> I think maybe you're doing it wrong.
Heh, *maybe*?
Hey, it's *me* we're talking about there. Of *course* I'm doing to wrong!
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
|
|