Invisible wrote:
>>>>> Haskell really is a PITA to parse. I mean, think about it:
>>>>> - Whitespace is significant. (No context-free parsing here!)
>>>>
>>>> I don't think "context-free" means what you think it means.
>>>> Whitespace being significant isn't "context".
>>>
>>> Indentation is significant. To parse the next line, you must know how
>>> far the current line was indented. Hence, "context".
>>
>> I don't think "context-free" means what you think it means.
>>
>> That's like saying Pascal isn't context-free because you need to know
>> how many "begin" statements you're nested within.
>
> No, you don't. You can totally parse the contents without knowing how
> deeply nested it is. The parsing rules don't change.
Of course it does.
    if x = 0 then
        y := 1;
    y := 2;
parses completely differently from
    if x = 0 then begin
        y := 1;
        y := 2;
    end
> However, in Haskell, if I write
>
> foo = if x
> then y
> else z
>
> that's different to
>
> foo =
> if x
> then y
> else z
>
> To parse the "then" part, you *must* know how far the "if" part is
> indented.
You only need to know if it's indented less, the same, or more than foo,
right? It would be the same if the "if" were under the = as if it were
under the "o", yes?
In which case you have three tokens: indent, outdent, and dent.
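Here's a rough sketch of what I mean, in Python. The function name layout_tokens and the token names INDENT, OUTDENT and DENT are made up for illustration; this is not how any real Haskell compiler handles layout. The lexer compares each line's indentation with the enclosing level and emits one of the three tokens, so the grammar only ever sees ordinary terminals, much like "begin" and "end".

```python
def layout_tokens(source):
    """Replace leading whitespace with INDENT / OUTDENT / DENT tokens."""
    stack = [0]          # indentation widths of the enclosing blocks
    tokens = []
    for line in source.splitlines():
        if not line.strip():
            continue     # blank lines carry no layout information
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            tokens.append("INDENT")
        elif width == stack[-1]:
            tokens.append("DENT")
        else:
            # Close every block that is indented further than this line.
            while width < stack[-1]:
                stack.pop()
                tokens.append("OUTDENT")
            if width != stack[-1]:
                raise ValueError(f"inconsistent indentation: {line!r}")
        tokens.extend(line.split())
    # Close whatever is still open at end of input.
    tokens.extend("OUTDENT" for _ in stack[1:])
    return tokens


under_the_if = "foo = if x\n      then y\n      else z\n"
under_the_o = "foo = if x\n  then y\n  else z\n"
```

Both strings produce the same token stream, because only "more, same, or less" survives the lexer. The exact column never reaches the parser.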
> It's not possible to unambiguously parse without this
> information. (Consider nested if-expressions; here indentation is the
> only thing that makes the nesting unambiguous...)
That doesn't mean it's context sensitive. That just means whitespace is
syntactically meaningful.
>> A grammar is context-free if none of the left side of the production
>> have terminal symbols in them. (That's the technical definition,
>> which takes several paragraphs to explain in prose.)
>
> Yeah. You'd have to know what a "production" is, for starters...
You know some BNF, right?
<expression> ::= <term> * <term> | <term> / <term>
Well, if you have anything other than a single non-terminal on the left of
the ::=, the grammar is no longer context-free.
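If it helps, the usual textbook test (every left-hand side is exactly one non-terminal) fits in a few lines of Python. This is only a sketch: the representation of a production as a (left side, right side) pair of tuples and the name is_context_free are my own assumptions, and the second grammar's extra rule is a hypothetical one invented to show a failing case.

```python
def is_context_free(productions, nonterminals):
    """True if every left-hand side is exactly one non-terminal."""
    return all(len(lhs) == 1 and lhs[0] in nonterminals
               for lhs, _rhs in productions)


nonterminals = {"<expression>", "<term>"}

# The BNF rule from above, one alternative per production.
cfg = [
    (("<expression>",), ("<term>", "*", "<term>")),
    (("<expression>",), ("<term>", "/", "<term>")),
]

# Hypothetical rule: <term> may be rewritten only when a "(" precedes it.
# The "(" on the left side is the "context".
not_cfg = cfg + [(("(", "<term>"), ("(", "<expression>"))]
```

Note that nothing in the check cares whether the terminals on the right-hand side happen to be whitespace tokens.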
>> Could be! :-) Sounds like a real bear to compile, but then, that's
>> what computers are for.
>
> Well, you can totally tokenise the input without knowing anything about
> what user-defined operators exist. (There are rules about what
> characters you may use.) But it seems that you can't build a correct
> parse tree until you know the operator precedences, so...
And you can find the definitions without knowing the precedences, right? You
can distinguish between definitions that use user-defined operators and
those which don't?
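What I have in mind is two passes, roughly like the Python sketch below. It assumes a drastically simplified input: one declaration or expression per line, whitespace-separated tokens, and only "infixl" and "infixr" declarations. The names collect_fixities and parse_expr are made up for illustration. Pass one only looks for fixity declarations; pass two builds the tree by precedence climbing using whatever pass one found.

```python
def collect_fixities(lines):
    """Pass 1: read declarations like 'infixl 6 <+>' into a table."""
    table = {}
    for line in lines:
        words = line.split()
        if len(words) == 3 and words[0] in ("infixl", "infixr"):
            # operator -> (precedence, is_right_associative)
            table[words[2]] = (int(words[1]), words[0] == "infixr")
    return table


def parse_expr(tokens, table, min_prec=0):
    """Pass 2: precedence climbing over a flat list of tokens."""
    left = tokens.pop(0)  # an operand
    while tokens and tokens[0] in table and table[tokens[0]][0] >= min_prec:
        op = tokens.pop(0)
        prec, right_assoc = table[op]
        # A left-associative operator must not swallow another operator of
        # the same precedence on its right, hence prec + 1.
        right = parse_expr(tokens, table, prec if right_assoc else prec + 1)
        left = (op, left, right)
    return left


# The declarations come after the use, which is fine: pass 1 sees them first.
program = ["x = a <+> b <*> c", "infixl 6 <+>", "infixl 7 <*>"]
table = collect_fixities(program)
tree = parse_expr("a <+> b <*> c".split(), table)
```

Here tree comes out as ("<+>", "a", ("<*>", "b", "c")). Swap the two precedence numbers and the same tokens give a different tree, which is the point: the tokeniser doesn't care, only the second pass does.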
--
Darren New, San Diego CA, USA (PST)
"We'd like you to back-port all the changes in 2.0
back to version 1.0."
"We've done that already. We call it 2.0."