On 11/11/2010 08:20 PM, Darren New wrote:
> Invisible wrote:
>>>>> (That is, optional sign, one or more digits, optional decimal point
>>>>> followed by one or more digits, optional E followed by optional sign
>>>>> followed by one to three digits.)
>>>>
>>>> If I've understood the spec correctly, it's
>>>>
>>>> do
>>>>   option (char '+' <|> char '-')
>>>>   many1 digit
>>>>   option (char '.')
>>>>   many1 digit
>>>>   option (do char 'E'; option (char '+' <|> char '-'); many1 digit)
>>>
>>> Bzzzt. Sorry. That requires at least a 2-digit number.
>>
>> I don't follow. Where do you think my definition deviates from your spec?
>
> Your parser won't parse "+3" as a valid number.

Huh? Oh wait, I didn't see the "optional" word before the decimal point.
So assuming I'm guessing the bracketing correctly, you want

  do
    option (char '+' <|> char '-')
    many1 digit
    option (do char '.'; many1 digit)
    option (do char 'E'; option (char '+' <|> char '-'); many1 digit)

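For anyone who wants to try this, here is a minimal runnable sketch of the same grammar. It assumes the Parsec library, whose combinator names match the ones used above. In real Parsec, `option` takes a default value as its first argument, so the sketch writes `option ""`. The names `number` and `accepts` are made up for illustration. Like the definition above, it uses `many1 digit` for the exponent rather than enforcing the "one to three digits" limit from the spec.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Optional sign, one or more digits, optional fraction, optional exponent.
-- Returns the matched text.
number :: Parser String
number = do
  sign  <- option "" signP
  whole <- many1 digit
  frac  <- option "" fractionP
  expo  <- option "" exponentP
  return (sign ++ whole ++ frac ++ expo)
  where
    signP :: Parser String
    signP = fmap (: []) (char '+' <|> char '-')

    -- A decimal point must be followed by at least one digit.
    fractionP :: Parser String
    fractionP = do
      _  <- char '.'
      ds <- many1 digit
      return ('.' : ds)

    -- 'E', optional sign, one or more digits.
    exponentP :: Parser String
    exponentP = do
      _  <- char 'E'
      s  <- option "" signP
      ds <- many1 digit
      return ('E' : s ++ ds)

-- True when the whole input is exactly one number.
accepts :: String -> Bool
accepts s = either (const False) (const True) (parse (number <* eof) "" s)
```

With this, `accepts "+3"` succeeds, while inputs such as "3." or ".5" are rejected because the fraction and the integer part each require at least one digit.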
>> Sure. Except without any guarantee of syntactic correctness.
>
> Again, until the first time you run it. So?

If you don't see why this is critically important, I'm not sure what
else I can say to that...

> Clearly you have no guarantee of semantic correctness, so this really
> isn't a problem, given that with your syntactic correctness you still
> got the expression wrong. :-)

Semantic correctness is impossible to guarantee.

>> Not every string is a valid regex. You have to follow the syntax
>> rules. But this is not checked at compile-time (and usually /cannot/
>> be checked at compile-time). So if you make a typo in your regex, you
>> won't find out until runtime.
>
> It can be checked at compile time if your language cares to. That's not
> an aspect of the regexp, but an aspect of you using Haskell.

I've yet to find a Haskell regex library that provides compile-time
guarantees either. After all, it's just a flat string with no structure.

> And if you wrote the code and never ran it, then guess what, you have
> worse problems than syntax errors in your regexps.

That's like saying "type checking is pointless, because you have to test
your code anyway". Yes, you /do/ have to test your code. But if the
computer can find the easy bugs instantly without you having to wait for
the entire test suite to run, you get stuff done a lot quicker.

>> If you start constructing regexs programmatically, your problems just
>> multiplied.
>>
>> On the other hand, with a /real/ parser library, both of these grave
>> problems immediately vanish into thin air.
>
> How do you construct your parser programmatically in Haskell? That looks
> like code to me.

A parser is just a data structure. For example, "digit" is a parser, and
when you say "many1 digit", you're calling the "many1" function, passing
it the "digit" parser as argument, and getting a new parser as the
result. You just programmatically constructed a parser.

Obviously that's a fairly trivial example, but you're manipulating
parser objects with a Turing-complete programming language. You get the
idea.
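
As a slightly less trivial illustration, here is a rough sketch of parsers being built by ordinary functions from ordinary data. It assumes Parsec. The names `digitsExactly`, `separated`, `dateLike`, and `accepts` are hypothetical, invented for this example. A list of field widths is mapped to a list of parsers, which are then glued together with a separator.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Build a parser for exactly n digits out of the smaller parser `digit`.
digitsExactly :: Int -> Parser String
digitsExactly n = count n digit

-- Glue a list of parsers together, expecting `sep` between each pair.
separated :: Char -> [Parser a] -> Parser [a]
separated _   []       = return []
separated sep (p : ps) = do
  x  <- p
  xs <- mapM (\q -> char sep *> q) ps
  return (x : xs)

-- A parser computed from plain data: widths 4, 2, 2 separated by '-'.
dateLike :: Parser [String]
dateLike = separated '-' (map digitsExactly [4, 2, 2])

-- True when the parser consumes the whole input.
accepts :: Parser a -> String -> Bool
accepts p s = either (const False) (const True) (parse (p <* eof) "" s)
```

Every intermediate value here is a typed parser, so passing something that is not a parser to `separated` is rejected by the compiler rather than discovered at runtime.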

Of course, you can manipulate strings-that-happen-to-be-regexs in a
similar way. It's just that there's no checking of any kind that what
you're doing with them is sensible. It's a bit like trying to build an
XML file by string manipulation. Yes, it can be done. Yes, you can make
it work. And yes, it's incredibly fragile. This is why almost everybody
uses a real XML library that guarantees that the resulting XML document
will be at least well-formed, if nothing else.
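
To make the XML comparison concrete, here is a toy sketch of the structured approach. It is not a real XML library, and the names `Xml`, `render`, and `escape` are invented for illustration. It omits attributes and does not validate element names. Because documents are built as a tree, tags are always balanced and text is always escaped when rendered.

```haskell
-- A tiny document tree: an element with children, or a run of text.
data Xml
  = Element String [Xml]
  | Text String

-- Render the tree. Open and close tags come from the same constructor,
-- so they cannot get out of step.
render :: Xml -> String
render (Text s) = concatMap escape s
render (Element name kids) =
  "<" ++ name ++ ">" ++ concatMap render kids ++ "</" ++ name ++ ">"

-- Escape the characters that would otherwise be read as markup.
escape :: Char -> String
escape '<' = "&lt;"
escape '>' = "&gt;"
escape '&' = "&amp;"
escape c   = [c]
```

With plain string concatenation, a stray "<" in the text or a forgotten closing tag goes unnoticed until something tries to read the output.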