> OK, so how do you do
>
> (\+|-)[0-9]+(\.[0-9]+)?(E(\+|-)?[0-9]{1,3})?
>
> in your parser language?
OK, that's one big ol' complex regex, right there.
> (That is, optional sign, one or more digits, optional decimal point
> followed by one or more digits, optional E followed by optional sign
> followed by one to three digits.)
If I've understood the spec correctly, it's
   do
     optional (char '+' <|> char '-')
     many1 digit
     optional (do char '.'; many1 digit)
     optional (do char 'E'; optional (char '+' <|> char '-'); many1 digit)
Enforcing that the exponent has at most 3 digits would be 
slightly more wordy. The obvious way is
   xs <- many1 digit
   if length xs > 3 then fail "exponent too long" else return ()
Notice that since this is written in a /real/ programming language and 
not a text string, we can do
   sign = char '+' <|> char '-'
   number = do
     optional sign
     many1 digit
     optional (do char '.'; many1 digit)
     optional (do char 'E'; optional sign; many1 digit)
and save a little typing. You can also factor the task into smaller pieces:
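   sign = char '+' <|> char '-'
   -- named exponentPart because Prelude already exports 'exponent'
   exponentPart = do
     char 'E'
     optional sign
     xs <- many1 digit
     if length xs > 3 then fail "exponent too long" else return ()
   number = do
     optional sign
     many1 digit
     optional (do char '.'; many1 digit)
     optional exponentPart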
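For anyone who wants to actually run the factored version, here is a minimal sketch. It assumes the combinators are Parsec's (Text.Parsec). Parsec's `option` takes a default value as its first argument, so the sketch uses `optional`, which just discards the result. The '.' and the digits after it are grouped so that they are optional together. The name `exponentPart` and the `matches` helper are my own, purely for illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- A leading '+' or '-'.
sign :: Parser Char
sign = char '+' <|> char '-'

-- 'E', an optional sign, then one to three digits.
-- (Called exponentPart because Prelude already exports 'exponent'.)
exponentPart :: Parser ()
exponentPart = do
  _ <- char 'E'
  optional sign
  xs <- many1 digit
  if length xs > 3
    then fail "exponent has more than three digits"
    else return ()

-- Optional sign, digits, optional fraction, optional exponent.
number :: Parser ()
number = do
  optional sign
  _ <- many1 digit
  optional (char '.' >> many1 digit)
  optional exponentPart

-- Hypothetical helper: True if the whole input is a valid number.
matches :: String -> Bool
matches s = either (const False) (const True) (parse (number <* eof) "" s)

main :: IO ()
main = mapM_ (print . matches) ["-12.5E+10", "42", "1.", "3E1234"]
```

The first two inputs should be accepted and the last two rejected: "1." has a decimal point with no digits after it, and "3E1234" has a four-digit exponent.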
You can also do things like write a function that builds a special kind 
of parser given a simpler spec.
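As a rough illustration of that last point, here is a sketch of a function that takes a minimum and maximum digit count and hands back a parser for it. This again assumes Parsec; `digitsBetween` and `exponentPart` are made-up names, not anything the library provides.

```haskell
import Data.Either (isRight)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical parser builder: accepts a run of digits whose
-- length is between lo and hi (inclusive), rejects anything else.
digitsBetween :: Int -> Int -> Parser String
digitsBetween lo hi = do
  xs <- many digit
  let n = length xs
  if n < lo || n > hi
    then fail ("expected " ++ show lo ++ " to " ++ show hi ++ " digits")
    else return xs

sign :: Parser Char
sign = char '+' <|> char '-'

-- The exponent from the number grammar, built from the helper above.
exponentPart :: Parser String
exponentPart = char 'E' >> optional sign >> digitsBetween 1 3

main :: IO ()
main = print (map (isRight . parse (exponentPart <* eof) "") ["E-12", "E1234"])
```

The same `digitsBetween` can be reused anywhere a bounded run of digits is needed, which is the sort of thing `{1,3}` does in the regex, except that here it is an ordinary function you can name, test and pass around.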
With a regex, on the other hand, you cannot statically guarantee 
that a given string is even a syntactically valid regex. And that's 
before you try to programmatically construct new ones. :-P