Invisible <voi### [at] devnull> wrote:
> Anything that isn't parsable as a number is a name. Therefore,
> "0." -> real
> ".0" -> real
> "." -> name
> "1.1" -> real
> "1.1.1" -> name
> "1e1" -> real
> "1x1" -> name
> "s1" -> name
> "1s" -> name
> Will the insanity never end?? >_<
> Good luck writing a parser that can untangle all of that... :-(
I really can't see the problem. When the input contains a sequence of
valid characters (i.e. characters which can form a real or a name), if this
sequence has the form:
^[+-]?([0-9]+\.?|[0-9]*\.[0-9]+)(e[+-]?([0-9]+\.?|[0-9]*\.[0-9]+))?$
then it's a real, else it's a name.
If we translate that regexp to plain English, it means:
- There's nothing before this pattern (which is what the ^ at the beginning
means), and nothing after it (which is what the $ at the end means).
- The sequence optionally starts with a + or a -.
- After that, one of two possible patterns must appear (the expression in
parentheses, where the two patterns are separated with the | symbol):
  - A sequence of one or more digits ([0-9]+), optionally followed by
  the dot character (a plain "." has a special meaning in regexps, so the
  dot character has to be escaped, and thus written as "\.").
  - A sequence of zero or more digits ([0-9]*) followed by a dot character
  followed by a sequence of one or more digits.
- Optionally the character "e" can follow, and if that's the case, a real
(not containing an "e") must follow as well (the whole last part in
parentheses, with the "?" at the end to indicate optionality).
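To check the pattern against the examples quoted at the top, here is a minimal sketch in Python. The regexp is copied unchanged from above; the names REAL_RE and classify are made up for this illustration, and the token is assumed to be already split off from the input, with no surrounding whitespace.

```python
import re

# The pattern from the post, unchanged.
REAL_RE = re.compile(
    r"^[+-]?([0-9]+\.?|[0-9]*\.[0-9]+)(e[+-]?([0-9]+\.?|[0-9]*\.[0-9]+))?$"
)

def classify(token: str) -> str:
    """Return "real" if the token matches the pattern, else "name".

    Assumes the token contains no whitespace or newline characters.
    """
    return "real" if REAL_RE.match(token) else "name"

if __name__ == "__main__":
    # The nine tokens listed in the quoted message.
    for tok in ["0.", ".0", ".", "1.1", "1.1.1", "1e1", "1x1", "s1", "1s"]:
        print(f'"{tok}" -> {classify(tok)}')
```

With this sketch, all nine quoted tokens come out the way the quoted list says they should.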
In an actual BNF-style parser the rule probably becomes simpler because
the repetition can be removed.
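As a rough illustration of that last point, and assuming "the repetition" means the sub-pattern that appears twice in the regexp, the same rule can be written with that part defined once and reused, much like a nonterminal in a grammar. The names UNSIGNED, SIGNED, REAL and is_real are hypothetical, chosen only for this sketch.

```python
import re

# Defined once: digits with an optional trailing dot, or optional digits,
# a dot, and at least one digit.
UNSIGNED = r"(?:[0-9]+\.?|[0-9]*\.[0-9]+)"

# An optional sign in front of the unsigned form.
SIGNED = rf"[+-]?{UNSIGNED}"

# A signed number, optionally followed by "e" and another signed number.
REAL = re.compile(rf"{SIGNED}(?:e{SIGNED})?")

def is_real(token: str) -> bool:
    """True if the whole token has the form of a real, else False."""
    # fullmatch plays the role of the ^ and $ anchors in the original pattern.
    return REAL.fullmatch(token) is not None
```

This is meant to accept the same strings as the single-line pattern; only the way the rule is written down changes.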
--
- Warp