POV-Ray : Newsgroups : povray.off-topic : Really strange design choices Server Time
6 Sep 2024 19:20:18 EDT (-0400)
  Really strange design choices (Message 18 to 27 of 37)  
From: Warp
Subject: Re: Trixy
Date: 18 Dec 2008 11:21:03
Message: <494a786f@news.povray.org>
Invisible <voi### [at] devnull> wrote:
> Anything that isn't parsable as a number is a name. Therefore,

> "0."    -> real
> ".0"    -> real
> "."     -> name
> "1.1"   -> real
> "1.1.1" -> name
> "1e1"   -> real
> "1x1"   -> name
> "s1"    -> name
> "1s"    -> name

> Will the insanity never end?? >_<

> Good luck writing a parser that can untangle all of that... :-(

  I really can't see the problem. When the input contains a sequence of
valid characters (i.e. ones which can form a real or a name), if this
sequence has the form:

    ^[+-]?([0-9]+\.?|[0-9]*\.[0-9]+)(e[+-]?([0-9]+\.?|[0-9]*\.[0-9]+))?$

then it's a real, else it's a name.

  If we translate that regexp to plain English, it means:

- There's nothing before this pattern (which is what the ^ at the beginning
  means), and nothing after it (which is what the $ at the end means).
- The sequence optionally starts with a + or a -.
- After that, one of two possible patterns must appear (the expression in
  parentheses, where the two patterns are separated with the | symbol):
  - A sequence of one or more digits ([0-9]+), optionally followed by
    the dot character (a plain "." has a special meaning in regexps, so the
    dot character has to be escaped, and thus written as "\.")
  - A sequence of zero or more digits ([0-9]*) followed by a dot character
    followed by a sequence of one or more digits.
- Optionally the character "e" can follow, and if that's the case, a real
  (not containing an "e") must follow as well (the whole last part in
  parentheses, with the "?" at the end to indicate optionality).
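
  For what it's worth, the rule is easy to try out. The following is only a
minimal sketch in Python (the names REAL, classify and TOKENS are made up
for illustration); it applies the pattern above, unchanged, to the tokens
quoted at the top of this post:

```python
import re

# The pattern from this post, copied verbatim.
REAL = re.compile(
    r"^[+-]?([0-9]+\.?|[0-9]*\.[0-9]+)(e[+-]?([0-9]+\.?|[0-9]*\.[0-9]+))?$"
)

def classify(token: str) -> str:
    """Return "real" if the whole token matches the pattern, else "name"."""
    return "real" if REAL.match(token) else "name"

# The example tokens quoted at the top of this post.
TOKENS = ["0.", ".0", ".", "1.1", "1.1.1", "1e1", "1x1", "s1", "1s"]

if __name__ == "__main__":
    for tok in TOKENS:
        print(f'"{tok}" -> {classify(tok)}')
```

  With this pattern the classification agrees with the quoted list for all
nine tokens.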

  In an actual BNF-style parser the rule probably becomes simpler because
the repetition can be removed.

-- 
                                                          - Warp



From: Invisible
Subject: Re: Trixy
Date: 18 Dec 2008 11:36:58
Message: <494a7c2a@news.povray.org>
Warp wrote:
>> Good luck writing a parser that can untangle all of that... :-(
> 
>   I really can't see the problem. When the input contains a sequence of
> valid characters (i.e. ones which can form a real or a name), if this
> sequence has the form:
> 
>     ^[+-]?([0-9]+\.?|[0-9]*\.[0-9]+)(e[+-]?([0-9]+\.?|[0-9]*\.[0-9]+))?$
> 
> then it's a real, else it's a name.

Yeah. As I said, I'm currently changing my design from one that attempts 
to recognise and delimit numbers to one that just chops the text into 
chunks, and *then* decides what kind of thing each chunk is.
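
Roughly, the two-phase idea looks like the sketch below. This is Python 
rather than my actual code, and it makes big simplifying assumptions: chunks 
are separated by whitespace only (no delimiters, strings or comments), and 
the number rules are my reading of the manual, not a complete statement of 
it. All the names are invented for the example.

```python
import re

# Assumed number formats: an optionally signed integer, or a real with an
# optional integer exponent introduced by "e" or "E".
INTEGER = re.compile(r"[+-]?[0-9]+")
REAL = re.compile(r"[+-]?([0-9]+\.?|[0-9]*\.[0-9]+)([eE][+-]?[0-9]+)?")

def split_chunks(text: str) -> list[str]:
    """Phase 1: chop the text into chunks (here: on whitespace only)."""
    return text.split()

def classify(chunk: str) -> str:
    """Phase 2: decide what kind of thing a single chunk is."""
    if INTEGER.fullmatch(chunk):
        return "integer"
    if REAL.fullmatch(chunk):
        return "real"
    # Anything that isn't parsable as a number is a name.
    return "name"

def tokenize(text: str) -> list[tuple[str, str]]:
    """Run both phases and pair each chunk with its kind."""
    return [(chunk, classify(chunk)) for chunk in split_chunks(text)]
```

The point of the design is that phase 1 never needs to know what a number 
looks like; only phase 2 does.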

>   If we translate that regexp to plain English, it means:
> 
> - There's nothing before this pattern (which is what the ^ at the beginning
>   means), and nothing after it (which is what the $ at the end means).
> - The sequence optionally starts with a + or a -.
> - After that, one of two possible patterns must appear (the expression in
>   parentheses, where the two patterns are separated with the | symbol):
>   - A sequence of one or more digits ([0-9]+), optionally followed by
>     the dot character (a plain "." has a special meaning in regexps, so the
>     dot character has to be escaped, and thus written as "\.")
>   - A sequence of zero or more digits ([0-9]*) followed by a dot character
>     followed by a sequence of one or more digits.
> - Optionally the character "e" can follow, and if that's the case, a real
>   (not containing an "e") must follow as well (the whole last part in
>   parentheses, with the "?" at the end to indicate optionality).

...which would be incorrect then, for at least the following reasons:

- There can be *zero* or more characters before the decimal point. (But 
notice that there must be more than zero characters *in total* before 
and after the decimal point. It's just that there can be zero in either 
place, but not both.)

- The "e" can also be "E".

- The exponent is an integer, not a real.

See? Not as easy as it looks, is it? Gotta pay careful attention to 
*exactly* what the manual says is and isn't permissible.

>   In an actual BNF-style parser the rule probably becomes simpler because
> the repetition can be removed.

It would be *really nice* if the reference manual included a BNF syntax 
diagram... :-S

Between the rules for splitting up tokens, the tricky rules for escaping 
things in strings, and characters that have multiple meanings depending 
on context, it's really quite hard!

E.g., "<" is the start of a string, "<~" is the start of another string, 
and "<<" is an ordinary name object. Go figure. Similarly, "[" is 
classified as a "delimiter character", yet it's also the *name* of an 
operator (and a name is a sequence of "regular characters" - that is, 
can't contain delimiters).

It all gets confusing very fast...
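
For the "<" case at least, one character of lookahead seems to be enough. 
A rough sketch in Python (the function name and the returned labels are 
just made up for illustration; "<...>" is a hex string and "<~...~>" is an 
ASCII85 string):

```python
def classify_open_angle(text: str, pos: int) -> str:
    """Decide what a "<" at text[pos] starts, using one character of lookahead."""
    if text[pos] != "<":
        raise ValueError("expected '<' at the given position")
    # Slicing (rather than indexing) yields "" if "<" is the last character.
    following = text[pos + 1 : pos + 2]
    if following == "<":
        return "name"            # "<<" is an ordinary name object
    if following == "~":
        return "ascii85 string"  # "<~" starts an ASCII85-encoded string
    return "hex string"          # a lone "<" starts a hex string
```

So the tokenizer has to peek before it commits, which is exactly the kind 
of special case that makes the splitting rules fiddly.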



From: Eero Ahonen
Subject: Re: Really strange design choices
Date: 18 Dec 2008 11:55:24
Message: <494a807c@news.povray.org>
Orchid XP v8 wrote:
> 
> Why...why...WHY...why would they do this? o_O
> 

Why not? After all, it's possible.

-Aero



From: Invisible
Subject: Re: Really strange design choices
Date: 18 Dec 2008 11:56:40
Message: <494a80c8@news.povray.org>
Orchid XP v8 wrote:

> Why...why...WHY...why would they do this? o_O

I wonder if anybody will figure out where this is quoted from...



From: Mike Raiford
Subject: Re: Trixy
Date: 18 Dec 2008 12:06:11
Message: <494a8303$1@news.povray.org>
Invisible wrote:

> - There can be *zero* or more characters before the decimal point. (But 
> notice that there must be more than zero characters *in total* before 
> and after the decimal point. It's just that there can be zero in either 
> place, but not both.)

A quickie C# example ...

Code:


     using System;
     using System.Text.RegularExpressions;

     class Program
     {
         // Sample tokens from earlier in the thread.
         static string[] tokens = new string[] {"0.", ".0", ".", "1.1", "1.1.1", "1e1", "1x1", "s1", "1s"};

         static void Main(string[] args)
         {
             // Optional sign; then either digits with an optional fractional
             // part, or a dot followed by digits; then an optional exponent.
             Regex rx = new Regex(@"^[+-]?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(e[0-9]+)?$");

             foreach (string t in tokens)
             {
                 if (rx.IsMatch(t))
                 {
                     Console.WriteLine("\"{0}\" --> Real", t);
                 }
                 else
                 {
                     Console.WriteLine("\"{0}\" --> Identifier", t);
                 }
             }

             // Keep the console window open until a key is pressed.
             Console.ReadKey();
         }
     }


Output:

"0." --> Real
".0" --> Real
"." --> Identifier
"1.1" --> Real
"1.1.1" --> Identifier
"1e1" --> Real
"1x1" --> Identifier
"s1" --> Identifier
"1s" --> Identifier

The relevant portion is this regex:

^[+-]?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(e[0-9]+)?$

Which does what Warp describes, but also takes care of the case with .0 
and 0.

> - The "e" can also be "E".

You can easily replace the e for the exponent with [eE], which allows 
either lower or upper case. :)

> - The exponent is an integer, not a real.
> 
> See? Not as easy as it looks, is it? Gotta pay careful attention to 
> *exactly* what the manual says is and isn't permissible.
> 
>>   In an actual BNF-style parser the rule probably becomes simpler because
>> the repetition can be removed.
> 
> It would be *really nice* if the reference manual included a BNF syntax 
> diagram... :-S
> 
> Between the rules for splitting up tokens, the tricky rules for escaping 
> things in strings, and characters that have multiple meanings depending 
> on context, it's really quite hard!
> 
> E.g., "<" is the start of a string, "<~" is the start of another string, 
> and "<<" is an ordinary name object. Go figure. Similarly, "[" is 
> classified as a "delimiter character", yet it's also the *name* of an 
> operator (and a name is a sequence of "regular characters" - that is, 
> can't contain delimiters).
> 
> It all gets confusing very fast...


-- 
~Mike



From: Warp
Subject: Re: Trixy
Date: 18 Dec 2008 12:22:16
Message: <494a86c7@news.povray.org>
Invisible <voi### [at] devnull> wrote:
> >     ^[+-]?([0-9]+\.?|[0-9]*\.[0-9]+)(e[+-]?([0-9]+\.?|[0-9]*\.[0-9]+))?$

> ...which would be incorrect then, for at least the following reasons:

> - There can be *zero* or more characters before the decimal point.

  Which is exactly what "[0-9]*\." means: Zero or more digits, followed
by the dot character.

> - The "e" can also be "E".

  That's easy to fix: Replace "e" in the regexp with "[eE]".

> - The exponent is an integer, not a real.

  Then it becomes simpler:

    ^[+-]?([0-9]+\.?|[0-9]*\.[0-9]+)([eE][+-]?[0-9]+)?$

  In fact, that pattern actually matches the floating point number
representation in C and C++ (except perhaps for the possibility of a
preceding +).
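
  To see what the change actually affects, both patterns can be run over a
few tokens. This is just a sketch in Python; the names OLD, NEW, SAMPLES,
kind and differences are invented for the example, and the sample tokens
are my own picks:

```python
import re

# The pattern from my earlier post (exponent is a real, lowercase "e" only).
OLD = re.compile(
    r"^[+-]?([0-9]+\.?|[0-9]*\.[0-9]+)(e[+-]?([0-9]+\.?|[0-9]*\.[0-9]+))?$"
)
# The simplified pattern above (integer exponent, "e" or "E").
NEW = re.compile(r"^[+-]?([0-9]+\.?|[0-9]*\.[0-9]+)([eE][+-]?[0-9]+)?$")

def kind(pattern: re.Pattern, token: str) -> str:
    """Classify a token as "real" or "name" under the given pattern."""
    return "real" if pattern.match(token) else "name"

def differences(tokens: list[str]) -> list[tuple[str, str, str]]:
    """Return (token, old kind, new kind) for tokens the patterns disagree on."""
    return [
        (t, kind(OLD, t), kind(NEW, t))
        for t in tokens
        if kind(OLD, t) != kind(NEW, t)
    ]

SAMPLES = ["0.", ".0", ".", "1e1", "1E1", "1e1.5", "1e-3", "1.1.1"]

if __name__ == "__main__":
    for token, old, new in differences(SAMPLES):
        print(f'"{token}": {old} -> {new}')
```

  Of these samples only "1E1" (now a real) and "1e1.5" (now a name) change
classification.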

> See? Not as easy as it looks, is it?

  It was not a question of difficulty. It was a question of me not knowing
the exact format of floating point numbers in PostScript and making
assumptions.

-- 
                                                          - Warp



From: Darren New
Subject: Re: Trixy
Date: 18 Dec 2008 12:43:13
Message: <494a8bb1$1@news.povray.org>
Invisible wrote:
> There is also a whole bunch of "interesting" rules about how token 
> parsing works.

Sounds like a state machine to me. AKA a regular expression.

Sounds like your parser is too complex.

-- 
   Darren New, San Diego CA, USA (PST)
   The NFL should go international. I'd pay to
   see the Detroit Lions vs the Roman Catholics.



From: Warp
Subject: Re: Trixy
Date: 18 Dec 2008 14:02:42
Message: <494a9e52@news.povray.org>
Darren New <dne### [at] sanrrcom> wrote:
> Sounds like a state machine to me. AKA a regular expression.

  Any regular expression can be converted into a state machine, but can
every state machine be converted into a regular expression? Is there a
one-to-one relation?

-- 
                                                          - Warp



From: Darren New
Subject: Re: Trixy
Date: 18 Dec 2008 14:41:36
Message: <494aa770$1@news.povray.org>
Warp wrote:
>   Any regular expression can be converted into a state machine, but can
> every state machine be converted into a regular expression? Is there a
> one-to-one relation?

In the original definitions of those two terms, yes. Obviously, by the time 
you get to Perl regular expressions, they're no longer "regular" in the same 
sense. If you stick to regular expression algebra as invented by Kleene, 
they're equivalent specifications of the same languages. (One can wind up 
exponentially larger than the other, mind.)
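
As a concrete case, here is a hand-written state machine for the real-number 
pattern from earlier in the thread, next to the regex it is meant to be 
equivalent to. It's only a sketch in Python, and the state names and helper 
functions are made up for illustration.

```python
import re

# The real-number pattern from earlier in the thread.
REAL = re.compile(r"[+-]?([0-9]+\.?|[0-9]*\.[0-9]+)([eE][+-]?[0-9]+)?")

def char_class(ch: str) -> str:
    """Map a character to the input symbol the state machine cares about."""
    if ch in "0123456789":
        return "digit"
    if ch in "+-":
        return "sign"
    if ch == ".":
        return "dot"
    if ch in "eE":
        return "exp"
    return "other"

# (state, input symbol) -> next state; a missing entry means "reject".
TRANSITIONS = {
    ("start", "sign"): "signed",
    ("start", "digit"): "int",
    ("start", "dot"): "dot",
    ("signed", "digit"): "int",
    ("signed", "dot"): "dot",
    ("int", "digit"): "int",
    ("int", "dot"): "frac",
    ("int", "exp"): "exp_mark",
    ("dot", "digit"): "frac",
    ("frac", "digit"): "frac",
    ("frac", "exp"): "exp_mark",
    ("exp_mark", "sign"): "exp_sign",
    ("exp_mark", "digit"): "exp_digits",
    ("exp_sign", "digit"): "exp_digits",
    ("exp_digits", "digit"): "exp_digits",
}
ACCEPTING = {"int", "frac", "exp_digits"}

def dfa_is_real(token: str) -> bool:
    """Run the state machine over the token; accept only in a final state."""
    state = "start"
    for ch in token:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return False
    return state in ACCEPTING

def regex_is_real(token: str) -> bool:
    """The same decision made with the regular expression."""
    return REAL.fullmatch(token) is not None
```

Here the two forms are about the same size; the blow-up I mentioned only 
shows up for less friendly languages.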

-- 
   Darren New, San Diego CA, USA (PST)
   The NFL should go international. I'd pay to
   see the Detroit Lions vs the Roman Catholics.



From: Nicolas Alvarez
Subject: Re: Really strange design choices
Date: 18 Dec 2008 15:26:12
Message: <494ab1e4@news.povray.org>
Orchid XP v8 wrote:
> Eero Ahonen wrote:
> 
>> I assume you are already aware of this, but I'll provide a link anyways:
>> 
>> http://circle.ch/blog/p558.html
> 
> Why...why...WHY...why would they do this? o_O

Search povray.general for that guy who wanted to make a webserver in
POV SDL.

(I don't think it's possible without adding, for example, other I/O
facilities.)




Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.