Warp wrote:
> Invisible <voi### [at] devnull> wrote:
>> For example, PostScript involves "dictionaries". A dictionary stores
>> key/value pairs. The key will usually be a "name" object. However,
>> reading the small-print, I discover that a key can *actually* be any
>> possible PostScript object.
>
> Welcome to object-oriented programming... :P
LOL!
Maybe you'll appreciate this... Did you know that PostScript includes
the tail-recursion optimisation?
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
Invisible wrote:
> Other amusing edge cases include "/":
>
> - A name is usually executable; by preceding it with "/", it becomes
> literal.
>
> - The token "/" by itself (i.e., not preceding a name) is a valid
> (executable) name.
>
> Trixy Hobbitses!
Also fun is trying to write a correct number parser:
- ".0" and "0." are both real number objects (equal to 0.0).
- "." by itself is a name object.
- PostScript allows both "-" and "+" as sign prefixes (which is good).
Haskell does not, however (which is bad).
Basically, getting the parser to accept only valid PostScript numbers,
and return the correct value, is proving to be very difficult! :-(
This is what happens when you try to interface your program with
real-world systems; not everything is simple and mathematically elegant,
and it makes stuff much harder to code. I'm not giving up though! >:-D
Invisible wrote:
> Also fun is trying to write a correct number parser:
>
> - ".0" and "0." are both real number objects (equal to 0.0).
>
> - "." by itself is a name object.
>
> - PostScript allows both "-" and "+" as sign prefixes (which is good).
> Haskell does not, however (which is bad).
Now the joy of parsing strings.
Simple, right? RIGHT??
Ah, but you forget about *escape sequences*! >:-]
PostScript supports these:
- Strings are delimited with round brackets (parentheses), not quotes.
- Matched brackets work automatically. Unmatched ones must be escaped.
- \n, \t, etc.
- \\ is a backslash, \( and \) are brackets.
- \100 is an octal character code. (Oh what fun finding an octal
conversion function!)
- A backslash before a newline discards the newline character.
- Anything else preceded by a backslash is just itself.
Man my head hurts... Again, 2% of the functionality takes 90% of the
implementation effort! >_<
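
For what it's worth, the escape rules listed above can be sketched in a few lines. This is only an illustration in Python, not the parser discussed in this thread: the function name `decode_ps_string` is made up, it assumes the outer brackets have already been stripped (so bracket matching is not handled here), and it assumes the "etc." covers \r, \b and \f as well.

```python
def decode_ps_string(body: str) -> str:
    """Decode the text found between the outer ( and ) of a string."""
    # Single-character escapes; \r, \b and \f are assumed to be the "etc.".
    simple = {"n": "\n", "r": "\r", "t": "\t", "b": "\b", "f": "\f",
              "\\": "\\", "(": "(", ")": ")"}
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)              # ordinary character
            i += 1
            continue
        i += 1                          # skip the backslash
        if i >= len(body):              # trailing backslash: nothing follows
            break
        ch = body[i]
        if ch in simple:
            out.append(simple[ch])
            i += 1
        elif ch in "01234567":
            # Octal character code: one to three octal digits.
            j = i
            while j < len(body) and j < i + 3 and body[j] in "01234567":
                j += 1
            out.append(chr(int(body[i:j], 8) & 0xFF))
            i = j
        elif ch == "\n":
            i += 1                      # backslash-newline: drop the newline
        else:
            out.append(ch)              # anything else is just itself
            i += 1
    return "".join(out)
```

The octal case turns out to be the easy part once the digits are isolated, since `int(text, 8)` does the conversion.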
>> Other amusing edge cases include "/":
>>
>> - A name is usually executable; by preceding it with "/", it becomes
>> literal.
>>
>> - The token "/" by itself (i.e., not preceding a name) is a valid
>> (executable) name.
>>
>> Trixy Hobbitses!
>
> Also fun is trying to write a correct number parser:
>
> - ".0" and "0." are both real number objects (equal to 0.0).
>
> - "." by itself is a name object.
>
> - PostScript allows both "-" and "+" as sign prefixes (which is good).
> Haskell does not, however (which is bad).
Ah, but these interact!
Anything that isn't parsable as a number is a name. Therefore,
"0." -> real
".0" -> real
"." -> name
"1.1" -> real
"1.1.1" -> name
"1e1" -> real
"1x1" -> name
"s1" -> name
"1s" -> name
Will the insanity never end?? >_<
Good luck writing a parser that can untangle all of that... :-(
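
One way to untangle the table above is to treat "name" as the fallback: try to read the token as a number by hand, and give up as soon as anything doesn't fit. A minimal sketch in Python, assuming only the decimal forms shown above (no other numeric notations); `classify` and `_all_digits` are hypothetical names, not anything from a real interpreter.

```python
def _all_digits(s: str) -> bool:
    """True if s is non-empty and consists only of ASCII digits."""
    return s != "" and all(c in "0123456789" for c in s)


def classify(token: str) -> str:
    """Return 'integer', 'real' or 'name' for a single token."""
    # Optional sign prefix on the mantissa.
    body = token[1:] if token[:1] in ("+", "-") else token

    # Optional exponent part, e.g. "1e1" or "1.5E-3".
    pos = max(body.find("e"), body.find("E"))
    if pos >= 0:
        mantissa, exponent = body[:pos], body[pos + 1:]
        if exponent[:1] in ("+", "-"):
            exponent = exponent[1:]
        if not _all_digits(exponent):
            return "name"
    else:
        mantissa, exponent = body, None

    whole, dot, frac = mantissa.partition(".")
    if dot == "":
        # No decimal point: digits only, real if there was an exponent.
        if not _all_digits(whole):
            return "name"
        return "integer" if exponent is None else "real"

    # With a decimal point: digits on at least one side, nothing else.
    if whole == "" and frac == "":
        return "name"
    if (whole == "" or _all_digits(whole)) and (frac == "" or _all_digits(frac)):
        return "real"
    return "name"
```

Every failure path returns "name", which is what makes "1.1.1" and "1s" fall out without any special cases.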
> Anything that isn't parsable as a number is a name. Therefore,
>
> "0." -> real
> ".0" -> real
> "." -> name
> "1.1" -> real
> "1.1.1" -> name
> "1e1" -> real
> "1x1" -> name
> "s1" -> name
> "1s" -> name
>
> Will the insanity never end?? >_<
>
> Good luck writing a parser that can untangle all of that... :-(
http://xkcd.com/208/
scott wrote:
> http://xkcd.com/208/
Seriously... You gotta love the way this guy manages to draw stick
figures that have no facial expressions, yet you can tell *exactly* what
emotion they're having! o_O
Also... Yes, I am very, very glad I'm not writing a parser for regular
expressions. (My God, think of the massacre...!)
>> http://xkcd.com/208/
>
> Seriously... You gotta love the way this guy manages to draw stick figures
> that have no facial expressions, yet you can tell *exactly* what emotion
> they're having! o_O
>
> Also... Yes, I am very, very glad I'm not writing a parser for regular
> expressions. (My God, think of the massacre...!)
I meant using regular expressions to help in your parser to decipher
numbers.
>> Also... Yes, I am very, very glad I'm not writing a parser for regular
>> expressions. (My God, think of the massacre...!)
>
> I meant using regular expressions to help in your parser to decipher
> numbers.
I fail to see how a pattern matching language is of help here...
(I already *have* a real parser construction toolkit. The *problem* is
that the rules I'm trying to puzzle out are quite complex - and not
fantastically well-documented.)
Still, sooner or later I'll reach this stage:
http://xkcd.com/349/
>> I meant using regular expressions to help in your parser to decipher
>> numbers.
>
> I fail to see how a pattern matching language is of help here...
Well it seemed from your example, a "number" is quite easily distinguished
from a non-number.
A number takes one of the four forms (where n is 1 or more digits):
n.n
n.
.n
n
And is optionally prefixed by a minus sign, and optionally suffixed by an
exponential term, which takes the form E or e followed by an optional minus
sign followed by one or more digits.
I would use regular expressions to decide if my string matched this form or
not, but maybe your language/library already has similar functions to do
that?
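
Something along these lines, as a rough sketch in Python. The pattern follows the description above literally (four mantissa forms, optional minus, optional exponent with optional minus), so it is only as complete as that description; `NUMBER` and `looks_like_number` are names made up for the example.

```python
import re

# Mantissa: n.n, n., .n or n; optional leading minus; optional exponent.
NUMBER = re.compile(r"-?(\d+\.\d+|\d+\.|\.\d+|\d+)([eE]-?\d+)?")


def looks_like_number(text: str) -> bool:
    """True if the whole string has the numeric form described above."""
    return NUMBER.fullmatch(text) is not None
```

Using `fullmatch` rather than `match` matters here: otherwise "1.1.1" would be accepted on the strength of its first three characters.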
>> I fail to see how a pattern matching language is of help here...
>
> Well it seemed from your example, a "number" is quite easily
> distinguished from a non-number.
Yeah, maybe.
> A number takes one of the four forms (where n is 1 or more digits):
>
> n.n
> n.
> .n
> n
>
> And is optionally prefixed by a minus sign, and optionally suffixed by
> an exponential term, which takes the form E or e followed by an optional
> minus sign followed by one or more digits.
This isn't quite correct.
- The optional sign prefix can also be "+" instead of "-" (in both the
mantissa and any exponent there might be).
- Numbers may also take the form "n#n" (radix notation: base#digits).
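
The "n#n" form is the one that doesn't fit the pattern quoted above at all. A hedged sketch of how it might be read, in Python: `parse_radix` is a hypothetical helper, and it assumes the base is a decimal number from 2 to 36, the digits are 0-9 then letters in either case, and no sign is involved.

```python
import string

# Digit characters in order of value: 0-9, then a-z for 10-35.
_DIGITS = string.digits + string.ascii_lowercase


def parse_radix(token: str):
    """Return the value of 'base#digits', or None if the token is not one."""
    base_text, sep, digits = token.partition("#")
    if not sep or not base_text or not digits:
        return None
    if any(c not in string.digits for c in base_text):
        return None
    base = int(base_text)
    if not 2 <= base <= 36:             # assumed valid range of bases
        return None
    value = 0
    for c in digits.lower():
        d = _DIGITS.find(c)
        if d < 0 or d >= base:          # not a digit in this base
            return None
        value = value * base + d
    return value
```

Returning None instead of raising keeps the "anything that isn't a number is a name" rule easy to apply: a failed radix parse just falls through to the next check.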
> I would use regular expressions to decide if my string matched this form
> or not, but maybe your language/library already has similar functions to
> do that?
Well, given that I already need to cut the string into bits anyway so I
can modify it so the number parser will accept it, I'm not sure this
buys me anything. (Haskell's number parser doesn't like "+" as a prefix,
doesn't like ".7" or "7." as a number, and so forth.)
There is also a whole bunch of "interesting" rules about how token
parsing works. A PostScript program can take an arbitrary text string
and ask the interpreter to parse one token from it. Page 703 of the
PostScript Language Reference Manual states the following facts:
- If the token read is a name object or a number object, and it is
followed by a whitespace character, that one whitespace character is
consumed.
- If the token ends with a delimiter that's part of the token, that
delimiter is consumed, and no other characters after it.
- If the token is terminated by a delimiter that marks the start of the
next token, that character is not consumed.
In other words, if you have "123 456" then the space is consumed, but if
you have "<123> 456" then the space is *not* consumed. Likewise, if you
have "123/abc" then the "/" is not consumed. However, "123abc" is a
single (name object) token.
Looking at all these facts, it appears that the interpreter actually
uses some simple rule to break the whole input stream into "tokens", and
then decides what kind of token each one is separately.
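
A rough sketch of what such a splitting rule might look like, in Python. This is guesswork built only from the three facts above, not how any real interpreter does it: the character sets are assumptions, `next_token` and `tokens` are hypothetical names, and everything except names, numbers and "<...>" strings is reduced to a single-character token.

```python
WHITESPACE = "\0\t\n\f\r "      # assumed set of whitespace characters
DELIMITERS = "()<>[]{}/%"       # assumed set of delimiter characters


def _is_regular(c: str) -> bool:
    return c not in WHITESPACE and c not in DELIMITERS


def next_token(text: str):
    """Split one raw token off the front. Returns (token, rest) or None."""
    i = 0
    while i < len(text) and text[i] in WHITESPACE:
        i += 1                              # skip leading whitespace
    if i == len(text):
        return None
    start = i
    if text[i] == "<":
        # The closing ">" is part of the token: consume it, nothing more.
        end = text.find(">", i)
        if end < 0:
            raise ValueError("unterminated <...> token")
        return text[start:end + 1], text[end + 1:]
    if text[i] == "/":
        i += 1                              # "/" starts a literal name
    elif not _is_regular(text[i]):
        return text[i], text[i + 1:]        # simplification: one-char token
    while i < len(text) and _is_regular(text[i]):
        i += 1
    token = text[start:i]
    # Name or number followed by whitespace: consume exactly one character.
    if i < len(text) and text[i] in WHITESPACE:
        i += 1
    # A following delimiter starts the next token and is left in place.
    return token, text[i:]


def tokens(text: str):
    """Yield raw tokens; deciding what each one *is* happens afterwards."""
    step = next_token(text)
    while step is not None:
        token, text = step
        yield token
        step = next_token(text)
```

With this, `next_token("123 456")` gives `("123", "456")`, `next_token("<123> 456")` gives `("<123>", " 456")`, and "123abc" comes back as one token, matching the cases above.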
I am now reimplementing my parser so that instead of trying to classify
and split the input at the same time, it splits it first, and only then
attempts to decide what it just read. I think this is probably how the
"real" PostScript interpreters work.