POV-Ray: Newsgroups: povray.programming: POV-Ray parser in Java: Re: POV-Ray parser in Java

From: Vadim Sytnikov
Date: 14 Jan 2003 09:42:35
Message: <3e2421db$1@news.povray.org>
"Thorsten Froehlich" <tho### [at] trfde> wrote:
>
> As we agreed on the scanner part already, here is another good point
> against
> Yacc.  Obviously its grammars are hard to maintain and there is hardly a
> worthwhile speed gain, otherwise the December 27th announcement on
> <http://gcc.gnu.org/> would not make sense!

Well, interesting announcement, indeed.

The grammars are as hard to maintain as formal language definitions are. I
mean, the accurate description of language *semantics*, not just syntax.
This is a very complicated matter... I, personally, started my compiler
construction education with "The Design and Construction of Compilers"
(Hunter, 1981), which was published just 4 years before the "Compilers:
Principles, Techniques, and Tools" (Aho, Sethi, Ullman; 1985). Just a 4-year
span, and yet there was a fundamental difference: the first book's author
had no idea how to describe language semantics (vs. syntax) at all.
There was a vague and virtually useless proposal of meta-grammars,
and that's all. The second book, by contrast, introduced annotated grammars
(grammars with attributes). Here, I refer to the "Red Dragon" edition; I
have no idea whether earlier editions had that as well. What I am trying to
say is that, to the best of my knowledge, attributed grammars are the only
way (discovered to date) to accurately represent language *semantics*.

The annotated (attributed) grammars are essentially what Yacc/Bison and
similar systems implement. The "attributes" (from the compiler construction
theory) are those "rules" enclosed in curly braces that you put into the
Yacc rules. So, as to maintaining language *definition*, Yacc rules are
unbeatable -- respective implementations are as close to theory as possible.
Consequently, there is a possibility to automatically generate highly
efficient language parsers, far more efficient than a casual compiler writer
may produce. Of course, that may not be the case with those Los Alamos guys.
In this respect, maintaining the POV-Ray grammar as a Yacc one still seems to
be highly desirable -- IMHO.
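
To make the "rules in curly braces" point concrete, here is a minimal sketch in Python rather than Yacc. It assumes a toy arithmetic grammar (not POV-Ray's), and every name in it is illustrative. Each production is paired with an action that computes the attribute of its left-hand side from the attributes of its right-hand side, which is roughly what `{ $$ = $1 + $3; }` does in a Yacc rule. The parse tree is built by hand here; a generated LR parser would run the same actions in the same bottom-up order as it reduces.

```python
# Toy attributed grammar for arithmetic (illustrative, not POV-Ray's grammar).
# Each production maps to a semantic action: given the attribute values of the
# right-hand-side symbols (v[0], v[1], ...), it returns the attribute of the
# left-hand side. This plays the role of `{ $$ = $1 + $3; }` in a Yacc rule.
ACTIONS = {
    "expr -> expr '+' term":   lambda v: v[0] + v[2],
    "expr -> term":            lambda v: v[0],
    "term -> term '*' factor": lambda v: v[0] * v[2],
    "term -> factor":          lambda v: v[0],
    "factor -> NUMBER":        lambda v: v[0],
    "factor -> '(' expr ')'":  lambda v: v[1],
}


def evaluate(node):
    """Compute a node's attribute after its children's attributes (post-order),
    which is the order in which a bottom-up parser performs its reductions."""
    if not isinstance(node, tuple):
        return node                       # leaf: the token's own value
    production, *children = node
    values = [evaluate(child) for child in children]
    return ACTIONS[production](values)


# Hand-built parse tree for: 2 + 3 * 4
tree = (
    "expr -> expr '+' term",
    ("expr -> term", ("term -> factor", ("factor -> NUMBER", 2))),
    "+",
    (
        "term -> term '*' factor",
        ("term -> factor", ("factor -> NUMBER", 3)),
        "*",
        ("factor -> NUMBER", 4),
    ),
)

print(evaluate(tree))  # 14
```

The grammar and its meaning sit side by side in one table, which is the maintenance argument made above: changing what a construct means is a change to one action, not to the control flow of the parser.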

One thing that used to always bother me a lot was the extreme difficulty of
maintaining the "context" in bottom-up analysis, i.e. in Yacc/Bison
grammars... But as soon as I got accustomed to the use of fake non-terminals
to track context, life became significantly easier. To me, that is still the
only noticeable weakness of real-life LR1/LALR1 grammars used in bottom-up
analysis. I bet those guys used *many* more hacks to make a working C++
LL1-based parser... Once you start to write a recursive-descent parser for a
real-world application, you quickly notice that you are forced to patch it
more and more, pass more and more information to and from those recursive
calls (I suspect that that is what "hand-crafted" stands for in that
announcement :-) -- in strict accordance with the theory, which states that
LL1 grammars are suitable for a much smaller set of languages than LALR1
(let alone LR1). Fortunately, as I noted earlier, this does not have such a
strong impact on POV-Ray -- due to its more or less linear nature. BTW, what
do you think: to what extent is the linear structure of POV-Ray SDL due to
its LL1-constrained parser? :-)
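
The "pass more and more information to and from those recursive calls" effect can be sketched with a small recursive-descent parser in Python. The grammar is a toy one, loosely inspired by SDL vector syntax and assumed purely for illustration; it is not how POV-Ray's parser is written. Here `>` either closes a vector or means "greater than", and the only way the expression routine can tell is an extra `in_vector` argument handed down through the calls.

```python
import re

# Toy grammar (an assumption for illustration, not POV-Ray's real grammar):
#   expr   := term (('<' | '>') term)*
#   term   := NUMBER | vector | '(' expr ')'
#   vector := '<' expr (',' expr)* '>'
TOKEN = re.compile(r"\s*(\d+|[<>,()])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def expr(self, in_vector):
        left = self.term()
        while self.peek() in ("<", ">"):
            if self.peek() == ">" and in_vector:
                break                     # '>' closes the enclosing vector
            op = self.take()
            left = (op, left, self.term())
        return left

    def term(self):
        tok = self.peek()
        if tok == "<":
            return self.vector()
        if tok == "(":
            self.take("(")
            inner = self.expr(in_vector=False)   # parentheses reset the context
            self.take(")")
            return inner
        if tok is not None and tok.isdigit():
            return int(self.take())
        raise SyntaxError(f"unexpected token {tok!r}")

    def vector(self):
        self.take("<")
        items = [self.expr(in_vector=True)]
        while self.peek() == ",":
            self.take(",")
            items.append(self.expr(in_vector=True))
        self.take(">")
        return ("vec", items)


def parse(text):
    parser = Parser(text)
    result = parser.expr(in_vector=False)
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input at token {parser.pos}")
    return result


print(parse("<1, (2 > 3)>"))  # ('vec', [1, ('>', 2, 3)])
```

One flag is harmless, but each further ambiguity tends to add another parameter or another piece of shared state, which is the patching described above.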

Well, back to practice... I sincerely think that a good profiling of
POV-Ray's *scanner* has to be done; I promise to do that (again) myself, one
day, when I have time. To date, speaking of parse time, the only thing that
*really* got in my way was parsing of really big meshes -- which obviously
cannot be due to the parser's design, only to its implementation.
Maybe you're right with your getc()/ungetc() observation. If that is indeed
so, the radical solution might as well include the implementation of 3DS
mesh readers (why not? we do use binary images, and there seems to be no
fundamental difference). Other (non-mesh) things just parse much longer
than I would have liked; in acceptable time, but much longer than they ought
to. *This* might as well be due to scanner design; so some profiling is due,
definitely.
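
As a rough idea of what such a profiling experiment could look like, here is a minimal Python sketch. It is not POV-Ray's scanner: the token rules, the sample input and all function names are assumptions made for illustration. It compares a one-character-at-a-time scanner with pushback (the getc()/ungetc() style) against a scanner that tokenizes a whole in-memory buffer, checks that both produce the same tokens, and times them.

```python
import io
import re
import timeit

NUMBER_CHARS = frozenset("0123456789.-")


def is_word_char(ch):
    return ch.isalnum() or ch == "_"


def scan_charwise(stream):
    """Scan one character at a time with a one-character pushback buffer,
    mimicking a getc()/ungetc() loop."""
    tokens = []
    pushed = ""

    def getc():
        nonlocal pushed
        if pushed:
            ch, pushed = pushed, ""
            return ch
        return stream.read(1)            # "" signals end of input

    while True:
        ch = getc()
        if ch == "":
            return tokens
        if ch.isspace():
            continue
        if ch in NUMBER_CHARS:
            accept = NUMBER_CHARS.__contains__
        elif ch.isalpha() or ch == "_":
            accept = is_word_char
        else:
            tokens.append(ch)            # single-character punctuation
            continue
        text = ch
        ch = getc()
        while ch != "" and accept(ch):
            text += ch
            ch = getc()
        pushed = ch                      # ungetc(): hand the lookahead back
        tokens.append(text)


# Same token rules, applied to a buffer that is already in memory.
TOKEN = re.compile(r"[A-Za-z_]\w*|[0-9.\-]+|\S")


def scan_buffered(text):
    return TOKEN.findall(text)


# Made-up mesh-like input (ASCII only), repeated to give the timer some work.
SAMPLE = "triangle { <0, 0, 0>, <1.5, 0, 0>, <0, -2, 0> }\n" * 2000

if __name__ == "__main__":
    assert scan_charwise(io.StringIO(SAMPLE)) == scan_buffered(SAMPLE)
    slow = timeit.timeit(lambda: scan_charwise(io.StringIO(SAMPLE)), number=5)
    fast = timeit.timeit(lambda: scan_buffered(SAMPLE), number=5)
    print(f"char-at-a-time: {slow:.3f}s   buffered: {fast:.3f}s")
```

The absolute numbers depend on the machine and the runtime, and say nothing about POV-Ray itself; the point is only that the two approaches can be measured side by side on identical input before deciding whether the scanner is worth redesigning.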