POV-Ray: Newsgroups: povray.programming: Why not generate parser with Bison & Flex?

From: Evan Powers
Subject: Why not generate parser with Bison & Flex?
Date: 29 Dec 1998 21:43:01
Message: <36899335.0@news.povray.org>

As an in-my-free-time project I'm writing a structured database system that
lets remote users log into a server then read and edit the database. I've
created my own mini language for storing the database, and have discovered
that I really don't want to write a parser for it. It would probably be kind
of fun, but hey, I don't have the time. Anyway, I discovered the UNIX
utilities Bison and Flex, and the man pages tell me these utilities will
generate a lexer and parser for me from a "source" file containing the
language grammar. I've read most of Bison's info file, and it looks like
these utilities would make writing even a parser for the C & C++ programming
languages easy.

Soon after, what pops into my head? Something a friend working on his own
custom version of POV told me... something about weird macros in the
parser.... Anyway, I downloaded the source and took a look, and discovered
that, indeed, the parser consisted of cryptic code and lots of funny macros.

The point:

Has anyone considered redoing POV's parser entirely using Bison and Flex?
Making a language grammar file, from what I know, would be child's play
compared to understanding and changing the existing code. Plus it would make
maintaining the language grammar (and thus the parser) ridiculously easy.
Both programs are part of GNU (they're free with source like POV), and I
know that ports exist for Win32 and MS-DOS as well as UNIX, so lack of tools
isn't an excuse. It might even speed parse times, since parsers and lexers
made by these tools supposedly run at several hundred thousand lines per
minute for even complex languages.

I'd almost be willing to undertake the task myself, and might eventually if
no one else does, except I have lots of other, less recreational, things to
do. It seems like such an elegant solution in terms of ease of maintenance
and comprehension; I hope the POV development team looks into it.

--Evan Powers
EPT### [at] aolcom

From: Thorsten Froehlich
Subject: Re: Why not generate parser with Bison & Flex?
Date: 30 Dec 1998 15:56:53
Message: <368a9395.0@news.povray.org>

In article <36899335.0@news.povray.org> , "Evan Powers" <ept### [at] aolcom> wrote:

>As an in-my-free-time project I'm writing a structured database system that
>lets remote users log into a server then read and edit the database. I've
>created my own mini language for storing the database, and have discovered
>that I really don't want to write a parser for it. It would probably be kind
>of fun, but hey, I don't have the time. Anyway, I discovered the UNIX
>utilities Bison and Flex, and the man pages tell me these utilities will
>generate a lexer and parser for me from a "source" file containing the
>language grammar. I've read most of Bison's info file, and it looks like
>these utilities would make writing even a parser for the C & C++ programming
>languages easy.

Note that these tools are designed for fixed, fully defined languages, not constantly
extended languages like the POV-Ray scene description language. Before I continue, please
note that I have never used Bison and Flex myself, though I have seen code generated by
them. Also, the parser is only a (minor) part: you still need the "compiler" part for the
language, and this goes for POV-Ray as well; see below.

>Soon after, what pops into my head? Something a friend working on his own
>custom version of POV told me... something about weird macros in the
>parser.... Anyway, I downloaded the source and took a look, and discovered
>that, indeed, the parser consisted of cryptic code and lots of funny macros.

Why are these macros funny or cryptic? Once you get used to them, they really ease
reading the parser code (I think):

#define EXPECT { int Exit_Flag; Exit_Flag = false; \
 while (!Exit_Flag) {Get_Token();  switch (Token.Token_Id) {
#define CASE(x) case x:
#define CASE2(x, y) case x: case y:
#define CASE3(x, y, z) case x: case y: case z:
#define CASE4(w, x, y, z) case w: case x: case y: case z:
#define CASE5(v, w, x, y, z) case v: case w: case x: case y: case z:
#define CASE6(u, v, w, x, y, z) case u: case v: case w: case x: case y: case z:
#define END_CASE break;
#define EXIT Exit_Flag = true;
#define OTHERWISE default:
#define END_EXPECT } } }
#define GET(x) Get_Token(); if (Token.Token_Id != x) Parse_Error (x);
#define ALLOW(x) Get_Token(); if (Token.Token_Id != x) Unget_Token();
#define UNGET Unget_Token();
#define CASE_FLOAT CASE2 (LEFT_PAREN_TOKEN, FLOAT_FUNCT_TOKEN)\
 CASE2 (PLUS_TOKEN, DASH_TOKEN) UNGET
#define CASE_VECTOR CASE2 (VECTOR_FUNCT_TOKEN,LEFT_ANGLE_TOKEN) \
 CASE2 (U_TOKEN,V_TOKEN) CASE_FLOAT
#define CASE_EXPRESS CASE_VECTOR
#define CASE_COLOUR CASE3 (COLOUR_TOKEN,COLOUR_KEY_TOKEN,COLOUR_ID_TOKEN) UNGET
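
To see how these macros read in practice, here is a minimal, self-contained sketch. The token ids, the array-backed Get_Token/Unget_Token stand-ins, and Count_Modifiers are hypothetical and exist only for illustration; only the macro bodies come from the listing above (with false/true written as 0/1 so it builds as plain C).

```c
#include <stddef.h>

/* Hypothetical token ids; the real ones live in the POV-Ray sources. */
enum { SCALE_TOKEN, ROTATE_TOKEN, RIGHT_CURLY_TOKEN };

/* Stand-in tokenizer state: a fixed token array instead of a scene file. */
static struct { int Token_Id; } Token;
static const int *Stream;   /* tokens being parsed */
static size_t Position;     /* index of the next token to read */

static void Get_Token(void)   { Token.Token_Id = Stream[Position++]; }
static void Unget_Token(void) { Position--; }

/* Subset of the macros quoted above. */
#define EXPECT { int Exit_Flag; Exit_Flag = 0; \
 while (!Exit_Flag) {Get_Token();  switch (Token.Token_Id) {
#define CASE(x) case x:
#define END_CASE break;
#define EXIT Exit_Flag = 1;
#define OTHERWISE default:
#define UNGET Unget_Token();
#define END_EXPECT } } }

/* Count the modifiers that precede the first unrecognized token.
   The token array must contain at least one unrecognized token. */
int Count_Modifiers(const int *tokens)
{
    int count = 0;

    Stream = tokens;
    Position = 0;

    EXPECT
        CASE (SCALE_TOKEN)
            count++;
        END_CASE

        CASE (ROTATE_TOKEN)
            count++;
        END_CASE

        OTHERWISE
            UNGET   /* leave the unknown token for the caller */
            EXIT
        END_CASE
    END_EXPECT

    return count;
}
```

Called on the tokens scale, rotate, scale, "}", Count_Modifiers returns 3 and leaves the closing brace unread, which is the "loop until something unexpected shows up" pattern the EXPECT block encodes.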

>The point:
>
>Has anyone considered redoing POV's parser entirely using Bison and Flex?
>Making a language grammar file, from what I know, would be child's play
>compared to understanding and changing the existing code. Plus it would make
>maintaining the language grammar (and thus the parser) ridiculously easy.

Well, making a grammar file does not write the code 'converting' the parser output
into POV-Ray's internal data structures, and this is exactly the part that the parser
code you find difficult to understand does very well, and even transparently to some
extent. However, it is true that the POV-Ray parser is not the absolute optimal design,
but it is more flexible and significantly easier to extend than Bison generated code (if
you want to do so by hand...).

>Both programs are part of GNU (they're free with source like POV), and I
>know that ports exist for Win32 and MS-DOS as well as UNIX, so lack of tools
>isn't an excuse. 

Yes, Bison and Flex are (always) available on Unix, but there are a lot of different
implementations (and versions) on different platforms. If someone on one platform
wants to extend the language and uses a different version of Bison and Flex, there will
be different code each time someone else generates the code! This would be very
confusing and would not work (easily) for a cross-platform application *and* development
like POV-Ray.

>It might even speed parse times, since parsers and lexers
>made by these tools supposedly run at several hundred thousand lines per
>minute for even complex languages.

Yes, the code they output is fast, but (very well) handwritten, specialized parsers
will most likely be much faster, even for languages like C++. It is just that handwritten
parsers for such languages are not as common because these languages are "closed"
(now), while the POV-Ray scene language is "open" (and always will be).  A parser for
C++, for example, only has to be generated once, and the compiler code itself does the
rest. For POV-Ray this would mean: run Flex and Bison, integrate the result with the
POV-Ray code, check that everything still works, add the code supporting the extension
to the language in the data setup code, recompile, and hope there will be no need for a
last-minute extension like the "material" syntax in 3.1... With the current, handwritten
parser you learn to use (or print out) the few(!!!) macros and make an extension like
the "material" statement with just a few lines of code to change and *no* further work.

>I'd almost be willing to undertake the task myself, and might eventually if
>no one else does, except I have lots of other, less recreational, things to
>do. It seems like such an elegant solution in terms of ease of maintenance
>and comprehension; I hope the POV development team looks into it.

Well, I think Thomas Baier (in the team now) has done some of this work some time
ago.

     Thorsten

From: Evan Powers
Subject: Re: Why not generate parser with Bison & Flex?
Date: 31 Dec 1998 18:13:51
Message: <368c0618.27900056@news.povray.org>

I've moved some of the quotes around, by the way.

On Wed, 30 Dec 1998 19:27:20 +0100, "Thorsten Froehlich"
<fro### [at] charliecnsiitedu> wrote:

>Note that these tools are designed for defined languages, not constantly extended
>languages like the POV-Ray scene description language, but before I continue, please
>note that I never used Bison and Flex myself, but I have seen code generated by them.

Sorry, you are mistaken. If you read up on the Bison grammar
description language (http://www.gnu.org/manual/bison/index.html) you
will realize that it is designed with a dynamic language definition in
mind. Adding new language features and changing existing ones is
simple.

As an example, here is a grammar file (from the Bison manual, with
minor modifications) specifying how an expression can be built from
subexpressions and operators. (From this file, Bison generates a
function that parses expressions, evaluates them, and prints the
result, assuming there is an accompanying lexer.)
	/* Infix notation four-function calculator */
	%{
		#define YYSTYPE double
		#include <math.h>
		#include <stdio.h>	/* for printf */
	%}

	%token NUM	/* a number */
	%left '-' '+'
	%left '*' '/'

	%%

	exp:	NUM		{ $$ = $1; }
		| exp '+' exp	{ $$ = $1 + $3; }
		| exp '-' exp	{ $$ = $1 - $3; }
		| exp '*' exp	{ $$ = $1 * $3; }
		| exp '/' exp	{ $$ = $1 / $3; }
		| exp '\n'	{ printf ("%.10g\n", $1); };
	<EOF>

Note that this grammar file fully defines the operator precedence and
associativity.
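
The grammar above assumes an accompanying lexer. As a rough sketch of what a hand-written one could look like: the function name calc_lex, its explicit cursor and value parameters, and the NUM value of 258 are assumptions made so the code can be tried without Bison output. A real Bison lexer is named yylex, takes no arguments in the default configuration, stores the number in yylval, and uses the NUM code that Bison generates.

```c
#include <ctype.h>
#include <stdlib.h>

/* Assumed token code; Bison would normally generate this definition. */
enum { NUM = 258 };

/*
 * Read one token starting at *cursor and advance *cursor past it.
 * Returns NUM (storing the number in *value), 0 at end of input,
 * or the character itself for operators and '\n'.
 */
int calc_lex(const char **cursor, double *value)
{
    const char *p = *cursor;

    /* Skip blanks and tabs, but not '\n': the grammar treats it as a token. */
    while (*p == ' ' || *p == '\t')
        p++;

    if (*p == '\0') {
        *cursor = p;
        return 0;
    }

    if (isdigit((unsigned char)*p) || *p == '.') {
        char *end;

        *value = strtod(p, &end);
        if (end != p) {
            *cursor = end;
            return NUM;
        }
        /* A lone '.' is not a number; fall through and return it as a character. */
    }

    *cursor = p + 1;
    return (unsigned char)*p;
}
```

Fed the text "1.5 + 2\n", successive calls yield NUM (1.5), '+', NUM (2), '\n', and finally 0.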

Say I want to extend the expression parser to include some additional
operators:
	/* Infix notation calculator, more robust */
	%{
		#define YYSTYPE double
		#include <math.h>	/* for pow */
		#include <stdio.h>	/* for printf */
	%}

	%token NUM	/* a number */
	%left '-' '+'
	%left '*' '/'
	%left NEG	/* negation--unary minus */
	%right '^'	/* exponentiation        */

	%%

	exp:	NUM			{ $$ = $1; }
		| exp '+' exp		{ $$ = $1 + $3; }
		| exp '-' exp		{ $$ = $1 - $3; }
		| exp '*' exp		{ $$ = $1 * $3; }
		| exp '/' exp		{ $$ = $1 / $3; }
		| '-' exp  %prec NEG	{ $$ = -$2; }
		| exp '^' exp		{ $$ = pow ($1, $3); }
		| '(' exp ')'		{ $$ = $2; }
		| exp '\n'		{ printf ("%.10g\n", $1); };
	<EOF>

I concede that this is not the best example; however, it tells enough
about the Bison grammar language to allow the extrapolation of how
modifications to the POV grammar could be made.

>And the parser is only a (minor) part, you still the "compiler" part for the language
>- and this goes for POV-Ray as well, see below.

>Well, making a grammar file does not write the code 'converting' the parser output
>into POV-Ray's internal data structures, and as this the part the parser code you see
>as difficult to understand does very well and even transparent to some extend.

It is important to realize that, as defined by Bison, only the code
within the POV parser specifying what language element should come next
is part of the parser; the rest of the code, the part that builds the
internal data structures, is *not*. If the POV parser were written
with Bison, this code would be a Bison action (such as "{$$ = $1 +
$3}" above) instead of being embedded within the parsing logic. Thus,
a Bison POV parser would be just as stand-alone as PARSE.C.

I grabbed the following code snippet out of PARSE.C from the POV
source archive. I have marked the lines that are actually part of the
parser with a '%'.
%	CASE (SCALE_TOKEN)
%		Parse_Scale_Vector (Local_Vector);
		Compute_Scaling_Transform(&Local_Trans, Local_Vector);
		for (Current=First; Current!=NULL; Current=Current->Sibling)
		{
			Scale_Object (Current, Local_Vector, &Local_Trans);
		}
%	END_CASE

Just as making a grammar file does not write the code building POV's
internal data structures, writing the following using the existing
parser macros doesn't either:
	CASE (SCALE_TOKEN)
		Parse_Scale_Vector (Local_Vector);
	END_CASE
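
One way to picture the separation being argued for: the data-structure work can sit in an ordinary C function that either front end calls, whether that is the CASE (SCALE_TOKEN) block or a Bison action. The sketch below is an illustration only. The Object type with a bare size[3] field and the name Apply_Scale are hypothetical stand-ins, far simpler than POV-Ray's real object and transform structures.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified object; not POV-Ray's real OBJECT type. */
typedef struct Object {
    double size[3];
    struct Object *Sibling;
} Object;

/* Semantic action: scale every object in a sibling list by the vector v. */
void Apply_Scale(Object *first, const double v[3])
{
    Object *current;
    int axis;

    for (current = first; current != NULL; current = current->Sibling)
        for (axis = 0; axis < 3; axis++)
            current->size[axis] *= v[axis];
}
```

With this split, the grammar side only has to recognize "scale" followed by a vector and then hand the list and the vector to Apply_Scale.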

>Why are these macros funny or cryptic - once you got used to them they really ease
>reading the parser code (I think)?

Bearing my above statements in mind, I believe the argument is about
which grammar description language is superior. I, for one, believe a
language separating the grammar definition and the interpretation code
is superior to one in which the two are intermixed. Furthermore, I
believe that defining a language according to its overall structure,
then the overall structure of its components, then the overall
structure of the components' components, and so on, makes more
intuitive sense and is easier to read and maintain than a definition
of a language based upon what can follow each particular fundamental
element.

Which of these makes more sense? (Which is more concise?)
A) Bison-style.
	sentence:
		subject action predicate
	subject:
		[noun-modifiers] noun
	action:
		[verb-modifiers] verb
	predicate:
		[direct-object]
		| indirect-object direct-object
		| direct-object [indirect-object]
B) Current POV parser style.
	sentence:
		[noun-modifiers] noun-modifiers-followers
	noun-modifiers-followers:
		noun noun-followers
	noun-followers:
		[verb-modifiers] verb-modifiers-followers
	verb-modifiers-followers:
		verb verb-followers
	verb-followers:
		[direct-object]
		| indirect-object io-followers
		| direct-object do-followers
	io-followers:
		direct-object
	do-followers:
		[indirect-object]

>However, it is true that the POV-Ray parser is not the absolute optimal design, but
>it is more flexible and significantly easier to extend than Bison generated code (if
>you want to do so by hand...).

You don't extend the Bison generated code; you extend the original
grammar definition file. (I will grant you that one would not want to
manually generate a parser from a Bison grammar definition file. But
then, that's what Bison is for!) Look again at the above grammar
definition methods. Which do you think is easier to extend? If you're
unsure, try modifying each definition to state that both a direct
object and an indirect object can consist of optional noun modifiers
followed by a noun.

>Yes, Bison and Flex are (always) available on Unix, there are a lot of different
>implementations (and version) on different platforms, and if someone on one platform
>wants to extend the language and uses a different version of Bison and Flex ther will
>be different code each time someone else generates the code!

GNU Bison generates the same code regardless of the host system type,
and is the de facto standard for parser generators. For the sake of
argument, however, I will assume that POV developers use several
versions of Bison.

With this assumption, your statement is true--but then, it doesn't
really matter. If someone wanted to extend the language, they would
modify the grammar definition file, *not* the generated parser. In
fact, I see no conceivable reason for anyone to even *look* at the
generated parser. (Not even someone debugging a Bison parser looks at
the generated parser code.) The file distributed in the POV source
archive would be the Bison grammar file, not the generated parser
file.

As for the fact that different versions generate different code:
All Bison-style parser generators use the same fundamental algorithm.
Even Yacc, Bison's ancestor, uses the same algorithm. The only
significant differences in the generated code are those caused by
whether the generated code is optimized for speed or size. I contend
that these differences are inconsequential, since no one actually
looks at the generated code anyway and since different command line
options can generate comparable code with different Bison versions.
Furthermore, the POV development team is unconcerned about the fact
that Borland, Watcom, Microsoft, and GNU compilers generate different
code from the same platform independent source file, and that, for
example, a LIBPNG compiled by Borland C cannot be used with a version
of POV compiled by GNU C. (I contend that there is no fundamental
difference between these two situations, and that they are therefore
comparable.)

>This would be very confusing and not work (easy) for a cross platform application
>*and* development like POV-Ray.

This is not true. A given Bison grammar file is platform independent,
since Bison generates platform independent ANSI C, provided the
user-supplied C code within its actions is platform independent, just
as a parser using the POV macros is platform independent provided the
code generating the internal data structures is platform independent.

>Yes, the code they output is fast, but (very well) handwritten, speciallized parsers
>will most likely be much faster even languages like C++, it is just that handwritten

Knowing what I do about the Bison algorithm, I very much doubt that a
handwritten parser written in the style of PARSE.C could in any way
compare in terms of speed. Read the chapters in the Bison manual about
the Bison algorithm to see what I mean. (The URL is above.)

>parsers for such languages are not as common because these languages are "closed"
>(now), while the POV-Ray scene language is "open" (and will always be).  While a

(Incidentally, I believe the C and C++ language standards are just as
"open" as the POV-Ray scene language. $18 per electronic copy of the
C++ language standard--price from www.nssn.org--doesn't sound like the
price of a closed standard to me. Besides, that explains why AT&T
isn't the only company who can legally sell a C or C++ compiler.)

>parser for example for C++ will only have to be generated once, and the compiler code

Haven't you heard of templates, exception handling, and RTTI (Run Time
Type Identification)? All are *very* recent additions to the C++
language standard.

>itself will do the rest, for POV-ray this will result in: Run Flex and Bison,
>integrate it with the POV-Ray code, check if everything still works, add the code
>supporting the extension to the language is the data setup code, recompile and hope
>there will not be the need for a last minute extension like the "material" syntax in

Granted, the initial conversion of PARSE.C into a Bison grammar file
will be considerable work. However, it only has to be done once, and
the resulting grammar file is more easily modified to include
additional language elements than the current macro-based system.

>3.1...for the current, handwritten parser you learn to use (or print them out) the
>few(!!!) macros and make an extension like the "material" statement with just a few
>lines of code to change and *no* further work.

I fail to understand how the work involved in modifying a Bison
grammar description file is greater than that involved in modifying
PARSE.C.

Besides, your position is like arguing that AT&T should never have
abandoned Lparse in favor of a direct C++ to assembler implementation
because Lparse was perfectly usable, even though a direct
implementation had significant advantages. (Yes, C++ was originally
implemented with a C preprocessor.)

>Well, I think Thomas Baier (in the team now) has done some of this work some time
>ago.

I would be interested in knowing more about this....

--Evan Powers
EPT### [at] aolcom

From: Ronald L Parker
Subject: Re: Why not generate parser with Bison & Flex?
Date: 1 Jan 1999 12:48:46
Message: <368f08d2.70060924@news.povray.org>

On Tue, 29 Dec 1998 21:48:08 -0500, "Evan Powers" <ept### [at] aolcom>
wrote:

>Has anyone considered redoing POV's parser entirely using Bison and Flex?
>Making a language grammar file, from what I know, would be child's play
>compared to understanding and changing the existing code. Plus it would make
>maintaining the language grammar (and thus the parser) ridiculously easy.

Maintaining and understanding the grammar is already ridiculously 
easy, or at least as easy as understanding the bison definition-file 
syntax.  The only thing that makes it slightly more difficult is that
there is little if any documentation of the current parser structure.

One thing that will get in the way of a regular grammar for the POV 
language is the #version directive, which can change the entire 
grammar, so you might need multiple grammars and a clean way to switch
between them.

However, the idea of rewriting the parser in flex/bison does have some
merit when considered against the background of Chris Young's recent 
posting to cgrr: everyone wants a standalone parser, but the current 
parser is both severely intermixed with the existing code and covered 
by the Team's solemn promises that it would never see commercial
misuse.  A new parser, especially one written by someone who was
unconcerned about commercial use (and thus covered by an entirely new
version of POVLEGAL), might not suffer from those constraints.

From: Thomas Baier
Subject: Re: Why not generate parser with Bison & Flex?
Date: 1 Jan 1999 13:12:51
Message: <368D1085.13BAC9CB@ibm.net>

Hi,

I wrote a POV 2.2 parser with Lex & Yacc some years ago. You can check
the results at my homepage:
http://ourworld.compuserve.com/homepages/thbaier. It is a pov to mdl (Moray)
converter.

Please keep in mind: POV 2.2 syntax was a little bit simpler than 3.1.
Anyway, the main problem with L&Y was a shift/reduce problem during comma
parsing.
There are a lot of expressions where you can put a comma but do not have to.

I would agree to an L&Y grammar, but the facts stand against it.

OK, there are solutions for getting an L&Y grammar and parser, but it does not have a
high priority. You would have to write a lot of converter code to translate old 3.1
files into the new syntax. A lot of work.

But if someone has an L&Y grammar plus code to translate old 3.1 syntax into new
syntax, no problem, we will discuss the solution.

Thomas Baier

From: Nieminen Mika
Subject: Re: Why not generate parser with Bison & Flex?
Date: 1 Jan 1999 14:33:33
Message: <368d230d.0@news.povray.org>

Thomas Baier <tho### [at] ibmnet> wrote:
: Please keep in mind: POV 2.2 syntax was a little bit simpler than 3.1.
: Anyway, the main problem with L&Y was a shift/reduce problem during comma
: parsing.
: There a lot of expression you could set a comma but you do not have to.




  I really like povray's syntax flexibility. For example, instead of having
to type:
  light_source { <1,2,3>, color rgb <1,1,1> }
you can type
  light_source { <1,2,3>, rgb <1,1,1> }
or
  light_source { <1,2,3>, <1,1,1> }
or
  light_source { <1,2,3>, 1 }
or
  light_source { <1,2,3> 1 }
or even
  light_source { <1,2,3>1 }

  Or another example: Instead of having to type:
  box { <-1,-1,-1>,<1,1,1> }
you can just type
  box { -1,1 }
or even
  box { -1 1 }
(although I don't really use this one)
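
For reference, one way a hand-written parser can accept both the comma and no-comma forms is the ALLOW macro quoted earlier in the thread. The sketch below is only an illustration: the token ids, the array-backed Get_Token/Unget_Token stand-ins, and Accepts_Pair are hypothetical, and it makes no claim about how POV-Ray itself parses a box.

```c
#include <stddef.h>

/* Hypothetical token ids for this illustration. */
enum { FLOAT_TOKEN, COMMA_TOKEN, END_TOKEN };

/* Stand-in tokenizer state backed by a fixed array. */
static struct { int Token_Id; } Token;
static const int *Stream;
static size_t Position;

static void Get_Token(void)   { Token.Token_Id = Stream[Position++]; }
static void Unget_Token(void) { Position--; }

/* ALLOW as quoted earlier: consume x if present, otherwise put the token back. */
#define ALLOW(x) Get_Token(); if (Token.Token_Id != x) Unget_Token();

/*
 * Returns 1 if tokens form FLOAT [COMMA] FLOAT END, else 0.
 * The token array must be terminated by END_TOKEN.
 */
int Accepts_Pair(const int *tokens)
{
    Stream = tokens;
    Position = 0;

    Get_Token();
    if (Token.Token_Id != FLOAT_TOKEN)
        return 0;

    ALLOW (COMMA_TOKEN)   /* the comma is optional */

    Get_Token();
    if (Token.Token_Id != FLOAT_TOKEN)
        return 0;

    Get_Token();
    return Token.Token_Id == END_TOKEN;
}
```

Both "-1,1" and "-1 1" style token sequences are accepted, while two commas in a row are rejected.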

  Don't ever change this to a less flexible syntax, please :)

-- 
main(i){char*_="BdsyFBThhHFBThhHFRz]NFTITQF|DJIFHQhhF";while(i=
*_++)for(;i>1;printf("%s",i-70?i&1?"[]":" ":(i=0,"\n")),i/=2);} /*- Warp. -*/

From: Ken
Subject: Re: Why not generate parser with Bison & Flex?
Date: 1 Jan 1999 16:38:28
Message: <368D3FEC.1E092439@pacbell.net>

Nieminen Mika wrote:

>   I really like povray's syntax flexibility. For example:
>   light_source { <1,2,3>1 }

     light_source{y*1 1} Tested to work :^ }

>   Don't ever change this to a less flexible syntax, please :)

Agreed !

--
Ken Tyler

tyl### [at] pacbellnet

From: Thomas Baier
Subject: Re: Why not generate parser with Bison & Flex?
Date: 2 Jan 1999 01:57:29
Message: <368DC3BC.48BAAB0B@ibm.net>

Hi,


>   I really like povray's syntax flexibility. For example, intead of having
> to type:
>   light_source { <1,2,3>, color rgb <1,1,1> }
> you can type
>   light_source { <1,2,3>, rgb <1,1,1> }
> or
>   light_source { <1,2,3>, <1,1,1> }
> or
>   light_source { <1,2,3>, 1 }
> or
>   light_source { <1,2,3> 1 }
> or even
>   light_source { <1,2,3>1 }
>

Well, it is no problem to support your examples with L&Y, but you would have to follow
stricter comma placement.

-tb

From: Thorsten Froehlich
Subject: Re: Why not generate parser with Bison & Flex?
Date: 4 Jan 1999 16:55:55
Message: <369138eb.0@news.povray.org>

Sorry for taking so long to reply...

In article <368c0618.27900056@news.povray.org> , EPT### [at] aolcom (Evan Powers)
wrote:
>Sorry, you are mistaken. If you read up on the Bison grammar
>description language (http://www.gnu.org/manual/bison/index.html) you
>will realize that it is designed with a dynamic language definition in
>mind. Adding new language features and changing existing ones is
>simplistic.
>
>It is important to realize that, as defined by Bison, only the code
>within the POV parser specifying what language element should come next
>is part of the parser; the rest of the code, the part that builds the
>internal data structures, is *not*. If the POV parser were written
>with Bison, this code would be a Bison action (such as "{$$ = $1 +
>$3}" above) instead of being embedded within the parsing logic. Thus,
>a Bison POV parser would be just as stand-alone as PARSE.C.

Well, I have not worked with it, so I can only comment on what I see. Your link was
very useful (I did not know that this page contains docs for most Unix tools...)
and after doing some reading, I think you are mostly right.
I am now very concerned about debugging the resulting code and about platform support. I
did some experiments with an old plug-in for my Mac compiler (no longer supported and not
updated for a long time). First I had to search for it for about half an hour
(most links to it were outdated...) and then remembered that I had a backup from when I
downloaded it some time ago. As a matter of fact, I am currently not able to find any
maintained Bison for the Mac at all. Of course I found a few dozen Win
and Unix implementation links during my search. Currently the POV-Ray source code
compiles and allows changes with every (supported) C compiler; nobody needs to search
for hard-to-find 3rd party tools for their platform (although Flex and Bison are free)
or get the source code and compile it themselves.
However, I managed to get an example file that came with it to work and generate a
compiling parser. After that I tried to make a minor change, and a typo (by
me) in some C interface code made me look into the parser file because Bison never
complained...
I realize that this is a beginner's mistake, but I have been a C programmer for about 8
years now and never had to use Flex or Bison, so I assume many people who want to
make some changes to the public POV-Ray source code don't know Bison or Flex at all
(if programming is just a hobby for them), and having to *learn* the Flex and Bison
syntax before being able to extend POV-Ray does not make it easier for them.

>Note that this grammar file fully defines the operator precedence and
>associativity.

I assumed it would :-)

>I concede that this is not the best example; however, it tells enough
>about the Bison grammar language to allow the extrapolation of how
>modifications to the POV grammar could be made.

Yes, and I got it.

>I grabbed the following code snippet out of PARSE.C from the POV
>source archive. I have marked the lines that are actually part of the
>parser with a '%'.
>% CASE (SCALE_TOKEN)
>%  Parse_Scale_Vector (Local_Vector);
>  Compute_Scaling_Transform(&Local_Trans, Local_Vector);
>  for (Current=First; Current!=NULL;
>Current=Current->Sibling)
>  {
>   Scale_Object (Current, Local_Vector,
>&Local_Trans);
>  }
>% END_CASE
>
>Just as making a grammar file does not write the code building POV's
>internal data structures, writing the following using the existing
>parser macros doesn't either:
> CASE (SCALE_TOKEN)
>  Parse_Scale_Vector (Local_Vector);
> END_CASE
>
>Bearing my above statements in mind, I believe the argument is about
>which grammar description language is superior. I, for one, believe a
>language separating the grammar definition and the interpretation code
>is superior to one in which the two are intermixed. Furthermore, I
>believe that defining a language according to its overall structure,
>then the overall structure of its components, then the overall
>structure of the components' components, and so on, makes more
>intuitive sense and is easier to read and maintain than a definition
>of a language based upon what can follow each particular fundamental
>element.

Yes, you are right that the parsing part does not build anything, but please pay
attention to this little detail:
Look at the lines you did not mark; what do they do?  They reduce the memory usage!
There can possibly be thousands of "scale" *commands* (they force POV-Ray to scale
the object, so they are a sort of command). If the parser code were composed only
of the lines you marked, you would have to store the "scale" commands and execute them
later (after parsing), which creates the need for higher memory usage. Now assume
200000 objects (common for some users) with only one scale command each. A scale
command requires a minimum of 24 bytes to store (three 8-byte floating point values, as
POV-Ray usually uses): you just created the need for an additional 4.8 MB of (temporary)
memory! What happens if e.g. an automatically generated scene (by a 3rd party
program) contains 200000 objects and five scale commands per object...? The only way
out of this would be to integrate the lines you did not mark into the parser code
again, and one of the advantages of the Bison parser would be lost again!
All these lines of code would have to be written inside the grammar file, wouldn't
they? And this would make the grammar file a total nightmare!!!  Or how would you
solve this problem? Maybe I am totally wrong!?!
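
The arithmetic behind these figures can be written out as a small sketch. The function name and constants are illustrative, and the count covers only the three stored components per command, ignoring any list or bookkeeping overhead a real implementation would add.

```c
/* Size assumptions taken from the post: three 8-byte floats per scale vector. */
enum { BYTES_PER_COMPONENT = 8, COMPONENTS_PER_SCALE = 3 };

/* Bytes needed to hold every deferred scale vector (payload only). */
unsigned long Deferred_Scale_Bytes(unsigned long objects,
                                   unsigned long scales_per_object)
{
    return objects * scales_per_object
           * COMPONENTS_PER_SCALE * BYTES_PER_COMPONENT;
}
```

For 200000 objects with one scale each this gives 4800000 bytes (4.8 MB); with five scales per object it gives 24000000 bytes (24 MB).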

>Which of these makes more sense? (Which is more concise?)
>A) Bison-style.
> sentence:
>  subject action predicate
> subject:
>  [noun-modifiers] noun
> action:
>  [verb-modifiers] verb
> predicate:
>  [direct-object]
>  | indirect-object direct-object
>  | direct-object [indirect-object]

Yes, the Bison approach is superior in hiding the complexity created by the io and
"command" code, but you need temporary storage for the data, and then you have to modify
it, store it somewhere else (in a different data structure) and then free the temporary
memory.

>B) Current POV parser style.
> sentence:
>  [noun-modifiers] noun-modifiers-followers
> noun-modifiers-followers:
>  noun noun-followers
> noun-followers:
>  [verb-modifiers] verb-modifiers-followers
> verb-modifiers-followers:
>  verb verb-followers
> verb-followers:
>  [direct-object]
>  | indirect-object io-followers
>  | direct-object do-followers
> io-followers:
>  direct-object
> do-followers:
>  [indirect-object]

Yes, the POV-Ray scene language contains not only data structures but also commands
(I (still) can't find a better word, it is what you called "do") how to modify this
data. This is a major difference to the common programming languages which allow easy
parsing from the top to the bottom, while POV-Ray is self modifying code of some kind
(or some kind of hidden preprocessor): If you parse C you can parse tokens like a
stream, nothing (OK, this is oversimplified) you parse later in the stream of tokens
will require you to *change* data you created earlier.
Take an expression tree generation parser as example (assuming we don't do the
parsing and code generation on the fly): You build the tree, node by node through
recursion which is no problem, you get code by just stepping through the tree later.
You write the expression code by parsing the tree in a different way (so you can get
reasonable assembler code perhaps).
Now look at this expression tree as some kind of more complex expression, a typical
POV-Ray scene object like a sphere: You parse the sphere and now you would usually
generate the output or store it to generate the output later. For this the Bison
parser would work *very* well and surely be faster than the POV-Ray parser, both just
generate the sphere object data structures and fill in the data. And now, there is
e.g. a scale command found!
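
To make the expression-tree case concrete, here is a minimal sketch in C. It assumes a toy grammar of single digits with '+' and '*'; every name in it is made up for illustration and none of it comes from POV-Ray or Bison. The tree is built node by node through recursion, and a separate walk later emits stack-machine code. Nothing parsed later ever changes a node that was built earlier.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy grammar: expr := term ('+' term)*, term := digit ('*' digit)* */
typedef struct Node {
    char op;                      /* '+', '*', or 0 for a leaf */
    int value;                    /* used only by leaves */
    struct Node *left, *right;
} Node;

static Node *new_node(char op, int value, Node *l, Node *r)
{
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->op = op; n->value = value; n->left = l; n->right = r;
    return n;
}

/* Reads one digit and advances the input pointer. */
static Node *parse_digit(const char **s)
{
    int v = **s - '0';
    ++*s;
    return new_node(0, v, NULL, NULL);
}

static Node *parse_term(const char **s)
{
    Node *n = parse_digit(s);
    while (**s == '*') {
        ++*s;
        n = new_node('*', 0, n, parse_digit(s));
    }
    return n;
}

static Node *parse_expr(const char **s)
{
    Node *n = parse_term(s);
    while (**s == '+') {
        ++*s;
        n = new_node('+', 0, n, parse_term(s));
    }
    return n;
}

/* Second pass: a post-order walk appends stack-machine code to out. */
static void emit(const Node *n, char *out, size_t cap)
{
    char buf[32];
    if (n->op == 0) {
        snprintf(buf, sizeof buf, "push %d;", n->value);
    } else {
        emit(n->left, out, cap);
        emit(n->right, out, cap);
        snprintf(buf, sizeof buf, "%s;", n->op == '+' ? "add" : "mul");
    }
    strncat(out, buf, cap - strlen(out) - 1);
}

static void free_tree(Node *n)
{
    if (n == NULL) return;
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}
```

Calling parse_expr on a string such as "1+2*3" and then emit on the result gives the code for the whole expression in one later pass, which is the situation where a generated parser is at its most comfortable.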

If you compare the Bison and the POV-Ray parser code you will see the following:

* In Bison parser code (or better, in the grammar file) you intended to keep the data
modifying code out! Now you have to *store* the commands (or actions) that change the
data. You will then go through all this data later, lookup each object, check if
there are scale commands for it and execute them. Then you free the commands data
structures.
* The current POV-Ray parser never had to store these commands and never used this
additional memory. It applied the scale command "on the fly".
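
The two styles in these bullets can be sketched in C roughly as follows. This is an illustration only: `Sphere`, `scale_now`, `defer_scale` and `run_deferred` are made-up names, the scale is uniform to keep the example short, and the deferred half models the case where the grammar actions really do nothing but record commands.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { double center[3]; double radius; } Sphere;   /* hypothetical object */

/* On-the-fly style: the command modifies the object as soon as it is parsed. */
static void scale_now(Sphere *s, double k)
{
    for (int i = 0; i < 3; i++)
        s->center[i] *= k;
    s->radius *= k;
}

/* Deferred style: each command is stored with its target and source line. */
typedef struct Command {
    Sphere *target;
    double factor;
    int line;
    struct Command *next;
} Command;

typedef struct { Command *head, *tail; size_t count; } CommandList;

static void defer_scale(CommandList *list, Sphere *s, double k, int line)
{
    Command *c = malloc(sizeof *c);          /* the extra memory per command */
    assert(c != NULL);
    c->target = s; c->factor = k; c->line = line; c->next = NULL;
    if (list->tail) list->tail->next = c; else list->head = c;
    list->tail = c;
    list->count++;
}

/* Second pass: execute every stored command, then free it.
   Returns the number of commands executed. */
static size_t run_deferred(CommandList *list)
{
    size_t done = 0;
    Command *c = list->head;
    while (c != NULL) {
        Command *next = c->next;
        scale_now(c->target, c->factor);
        free(c);
        c = next;
        done++;
    }
    list->head = list->tail = NULL;
    list->count = 0;
    return done;
}
```

Both paths end with the same object; the deferred one pays for one allocation per command plus the second pass.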

And here starts the next problem to surface: Error and warning handling.
If a user made a mistake and e.g. wrote "scale <0,0,0>" which makes no sense, the
POV-Ray parser will find this problem in place and have no problems to report it to
the user. This is especially useful because POV-Ray scene objects have no names like
functions in programming languages, there is no abstract location name, only a line
number which identifies the scene object!
Of course you will/should do this basic error checking in the parser code, but you
cannot find them all, assume some crazy user (or 3rd party program) would apply ten
"scale <0.00001,0.00001,0.00001>" commands to an object by accident. Each of these
scale commands is valid by itself, but doing them ten times you may get an underflow
and the scale would again be <0,0,0>!  If this happens in the first of 200000 objects
the POV-Ray parser will report this to the user immediately, but the Bison parser
cannot even know that something is wrong here!  In order to report the problem later
you would have to store the file and line of each of these scale commands - even more
memory would be needed!
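
A minimal sketch of the in-place check, in C. The names (`ScaleCmd`, `first_degenerate_line`) are hypothetical, and the accumulated scale is deliberately kept in single precision so that a run of tiny factors actually underflows after about ten commands; in double precision the same ten commands would still give a non-zero result. Because the check runs as each command is applied, the offending line number is available without storing anything extra.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float factor; int line; } ScaleCmd;   /* hypothetical command record */

/* Applies the commands in order to an accumulated uniform scale.
   Returns the line of the first command after which the accumulated
   scale is exactly zero, or 0 if the scale never degenerates. */
static int first_degenerate_line(const ScaleCmd *cmds, size_t n)
{
    /* volatile forces each product to be rounded to float before the test */
    volatile float total = 1.0f;
    for (size_t i = 0; i < n; i++) {
        assert(cmds[i].line > 0);
        total = total * cmds[i].factor;
        if (total == 0.0f)
            return cmds[i].line;
    }
    return 0;
}
```

Each single factor passes a simple "is it zero?" test; only the running product reveals the problem.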

>You don't extend the Bison generated code; you extend the original
>grammar definition file. (I will grant you that one would not want to
>manually generate a parser from a Bison grammar definition file. But
>then, that's what Bison is for!)

You did not get my whole point: Even a small change to the grammar and the whole
parser has to be recompiled, I don't want to build a parser from the grammar file,
just extend the parser code by hand.

>GNU Bison generates the same code regardless of the host system type,
>and is the de facto standard for parser generators. For the sake of
>argument, however, I will assume that POV developers use several
>versions of Bison.

If some older Bison has a bug, it may not be a problem on one platform, but another
platform would have it...this is a debugging and bug report nightmare - then even the
core code would be *different* (even if it's just a bit) for each platform. And don't
forget compiler bugs, etc! In an ideal situation, of course, this would not be a
problem, but unfortunately this isn't the case :-(

>With this assumption, your statement is true--but then, it doesn't
>really matter. If someone wanted to extend the language, they would
>modify the grammar definition file, *not* the generated parser. In
>fact, I see no conceivable reason for anyone to even *look* at the
>generated parser. (Not even someone debugging a Bison parser looks at
>the generated parser code.) The file distributed in the POV source
>archive would be the Bison grammar file, not the generated parser
>file.

This would require someone in the team to always generate one parser for everybody in
the team, the only way to make sure the bugs are out everywhere.  If this would not
be done, each platform's developers would also have to make sure that on their platform
the *whole* parser works as expected - and at least on the Mac we have other things
to do... :-)

>As for the fact that different versions generate different code:
>All Bison-style parser generators use the same fundamental algorithm.

Hmm, this isn't a good argument, ray tracers all use the "same fundamental algorithm"
as well - and the differences are quite *visible*.  However, it would be unfair  if I
would say this is the same case for Bison, the differences are surely less important,
but *bugs* are not.  And even the POV-Ray C source code has to work around several
compiler bugs.  And fixing them in the Bison source code is no reasonable solution.

>Even Yacc, Bison's ancestor, uses the same algorithm. The only
>significant differences in the generated code are those caused by
>whether the generated code is optimized for speed or size. I contend
>that these differences are inconsequential, since no one actually
>looks at the generated code anyway and since different command line
>options can generate comparable code with different Bison versions.

Yes, this is true. 

(Command line options are another problem: E.g. Macs don't have a command line (but
of course I can work around this) and getting the same options would still be
difficult on different platforms.)

>Furthermore, the POV development team is unconcerned about the fact
>that Borland, Watcom, Microsoft, and GNU compilers generate different
>code from the same platform independent source file, and that, for
>example, a LIBPNG compiled by Borland C cannot be used with a version
>of POV compiled by GNU C. (I contend that there is no fundamental
>difference between these two situations, and therefore are
>comparable.)

Yes, I totally agree that the resulting compiled code is no concern. (Just a small
note: On the Mac libraries and PPC code is exchangeable to some degree, and the same
would be possible everywhere else, but POV-Ray does of course not depend on it.)

>This is not true. A given Bison grammar file is platform independent,
>since Bison generates platform independent ANSI C, provided the
>user-supplied C code within its actions is platform independent, just
>as a parser using the POV macros is platform independent provided the
>code generating the internal data structures is platform independent.

You missed my point: Not the Bison grammar file is the problem, I referred to the
platform specific Bisons. However, I understand what you wanted to say.

>Knowing what I do about the Bison algorithm, I very much doubt that a
>handwritten parser written in the style of PARSE.C could in any way
>compare in terms of speed. Read the chapters in the Bison manual about
>the Bison algorithm to see what I mean. (The URL is above.)

Well, it is not a good idea to make any assumptions about speed without any experiments
with the actual POV-Ray grammar, I think now. However, please consider memory usage
and the time it requires later as well.   And I did not say that the handwritten
parser has to be in PARSE.C style.

>(Incidentally, I believe the C and C++ language standards are just as
>"open" as the POV-Ray scene language. $18 per electronic copy of the
>C++ language standard--price from www.nssn.org--doesn't sound like the
>price of a closed standard to me. Besides, that explains why AT&T
>isn't the only company who can legally sell a C or C++ compiler.)

Although this is off-topic, I have to inform you that you are wrong:
Is there any reason why a standard document has to be expensive? C++ is a standard; are
you sure you refer to the ISO 14882 standard document?  I am very convinced it is closed,
and the compiler companies (except MS of course) work hard to implement this standard.
And about the AT&T issue, I quote Stroustrup, The C++ Programming Language (3rd Ed.),
page 11: "AT&T Bell Laboratories made a major contribution ... by allowing ... to
share drafts ... and the base document for the ANSI C++ standardization effort. ...
In June 1991 ... C++ became part of an ISO (international) standardization effort ...
A formally approved international C++ standard is expected in 1998" (And it *was*
accepted in 1998!!!!!)

>Haven't you heard of templates, exception handling, and RTTI (Run Time
>Type Identification)? All are *very* recent additions to the C++
>language standard.

No, they are not all recent additions to C++, you will find the roots of RTTI and
templates in the now *ten* year old Annotated C++ Reference Manual, and all features
are (nearly) complete since 1995, and three years are a long time in computer
history!   To your "defense" I have to add that most compiler developers are very
slow implementing all these features.

>Granted, the initial conversion of PARSE.C into a Bison grammar file
>will be considerable work. However, it only has to be done once, and
>is more easily modified to include additional language elements than
>the current macro-based system.

Yes, macros are not a recommended programming practice today, but the modern
replacements in the form of templates are far superior, and currently there is no Bison
implementation that can use the C++ standard library (which includes the STL
(Standard Template Library)) features. And the STL offers a lot of abstraction, e.g.
the hash table templates or, of course, the very useful containers; no macro can
offer that.
Bison does not fit too well into this, and I cannot find a switch for the Bison I have
used which allows C++ code output, but of course this can change in the future! And
until then the C code is still compatible with C++.

>I fail to understand how the work involved in modifying a Bison
>grammar description file is greater than that involved in modifying
>PARSE.C.

You need to parse the grammar and only then you can parse the C code, and most modern
IDEs need plug-in Bisons to do this automatically - there is no old fashioned (but
very flexible) makefile!

>Besides, your position is like arguing that AT&T should never have
>abandoned Lparse in favor of a direct C++ to assembler implementation
>because Lparse was perfectly usable, even though a direct
>implementation had significant advantages. (Yes, C++ was originally
>implemented with a C preprocessor.)

I know, but I don't see why my argument is comparable to the C++ by preprocessor vs.
direct C++ issue which has been resolved over ten years ago as well, as far as I
know. 

Conclusion: I think Bison is a very interesting, well implemented and useful tool,
but my position is that it does not fit the POV-Ray needs as well as a handwritten
parser implementation could (not limited to the current one). Your position is
different and that's OK for me, so if you find the time to write a POV-Ray grammar and
implement a sample parser (you don't need to support all objects) and can show me
that I am wrong, then this would be great!

     Thorsten

PS: Some of my points are Macintosh related simply because I program on the Mac most
of the time and it is the system I know best.


From: Ronald L Parker
Subject: Re: Why not generate parser with Bison & Flex?
Date: 4 Jan 1999 21:50:25
Message: <36916860.164025814@news.povray.org>

On Mon, 04 Jan 1999 22:54:42 +0100, "Thorsten Froehlich"
<fro### [at] charliecnsiitedu> wrote:

>Yes, you are right that the parsing part does not build anything, but please pay
>attention to this little detail:
>Look at the lines you did not mark, what do they do?  They reduce the memory usage!
...
>All these lines of code would have to be written inside the grammar file, wouldn't
>they? And this would make the grammar file a total nightmare!!!  Or how would you
>solve this problem, maybe I am totally wrong!?!

You just call a function that contains the elided bits of code using
the parameters you've parsed out.  It still happens at parse time. 
I'm not too quick on the Bison grammar specification language myself,
but I believe you would just pass the parsed vector into the
"do_scale" function, defined somewhere else, and be on your way.  Yes,
you might have to store the vector somewhere temporarily, but you will
only need to store one such vector for a scale command, and it can
even be in a global variable.

So part of your grammar looks something like this (forgive me if I've
completely botched the Bison syntax; it's been a while since I looked
at it)

OBJECT: OBJECT_TOKEN '{' OBJECT '}' { $$ = $3; }
      | SPHERE                      { $$ = CurObjStackPop(); }
      | CYLINDER                    { $$ = CurObjStackPop(); }
        /* all kinds of other objects */
      ;
SPHERE: SPHERE_TOKEN '{' VECTOR ',' VECTOR {
          CurObjStackPush( CreateSphere( $3, $5 ) );
        }
        OBJECT_MODS '}'
        ;
OBJECT_MODS: SCALE_STMT
           | ROTATE_STMT
             /* whatever else is in object_mods */
           ;
SCALE_STMT: SCALE_TOKEN VECTOR {
              Scale( CurObjStackTop(), $2 );
            }
            ;
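
The helpers that the actions call would live in an ordinary C file next to the grammar. A minimal sketch, assuming a fixed-depth stack and a bare-bones `Object`: only the function names are taken from the grammar above, while the types, the bodies, the `double` radius and the return value of `Scale` are guesses for illustration, not POV-Ray code.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { double x, y, z; } Vector;                       /* hypothetical */
typedef struct { Vector center; double radius; Vector scale; } Object;

#define MAX_DEPTH 64
static Object *obj_stack[MAX_DEPTH];   /* objects currently being parsed */
static int obj_depth = 0;

void CurObjStackPush(Object *o)
{
    assert(obj_depth < MAX_DEPTH);
    obj_stack[obj_depth++] = o;
}

Object *CurObjStackTop(void)
{
    assert(obj_depth > 0);
    return obj_stack[obj_depth - 1];
}

Object *CurObjStackPop(void)
{
    assert(obj_depth > 0);
    return obj_stack[--obj_depth];
}

Object *CreateSphere(Vector center, double radius)
{
    Object *o = malloc(sizeof *o);
    assert(o != NULL);
    o->center = center;
    o->radius = radius;
    o->scale.x = o->scale.y = o->scale.z = 1.0;
    return o;
}

/* Runs at parse time on the object currently being built.
   Returns 0 and leaves the object untouched if any component is zero,
   so the caller can report the error at the current line. */
int Scale(Object *o, Vector v)
{
    if (v.x == 0.0 || v.y == 0.0 || v.z == 0.0)
        return 0;
    o->scale.x *= v.x;
    o->scale.y *= v.y;
    o->scale.z *= v.z;
    return 1;
}
```

Nothing is queued here: the scale is applied while the parser is still inside the sphere's braces, and the only temporary storage is the one vector being passed in.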

>Yes, the POV-Ray scene language contains not only data structures but also commands
>(I (still) can't find a better word, it is what you called "do") how to modify this
>data. This is a major difference to the common programming languages which allow easy
>parsing from the top to the bottom, while POV-Ray is self modifying code of some kind
>(or some kind of hidden preprocessor): If you parse C you can parse tokens like a
>stream, nothing (OK, this is oversimplified) you parse later in the stream of tokens
>will require you to *change* data you created earlier.

Nothing in POV changes data created earlier, either, to my knowledge,
except to make fairly localized modifications to the object currently
being parsed.  Some things change global variables; that's simple
enough.  Some things change the parameters of the current object;
that's simple enough too.  Some things look up things that were
defined before in a symbol table and make copies of them into a local
object for further modification.

>* In Bison parser code (or better, in the grammar file) you intended to keep the data
>modifying code out! Now you have to *store* the commands (or actions) that change the
>data. You will then go through all this data later, lookup each object, check if
>there are scale commands for it and execute them. Then you free the commands data
>structures.

No, you just put the data modifying code in function calls, 
implemented in a separate module from the grammar.  Each rule in the 
grammar calls a function in the 'separate module'.  The only
difference is, the code that tells you what happens when a particular
command is found is not polluted with the elements of the grammar,
and the grammar isn't polluted with the code that tells you what it
does when it gets a particular sequence of symbols.

>>You don't extend the Bison generated code; you extend the original
>>grammar definition file. (I will grant you that one would not want to
>>manually generate a parser from a Bison grammar definition file. But
>>then, that's what Bison is for!)
>
>You did not get my whole point: Even a small change to the grammar and the whole
>parser has to be recompiled, I don't want to build a parser from the grammar file,
>just extend the parser code by hand.

Just a small change to the parser and the whole parser has to be 
recompiled now, too.  The only difference is an extra pass in the 
compilation.  Microsoft Visual C++ gives you the ability to specify
additional processors (e.g. a processor that can turn a .y file into a
.c file); I have to assume that any modern IDE allows that.  If not, a
makefile surely does.  If your platform supports neither, perhaps it's
time to get a real OS (no Mac-bashing intended, as I'm sure some
Macintosh compiler supports at least the IDE method).

>This would require someone in the team to always generate one parser for everybody in
>the team, the only way to make sure the bugs are out everywhere.  If this would not
>be done, each platform's developers would also have to make sure that on their platform
>the *whole* parser works as expected - and at least on the Mac we have other things
>to do... :-)

Or, it requires the team to all use the same version of Bison.  Even 
if there isn't a recent port of Bison for the Macintosh, someone could
probably build one fairly easily, especially since you say you can
work around the command-line limitations.

>(Command line options are another problem: E.g. Macs don't have a command line (but
>of course I can work around this) and getting the same options would still be
>difficult on different platforms.)

Not if someone says "these are the options we will use."  I assume 
everyone on the Team has a way to make a script that enshrines those 
options for all eternity, and maybe even a way to make that script 
part of the official source for their platform.

>>(Incidentally, I believe the C and C++ language standards are just as
>>"open" as the POV-Ray scene language. 
>Although this is off-topic, I have to inform you that you are wrong:
>Is there any reason why a standard document has to be expensive? C++ is a standard; are you
>sure you refer to the ISO 14882 standard document?  I am very convinced it is closed,

You're working from different definitions of open vs. closed.  For the
record, Thorsten meant to say that the C++ standard is fairly fixed,
not that it was proprietary.  This is in fact true, now.  ANSI C has 
been somewhat fixed for far longer.  On the other hand, neither C nor
C++ is completely reducible to a simple Bison grammar, at least in 
part because of typedefs.  The POV language will have the same 
problem, with #declare and #macro.  Parsing #while will probably be 
tricky, as well.
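
The usual workaround for the typedef problem is to let the lexer consult a symbol table before it decides which token to return for an identifier. A minimal sketch in C, assuming hypothetical token codes and a tiny fixed-size table; none of these names come from POV-Ray or from Bison itself.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token codes; a real grammar would get these from its generated header. */
enum Token { TOK_IDENTIFIER = 258, TOK_OBJECT_ID, TOK_FLOAT_ID };

typedef struct { char name[32]; enum Token kind; } Symbol;

#define MAX_SYMBOLS 256
static Symbol symbols[MAX_SYMBOLS];
static size_t symbol_count = 0;

/* Called from the action for a #declare: records what the name now stands for.
   Declaring the same name again simply changes its kind. */
static void declare_symbol(const char *name, enum Token kind)
{
    assert(strlen(name) < sizeof symbols[0].name);
    for (size_t i = 0; i < symbol_count; i++) {
        if (strcmp(symbols[i].name, name) == 0) {
            symbols[i].kind = kind;
            return;
        }
    }
    assert(symbol_count < MAX_SYMBOLS);
    strcpy(symbols[symbol_count].name, name);
    symbols[symbol_count].kind = kind;
    symbol_count++;
}

/* Called by the lexer for every identifier it scans. An undeclared name
   stays a plain identifier; a declared one comes back as its own token,
   so the grammar can tell the two apart. */
static enum Token classify_identifier(const char *name)
{
    for (size_t i = 0; i < symbol_count; i++) {
        if (strcmp(symbols[i].name, name) == 0)
            return symbols[i].kind;
    }
    return TOK_IDENTIFIER;
}
```

The cost is that the token stream now depends on what has been parsed so far, which is exactly why such languages are not reducible to a grammar file alone.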

>Bison does not fit too well into this and I cannot find a switch for the Bison I have
>used which allows C++ code output, but of course this can change in the future! And
>until then the C code is still compatible with C++.

There are other parser generators.  For example, there are BYACC and 
PCCTS, both of which are available in source form.  I'm fairly certain
that if Bison isn't, at least one of them is capable of generating
parsers in C++.  Some versions of BYACC can even do it in Perl.  Not 
that we currently need either C++ or Perl.  

>>I fail to understand how the work involved in modifying a Bison
>>grammar description file is greater than that involved in modifying
>>PARSE.C.
>
>You need to parse the grammar and only then you can parse the C code, and most modern
>IDEs need plug-in Bisons to do this automatically - there is no old fashioned (but
>very flexible) makefile!

Visual C++ 6.0 has a makefile, but even if it didn't, it's fairly 
simple to add command-line utilities like Bison (possibly surrounded 
by a batch file of some kind) to process files.  Again, since MS is 
rarely state-of-the-art, I assume other vendors have had this for 
years.

>Conclusion: I think Bison is a very interesting, well implemented and useful tool,
>but my position is that it does not fit the POV-Ray needs as well as a handwritten
>parser implementation could (not limited to the current one). Your position is
>different and that's OK for me, so if you find the time to write a POV-Ray grammar and
>implement a sample parser (you don't need to support all objects) and can show me
>that I am wrong, then this would be great!

I think that's a worthwhile project, particularly if it's written in
an easily extensible way, can be used as a drop-in replacement for
parse.c/tokenize.c, and is free for use both by the POV-Team and
other, possibly commercial, interests who need a parser.

It's worth noting that there is at least one general-purpose
POV-compatible parser using PCCTS: it's called libparpov, and is part
of the POV-to-RIB converter, which is both quite sophisticated and
quite complete, up to version 3.0.  See
http://www9.informatik.uni-erlangen.de/~cnvogelg/pov2rib/
for more information and for source code.

>PS: Some of my points are Macintosh related simply because I program on the Mac most
>of the time and it is the system I know best.

'sallright.  Some of mine are Windows related for the same reasons.
