  Re: Why not generate parser with Bison & Flex?  
From: Thorsten Froehlich
Date: 30 Dec 1998 15:56:53
Message: <368a9395.0@news.povray.org>
In article <36899335.0@news.povray.org> , "Evan Powers" <ept### [at] aolcom> wrote:

>As an in-my-free-time project I'm writing a structured database system that
>lets remote users log into a server then read and edit the database. I've
>created my own mini language for storing the database, and have discovered
>that I really don't want to write a parser for it. It would probably be kind
>of fun, but hey, I don't have the time. Anyway, I discovered the UNIX
>utilities Bison and Flex, and the man pages tell me these utilities will
>generate a lexer and parser for me from a "source" file containing the
>language grammar. I've read most of Bison's info file, and it looks like
>these utilities would make writing even a parser for the C & C++ programming
>languages easy.

Note that these tools are designed for fixed, fully defined languages, not for
constantly extended languages like the POV-Ray scene description language. Before I
continue, please note that I have never used Bison and Flex myself, though I have
seen code generated by them. Also, the parser is only a (minor) part: you still need
the "compiler" part for the language, and this goes for POV-Ray as well (see below).

>Soon after, what pops into my head? Something a friend working on his own
>custom version of POV told me... something about weird macros in the
>parser.... Anyway, I downloaded the source and took a look, and discovered
>that, indeed, the parser consisted of cryptic code and lots of funny macros.

Why are these macros funny or cryptic? Once you get used to them, they really ease
reading the parser code (I think):

#define EXPECT { int Exit_Flag; Exit_Flag = false; \
 while (!Exit_Flag) {Get_Token();  switch (Token.Token_Id) {
#define CASE(x) case x:
#define CASE2(x, y) case x: case y:
#define CASE3(x, y, z) case x: case y: case z:
#define CASE4(w, x, y, z) case w: case x: case y: case z:
#define CASE5(v, w, x, y, z) case v: case w: case x: case y: case z:
#define CASE6(u, v, w, x, y, z) case u: case v: case w: case x: case y: case z:
#define END_CASE break;
#define EXIT Exit_Flag = true;
#define OTHERWISE default:
#define END_EXPECT } } }
#define GET(x) Get_Token(); if (Token.Token_Id != x) Parse_Error (x);
#define ALLOW(x) Get_Token(); if (Token.Token_Id != x) Unget_Token();
#define UNGET Unget_Token();
#define CASE_FLOAT CASE2 (LEFT_PAREN_TOKEN, FLOAT_FUNCT_TOKEN)\
 CASE2 (PLUS_TOKEN, DASH_TOKEN) UNGET
#define CASE_VECTOR CASE2 (VECTOR_FUNCT_TOKEN,LEFT_ANGLE_TOKEN) \
 CASE2 (U_TOKEN,V_TOKEN) CASE_FLOAT
#define CASE_EXPRESS CASE_VECTOR
#define CASE_COLOUR CASE3 (COLOUR_TOKEN,COLOUR_KEY_TOKEN,COLOUR_ID_TOKEN) UNGET
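
For illustration, here is a minimal sketch of how a parse routine built on these
macros reads. Only the macro definitions are copied from the list above. The token
ids, the canned token stream, the Sphere struct, Parse_Sphere_Body and Parse_Tokens
are hypothetical stand-ins made up for this example, not POV-Ray's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token ids and token record (assumptions, not POV-Ray's tables). */
enum { RADIUS_TOKEN, COLOUR_TOKEN, FLOAT_TOKEN, RIGHT_CURLY_TOKEN };
struct Token_Struct { int Token_Id; double Token_Float; };

static struct Token_Struct Token;          /* current token, as the macros expect */
static const struct Token_Struct *Stream;  /* canned input standing in for a lexer */
static size_t Pos;
static int Errors;

static void Get_Token(void)   { Token = Stream[Pos++]; }
static void Unget_Token(void) { assert(Pos > 0); Pos--; }
static void Parse_Error(int expected) { (void)expected; Errors++; }

/* Macros copied from the post. */
#define EXPECT { int Exit_Flag; Exit_Flag = false; \
 while (!Exit_Flag) {Get_Token();  switch (Token.Token_Id) {
#define CASE(x) case x:
#define END_CASE break;
#define EXIT Exit_Flag = true;
#define OTHERWISE default:
#define END_EXPECT } } }
#define GET(x) Get_Token(); if (Token.Token_Id != x) Parse_Error (x);
#define UNGET Unget_Token();

struct Sphere { double radius; int has_colour; };

/* Reads items in any order until a token it does not know shows up. */
static struct Sphere Parse_Sphere_Body(void)
{
    struct Sphere s = { 1.0, 0 };

    EXPECT
        CASE (RADIUS_TOKEN)
            GET (FLOAT_TOKEN)              /* the value must follow the keyword */
            s.radius = Token.Token_Float;
        END_CASE

        CASE (COLOUR_TOKEN)
            s.has_colour = 1;
        END_CASE

        OTHERWISE
            UNGET                          /* not ours: leave it for the caller */
            EXIT
        END_CASE
    END_EXPECT

    return s;
}

/* Hypothetical driver: point the toy lexer at a token array and parse it. */
static struct Sphere Parse_Tokens(const struct Token_Struct *tokens)
{
    Stream = tokens;
    Pos = 0;
    Errors = 0;
    return Parse_Sphere_Body();
}
```

Note that the body of the routine both recognizes the syntax and fills in the data
structure in the same place, which is the work a generated parser would still leave
to hand-written action code.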


>The point:
>
>Has anyone considered redoing POV's parser entirely using Bison and Flex?
>Making a language grammar file, from what I know, would be child's play
>compared to understanding and changing the existing code. Plus it would make
>maintaining the language grammar (and thus the parser) ridiculously easy.

Well, making a grammar file does not write the code that converts the parser output
into POV-Ray's internal data structures, and this is exactly the part that the parser
code you find difficult to understand does very well, and even transparently to some
extent. It is true that the POV-Ray parser is not the absolute optimal design, but
it is more flexible and significantly easier to extend than Bison-generated code (if
you want to do so by hand...).

>Both programs are part of GNU (they're free with source like POV), and I
>know that ports exist for Win32 and MS-DOS as well as UNIX, so lack of tools
>isn't an excuse. 

Yes, Bison and Flex are (always) available on Unix, but there are a lot of different
implementations (and versions) on different platforms. If someone on one platform
wants to extend the language and uses a different version of Bison and Flex, there
will be different code each time someone else generates it! This would be very
confusing and would not work (easily) for a project like POV-Ray, where both the
application *and* its development are cross-platform.

>It might even speed parse times, since parsers and lexers
>made by these tools supposedly run at several hundred thousand lines per
>minute for even complex languages.

Yes, the code they output is fast, but (very well) handwritten, specialized parsers
will most likely be much faster, even for languages like C++. Handwritten parsers for
such languages are just not as common because these languages are "closed" (now),
while the POV-Ray scene language is "open" (and always will be). A parser for C++,
for example, only has to be generated once, and the compiler code itself does the
rest. For POV-Ray this would mean: run Flex and Bison, integrate the result with the
POV-Ray code, check that everything still works, add the code supporting the language
extension in the data setup code, recompile, and hope there will be no need for a
last-minute extension like the "material" syntax in 3.1... With the current,
handwritten parser, you learn to use the few(!!!) macros (or print them out) and make
an extension like the "material" statement by changing just a few lines of code,
with *no* further work.
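
To make the "few lines of code" point concrete, here is a rough sketch of what such
an extension can look like with the macros. It is not the actual 3.1 change: the
token ids, the Object struct, the toy token stream and all Parse_* routines are
hypothetical, and only the macros are taken from the parser. The new keyword is one
extra CASE that reuses the handlers already in place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token ids; MATERIAL_TOKEN plays the late extension. */
enum { TEXTURE_TOKEN, INTERIOR_TOKEN, MATERIAL_TOKEN,
       LEFT_CURLY_TOKEN, RIGHT_CURLY_TOKEN, END_OF_FILE_TOKEN };

struct Token_Struct { int Token_Id; };
static struct Token_Struct Token;
static const int *Stream;                  /* canned token ids, no real lexer */
static size_t Pos;
static int Errors;

static void Get_Token(void)   { Token.Token_Id = Stream[Pos++]; }
static void Unget_Token(void) { assert(Pos > 0); Pos--; }
static void Parse_Error(int expected) { (void)expected; Errors++; }

/* Macros as used by the handwritten parser. */
#define EXPECT { int Exit_Flag; Exit_Flag = false; \
 while (!Exit_Flag) {Get_Token();  switch (Token.Token_Id) {
#define CASE(x) case x:
#define END_CASE break;
#define EXIT Exit_Flag = true;
#define OTHERWISE default:
#define END_EXPECT } } }
#define GET(x) Get_Token(); if (Token.Token_Id != x) Parse_Error (x);
#define UNGET Unget_Token();

struct Object { int textures; int interiors; };

/* Toy handlers that already exist: each accepts an empty "{ }" block. */
static void Parse_Texture(struct Object *o)
{
    GET (LEFT_CURLY_TOKEN) GET (RIGHT_CURLY_TOKEN)
    o->textures++;
}

static void Parse_Interior(struct Object *o)
{
    GET (LEFT_CURLY_TOKEN) GET (RIGHT_CURLY_TOKEN)
    o->interiors++;
}

static void Parse_Object_Mods(struct Object *o)
{
    EXPECT
        CASE (TEXTURE_TOKEN)
            Parse_Texture(o);
        END_CASE

        CASE (INTERIOR_TOKEN)
            Parse_Interior(o);
        END_CASE

        /* The whole extension: a wrapper keyword reusing the cases above. */
        CASE (MATERIAL_TOKEN)
            GET (LEFT_CURLY_TOKEN)
            Parse_Object_Mods(o);
            GET (RIGHT_CURLY_TOKEN)
        END_CASE

        OTHERWISE
            UNGET
            EXIT
        END_CASE
    END_EXPECT
}

/* Hypothetical driver for the toy token stream. */
static struct Object Parse_Stream(const int *tokens)
{
    struct Object o = { 0, 0 };
    Stream = tokens;
    Pos = 0;
    Errors = 0;
    Parse_Object_Mods(&o);
    return o;
}
```

In this sketch the syntax check and the data setup for the new keyword sit in the
same five lines, so there is no separate grammar file to regenerate and re-integrate.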

>I'd almost be willing to undertake the task myself, and might eventually if
>no one else does, except I have lots of other, less recreational, things to
>do. It seems like such an elegant solution in terms of ease of maintenance
>and comprehension; I hope the POV development team looks into it.

Well, I think Thomas Baier (in the team now) has done some of this work some time
ago.


     Thorsten



Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.