POV-Ray: Newsgroups: povray.off-topic: Very long post: Re: Very long post

POV-Ray : Newsgroups : povray.off-topic : Very long post : Re: Very long post		Server Time 30 Jun 2025 20:37:51 EDT (-0400)
From: Warp
Date: 1 Oct 2008 12:35:00
Message: <48e3a6b4@news.povray.org>
Invisible <voi### [at] devnull> wrote:
> OK, well since you seem to know something about this... using the 
> formalisms developed here, how would you go about parsing a grammar that 
> can consist of integers, variable names, and + - / * ( ), such that the 
> operators have the correct precidence?

  I don't know if this has anything to do with the formalism in question,
but I used the following classical parsing algorithm in my function parser
library, which converts a function string into a series of bytecode commands
which use stack arithmetic. (The parser is able to parse the input with one
single pass, by only seeing one token at a time).

  Basically you create one parsing function for each level of precedence.
You start from the lowest precedence and make each such function call the
function at the next precedence level. The highest precedence level
function is the function that parses one element in the expression.

  Assuming that + and - have the same precedence and are parsed
left-to-right, you write the lowest precedence function like this
(C++'ish pseudocode):

parsePlusMinus(functionString)
{
    parseMulDiv(functionString); // The next higher precedence function.

    token = functionString.nextToken(); // Could be end-of-string
    if(token == '+' or token == '-')
    {
        parseMulDiv(functionString);
        if(token == '+') generateCommand(add);
        else generateCommand(sub);
    }
}

  Basically this divides the function string into parts which are separated
with '+' or '-'. It parses these parts by calling the next higher precedence
function, and then generates the operator command which calculates the sum
or subtraction of those parts. (Since my parser uses stack arithmetic, it
means that these "parts" are values on top of the stack when the
parseMulDiv() function has done its job and returns.)

  The next function is very similar:

parseMulDiv(functionString)
{
    parseElement(functionString); // The next higher precedence function.

    token = functionString.nextToken();
    if(token == '*' or token == '/')
    {
        parseElement(functionString);
        if(token == '*') generateCommand(mul);
        else generateCommand(div);
    }
}

  Now in your syntax the next higher precedence function is the one
which parses a single element. But what about parentheses? Well, here's
the trick: An expression in parentheses *is* a single element: After its
contents have been parsed it generates one single value onto the stack,
exactly as a single element does. The trick is to parse its contents
recursively, as its own expression. That is, something like:

parseElement(functionString)
{
    token = functionString.nextToken();

    if(token == '(')
        parsePlusMinus(functionString); // parse the inner expression
        // When it returns, the ')' will have been read out already

    else if(token == NUMBER)
        generateCommand(push, NUMBER); // push the number onto the stack

    else SyntaxError;
}

  Of course this is highly simplified. For example, it doesn't perform any
syntax correctness checking (eg. that parentheses match). Those require a
bit more logic inside those functions. But you get the basic idea.

  The beauty of this is that with these recursive functions you are able
to parse the function string by reading one token at a time, traversing
the string only once from beginning to end, without ever needing to go
back or perform any look-ahead, yet it still correctly handles operator
precedence and nested expressions in parentheses.

-- 
                                                          - Warp
Post a reply to this message