On 10.06.2021 at 13:18, Mr wrote:
> clipka <ano### [at] anonymousorg> wrote:
>
>> Explicit ASCII alternatives, hard-baked into the language, would be a
>> must, IMO. As I mentioned, Unicode symbols would be syntactic sugar. The
>> ASCII constructs would be the real deal, while the Unicode symbols would
>> be considered shortcuts.
>
> Okay, and do you confirm that this kind of thing would have a significant
> impact on parse time, e.g. linearly: if you divide the character count by
> two, do you get half the parsing time?
No, parser performance is not that simple.
A good parser (which POV-Ray's old one is not by any stretch, and even
the overhauled one is only a step on the way there) will _scan_ the
whole file just once (i.e. identify the start and end of each character
sequence that looks like a token at first glance - e.g. sequences that
look like numbers, like keywords or identifiers, like operators, etc.),
_tokenize_ it once (i.e. translate those character sequences into
internal numeric IDs, aka tokens), and from there on just juggle those
IDs.
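To make that concrete, here is a minimal toy sketch of such a
single-pass scanner/tokenizer (in C++, since that is POV-Ray's
implementation language; the token set and all names are made up for
illustration and are not POV-Ray's actual code):

// Toy single-pass scanner/tokenizer sketch - purely illustrative,
// NOT POV-Ray's parser. It classifies each character sequence once
// and emits small numeric token IDs; everything downstream only ever
// sees those IDs.

#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

enum TokenId { TOK_NUMBER = 1, TOK_SPHERE, TOK_WHILE, TOK_IDENT, TOK_OP };

std::vector<int> tokenize(const std::string& src)
{
    // Keyword table: an ASCII keyword and a hypothetical Unicode
    // "syntactic sugar" spelling could map to the very same ID here.
    static const std::unordered_map<std::string, int> keywords = {
        { "sphere", TOK_SPHERE }, { "while", TOK_WHILE },
    };

    std::vector<int> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = src[i];
        if (std::isspace(c)) { ++i; continue; }
        if (std::isdigit(c)) {                      // number literal
            while (i < src.size()
                   && (std::isdigit((unsigned char)src[i]) || src[i] == '.'))
                ++i;
            tokens.push_back(TOK_NUMBER);
        } else if (std::isalpha(c) || c == '_') {   // keyword or identifier
            std::size_t start = i;
            while (i < src.size()
                   && (std::isalnum((unsigned char)src[i]) || src[i] == '_'))
                ++i;
            auto it = keywords.find(src.substr(start, i - start));
            tokens.push_back(it != keywords.end() ? it->second : TOK_IDENT);
        } else {                                    // operator / punctuation
            tokens.push_back(TOK_OP);
            ++i;
        }
    }
    return tokens;
}

Note that whether a keyword was spelled out in ASCII or as a
one-character Unicode shortcut, what comes out of this step is the same
small numeric ID; nothing downstream ever sees the original spelling
again.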
The next step would then be to either...
- walk through those tokens and "execute" them, implementing loops by
processing the corresponding tokens over and over again; in this case
re-processing the loop bodies again and again would be the bottleneck
(see the sketch after this list).
- digest that token sequence even further, "compiling" it into something
that can be executed so efficiently that it might have a chance to
become negligible compared to the time spent scanning and tokenizing;
but the effort to bring it into that efficient form will itself outweigh
the effort of scanning and tokenizing.
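Purely as an illustration of the first alternative (again a made-up
sketch, not POV-Ray's actual implementation), such a token-walking
interpreter might re-walk the loop body tokens on every iteration:

// Toy token-walking "execution" of a loop - purely illustrative.
// Assumes a well-formed token stream in which every TOK_LOOP is
// eventually followed by a matching TOK_END.

#include <cstddef>
#include <vector>

enum { TOK_LOOP = 1, TOK_WORK, TOK_END };

void run(const std::vector<int>& tokens, int iterations)
{
    for (std::size_t pc = 0; pc < tokens.size(); ++pc) {
        if (tokens[pc] == TOK_LOOP) {
            // The body tokens are walked again on every single
            // iteration; this re-walking is what costs the time.
            for (int n = 0; n < iterations; ++n)
                for (std::size_t i = pc + 1; tokens[i] != TOK_END; ++i)
                    { /* "execute" tokens[i] here */ }
            while (tokens[pc] != TOK_END) ++pc;   // jump past the body
        }
        // other token kinds would be handled here
    }
}

Calling e.g. run({TOK_LOOP, TOK_WORK, TOK_WORK, TOK_END}, 100000) walks
the two body tokens a hundred thousand times, and that cost is exactly
the same whether the loop keyword was originally spelled as an ASCII
word or as a single Unicode symbol.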
In either case, the genuinely time-consuming portions of parsing will
work on a representation in which the number of characters comprising
the keywords or operands will have become entirely irrelevant.