POV-Ray : Newsgroups : povray.general : A couple parser performance issues/optimizations.
  A couple parser performance issues/optimizations. (Message 11 to 16 of 16)  
From: Bald Eagle
Subject: Re: A couple parser performance issues/optimizations.
Date: 11 Oct 2017 08:00:00
Message: <web.59de07b831ae2f12c437ac910@news.povray.org>
clipka <ano### [at] anonymousorg> wrote:


> Parsing a loop... well, it /could/ be reasonably easy if the parser
> pre-processed the content somehow (e.g. discarding any comments,
> identifying keywords, and some such), and on each loop iteration only
> traversed the pre-processed stuff. Unfortunately, POV-Ray's organically
> grown syntax is quite detrimental to such an approach.

I wasn't limiting my inquiry to loops - the parsing phase just always seemed to
be an unusually / unexpectedly slow thing.

Now, I understand that there are a lot of things that take place under the hood,
with lots of moving parts, in ways I don't understand.
But it seems to me that the scene text is already "pre-processed" to a certain
degree - since we have real-time context-sensitive color coding.
There's also plenty of time where the bulk of the text just sits there unedited
while the user works on other parts of the scene or scrolls through re-reading
it.   I realize that this text is likely unsaved, prone to editing, etc. and the
editor is not the parser....
And that the parser necessarily (?) is limited to a single core, since the scene
needs to be parsed in a linear manner.
Perhaps something needs to be added to the language that makes multi-core
parsing possible, if the SDL is written properly.
#subsection 1
 ....
#end

#subsection 2
 ....
#end
Just food for thought for future developments.

Then there is the loop - which is how we got here in the first place.
The loop gets parsed the first time - so perhaps there is a way for the [future]
parser to process the loop content on that first pass and reuse it on later
iterations.

> > <Gollum> Clipka hates the parser, yes he does, my Precioussssssss  <Gollum>
>
> Oh yes, we does, doesn't we, my precioussss? Oh yes, we loathes it!
>
> It's not tasty... it's not scrumptious... no, preciousssss! Not at all
> scrumptious! Nassssty parsers!

It burns us!!!



From: clipka
Subject: Re: A couple parser performance issues/optimizations.
Date: 11 Oct 2017 09:58:29
Message: <59de2385$1@news.povray.org>
Am 11.10.2017 um 13:59 schrieb Bald Eagle:

> I wasn't limiting my inquiry to loops - the parsing phase just always seemed to
> be an unusually / unexpectedly slow thing.

In some cases the slow thing isn't even the parsing, but the loading of
texture images.

> Now, I understand that there are a lot of things that take place under the hood,
> with lots of moving parts, and take place in ways I don't understand.
> But it seems to me that the scene text is already "pre-processed" to a certain
> degree - since we have real-time context-sensitive color coding.

That's just the editor, which is completely independent of the actual
parser/renderer pipeline.

> There's also plenty of time where the bulk of the text just sits there unedited
> while the user works on other parts of the scene or scrolls through re-reading
> it.   I realize that this text is likely unsaved, prone to editing, etc. and the
> editor is not the parser....

Running the parser whenever the user has made a change and is now idle
is problematic for multiple reasons:

- The user could resume editing at any moment; in that case, parsing
would have to be aborted. (Likewise, the user may be editing other files
the scene needs, and the parser won't be able to detect that until it
runs into the corresponding `#include` statement.)

- The current architecture doesn't allow parsing of unsaved files; so
triggering the parser would only be possible once the user saves their
file. More often than not, I guess, this will be when the user pushes
the render button anyway.

- As you're surely well aware, running the parser constitutes quite an
effort (otherwise it wouldn't take that long). Triggering it whenever
the user has made a change and is now idle would be quite a waste of
computing power (and hence energy). Also, parsing involves a lot of I/O,
which degrades overall performance and responsiveness of the system, and
can't be offset by multicore architectures.


> And that the parser necessarily (?) is limited to a single core, since the scene
> needs to be parsed in a linear manner.
> Perhaps something needs to be added to the language that makes multi-core
> parsing possible, if the SDL is written properly.

I wouldn't want to attempt that. The parser is fragile as it is, and I
wouldn't want to try making it multithreaded.

I'd rather "re-invent" the entire SDL and write a new parser based on
LLVM, using its intermediate format to allow for pre-parsing of include
files and utilizing its just-in-time capabilities for fast loop
execution and stuff (also, and maybe even more importantly, for
user-defined functions).

> Then there is the loop - which is how we got here in the first place.
> The loop gets parsed the first time - so perhaps there is a way for the [future]
> parser to process the loop content on that first pass

That approach is bound to fail at least in certain cases, as POV-Ray's
SDL allows writing self-modifying code (using `#write` and `#include`),
so each loop iteration may need re-parsing.
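As a hypothetical fragment illustrating the point (the identifier `Out` and the
file name `generated.inc` are invented for illustration): the loop rewrites an
include file on every pass and then re-includes it, so the token stream can
differ on each iteration and cannot simply be cached:

```povray
#declare I = 0;
#while (I < 3)
  // rewrite the include file on every pass...
  #fopen Out "generated.inc" write
  #write (Out, "#declare Radius = ", I + 1, ";\n")
  #fclose Out
  // ...then re-include it; its contents differ each time,
  // so the parser must re-read and re-parse it here
  #include "generated.inc"
  sphere { <I, 0, 0>, Radius }
  #declare I = I + 1;
#end
```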

And again I wouldn't want to go through all the hassle of implementing
such a pre-parsing feature -- because it would essentially constitute a
partial re-write, and that energy would certainly be better invested
into a brand-new parser (with brand-new syntax).



From: jr
Subject: Re: A couple parser performance issues/optimizations.
Date: 11 Oct 2017 11:41:19
Message: <59de3b9f$1@news.povray.org>
hi,

this is an interesting discussion/insight into the inner workings.

but I'm troubled by

On 11/10/2017 10:44, Kenneth wrote:
> I need to look through my scenes to try and eliminate as many unnecessary
> comments as possible in any #while/#for loops, and in any #macros that are
> called multiple times.

IMO no programmer should have to degrade the documentation to gain
performance.  to spare Kenneth (+others) the editing, I've written a
small utility which will strip comments from a given scene/include file
and output a new file named exactly as its input but prefixed 'nc_', i.e.

  $ nocomment myscene.pov

outputs 'nc_myscene.pov'.

I've run it against a bunch of distribution scene files and it had no
problems.  the code (with Makefile) is posted in p.binaries.misc under
"nocomment".

regards, jr.



From: Alain
Subject: Re: A couple parser performance issues/optimizations.
Date: 11 Oct 2017 18:54:01
Message: <59dea109$1@news.povray.org>
Le 17-10-11 à 05:44, Kenneth a écrit :
> Alain <kua### [at] videotronca> wrote:
>> Le 17-10-07 à 19:00, Bald Eagle a écrit :
>>> William F Pokorny <ano### [at] anonymousorg> wrote:
>>>>
>>>> 1) Comments in inner loops are costly.
>>>
>>> I would not have guessed that.  I had always imagined that it bailed out
>>> as soon as it saw  //  but I guess not.
>>
>> It does ignore whatever is in the comment, but it has to pace through
>> the comment character by character to find exactly where it ends. In a
>> loop, that adds up quickly.
>>
> 
> Thanks, Alain. That explains my own longer parse times when using lots of
> comments in a scene, the reason for which has been a mystery to me for years--
> because the documentation makes a point of saying "comments are ignored by the
> raytracer" and "Use comments liberally and generously." ;-)
> 
> My own (false) understanding was that a  // or  /* comment indicator somehow
> caused the parser to actually *skip* all of the following text, or until it hit
> another 'closing' */ indicator-- kind of like a  'goto'  statement in other
> languages, or like POV's macro-invocation behavior, where the parser immediately
> jumps to the macro itself. But it makes sense that the parser has to 'read' the
> intervening text-- at least when using  /*  and  */ --in order to know WHERE the
> closing comment is! For single-line  // comments, though, I would have imagined
> that all the text on that particular line would BE ignored, as soon as the
> parser came across a double slash //  Hmm, apparently not.
> 
> I need to look through my scenes to try and eliminate as many unnecessary
> comments as possible in any #while/#for loops, and in any #macros that are
> called multiple times.
> 
> 

With some interpreted languages that do a kind of byte-compiling (I 
remember Applesoft BASIC), the programs are stored in a linked-list 
format, where each line has a header containing a pointer to the next 
line. In that case, when you hit a comment, it's easy to jump to the 
next line, as you already know where it starts.

When the code is in plain text form, without any pointer to the next 
part of the code, you can't jump forward and are forced to actually read 
the comments to find where they end.



From: dick balaska
Subject: Re: A couple parser performance issues/optimizations.
Date: 13 Oct 2017 03:17:06
Message: <59e06872$1@news.povray.org>
On 10/11/2017 09:58 AM, clipka wrote:
> 
> Running the parser whenever the user has made a change and is now idle,
> is problematic for multiple reasons:
> 
> - The user could resume editing at any moment; in that case, parsing
> would have to be aborted. (Likewise, the user may be editing other files
> the scene needs, and the parser won't be able to detect that until it
> runs into the corresponding `#include` statement.)
> 
> - The current architecture doesn't allow parsing of unsaved files; so
> triggering the parser would only be possible once the user saves their
> file. More often than not, I guess, this will be when the user pushes
> the render button anyway.
> 

The povclipse2 parser re-parses in-memory/mid-edit, starting at the 
point of editing. (A gift from eclipse that all its language parsers 
share.) It's not perfect; for example, I start with the whole symbol 
table, when, correctly, I should start with the symbol table as it 
exists at the point of editing.
This gives me quick feedback on typos, undefined identifiers, and bad 
bracing. Then, when the user saves the file, I flush the symbol table 
and start over at the beginning.
If the user continues editing during parsing, it just aborts parsing and 
restarts at the "farthest north" unparsed point.

Elsewhere clipka mentioned the organic evolution of SDL as problematic.
It is indeed a unique language: a 'C'-style object definition language 
with a 'BASIC/bash'-style control language bolted onto it. And then the 
next generation bolted the pseudo-preprocessor/function #macro hybrid on 
top of that.
(I understand how it came to be, and I think all the Right Choices were 
made, but man, those semicolons kill me. ;) )

> Also, parsing involves a lot of I/O,
> which degrades overall performance and responsiveness of the system, and
> can't be offset by multicore architectures.

I haven't been in the povray parser much, but my understanding was that 
there were copious amounts of fget/ftell/fseek calls, which slow things 
down.


-- 
dik



From: clipka
Subject: Re: A couple parser performance issues/optimizations.
Date: 13 Oct 2017 05:38:22
Message: <59e0898e@news.povray.org>
Am 13.10.2017 um 09:17 schrieb dick balaska:
> On 10/11/2017 09:58 AM, clipka wrote:
>>
>> Running the parser whenever the user has made a change and is now idle,
>> is problematic for multiple reasons:
>>
>> - The user could resume editing at any moment; in that case, parsing
>> would have to be aborted. (Likewise, the user may be editing other files
>> the scene needs, and the parser won't be able to detect that until it
>> runs into the corresponding `#include` statement.)
>>
>> - The current architecture doesn't allow parsing of unsaved files; so
>> triggering the parser would only be possible once the user saves their
>> file. More often than not, I guess, this will be when the user pushes
>> the render button anyway.
>>
> 
> The povclipse2 parser re-parses in-memory/mid-edit, starting at the
> point of editing. (A gift from eclipse that all its language parsers
> share). It's not perfect, like I start with the whole symbol table,
> where, correctly, I should start with the symbol table at the point of
> editing.
> This gives me quick feedback of typos for undefined foo and bad bracing.
> Then when user saves the files, I flush the symbol table and start over
> at the beginning.
> If the user continues editing during parsing, it just aborts parsing and
> starts again at the "farthest north" unparsed point.

The difference there is that those languages are compiled, i.e. the
parser scans each portion of the file just once (or maybe twice or
thrice at worst), whether it is a straight piece of code or a loop.

It is only later that the code is actually executed.

That doesn't work with POV-Ray's SDL. For any piece of software to
"understand" the code, it needs to /interpret/ it, i.e. actually run the
code.


>> Also, parsing involves a lot of I/O,
>> which degrades overall performance and responsiveness of the system, and
>> can't be offset by multicore architectures.
> 
> I haven't been in the povray parser much, but my understanding was there
> were copious amounts of fget/ftell/fseek, which slows things down.

For small pieces of code, a way around this in the newest versions of
POV-Ray is to move that code into a macro of at most 64 kiB in size.
Such macros will now be cached in memory.
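As a sketch of that workaround (a hypothetical scene fragment; the macro name
is invented): hoisting a hot loop body into a small macro lets the parser
replay the cached in-memory token text instead of re-reading the file on each
invocation:

```povray
// The macro body (well under 64 kiB) is cached in memory after the
// first parse; each later call replays the cached tokens.
#macro Ring(N, R)
  #local I = 0;
  #while (I < N)
    sphere { <R*cos(2*pi*I/N), 0, R*sin(2*pi*I/N)>, 0.05 }
    #local I = I + 1;
  #end
#end

#declare J = 0;
#while (J < 100)
  Ring(36, 1 + J/10)
  #declare J = J + 1;
#end
```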




Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.