From: William F Pokorny
Subject: jr's large csv and dictionary segfault in the povr fork.
Date: 26 Mar 2023 02:46:56
Message: <641fea60$1@news.povray.org>
A status update.
---
First, let me put on the table that the segfault which happens here should
not be a segfault during a non-debug run! Someone had already added code
comments to this effect in the cylinder code.
It's good fortune - of sorts - that we get the segfaults in povr, but it's
only happening because no POV-Ray source coder got around to changing
that throw into a warning or a possible error message. ;-)
---
I've had no luck finding the core problem.
The segfault comes and goes, seemingly depending on a lot of things -
which makes me wonder if it isn't sitting in official POV-Ray
releases too and we've just not tripped it.
The really odd fails I saw, I've been unable to reproduce even once. At
the moment I'm chalking those up to my fatigue, and to potential
configuration and setup issues as I tried to run through just parts of
the animation.
---
I now have a changed foreach.inc file where I've added checks for
duplicate macro-call strings and duplicate i_ indexes, and it has
been working for me without fail for a while!
I suspect that's because I've slowed the whole animation down to almost
exactly the time it takes p380 beta 2 to do it - and I've never seen a
fail there either.
The code as changed is now:

    #macro fore_voidRun(a_)
        #for (i_, 0, dict_.ttl_ - 1)
            // Duplicate-index check. After adding this, I've not seen a fail myself.
            #ifndef (Last_i_)
                #declare Last_i_ = i_;
            #else
                #if (Last_i_ = i_)
                    #local DummyID = f_boom(9,8,7,6,5,4); // force a hard stop
                #end
                #declare Last_i_ = i_;
            #end
            #if (i_)
                fore_cmdNext(dict_)
            #end
            #local cmd_ = fore_cmdStr();
            // Duplicate-command check. This found a duplicate once - and only once.
            #ifndef (LastCmdStr)
                #declare LastCmdStr = cmd_;
            #else
                #if (strcmp(LastCmdStr, cmd_) = 0)
                    #local DummyID = f_boom(1,2,3,4,5,6); // force a hard stop
                #end
                #declare LastCmdStr = cmd_;
            #end
            fore_exec(cmd_,"parse_fore_void.tmp")
            #if (fore_debug)
                #debug concat("called '",cmd_,"'.\n")
            #end
        #end
    #end
So it feels like a race issue of some kind, but I cannot see how it can
even come about... Each frame is indeed a new parser thread, but nothing
from the prior frame's thread should carry forward. I thought for a
while it was a file I/O issue, but the one fail on the macro string
checking suggests it does have something to do with the dictionaries or
the dictionary set up as you, jr, suspected.
---
On the animation itself. We are basically re-building the scene for each
frame of the animation, with early frames having very few objects and the
last ones almost 9000. A curiosity I don't understand is that the
usaf__mkItems(4130,a_[4130],tmp_) package indexing sometimes goes well
over 10000. These often peak at those levels and then are not seen
again - maybe these all come from the partial runs where the animation
re-starts? I've done so many of these things my head is mush on
what's what.
Anyway, the whole code setup gets slower and slower as we get into the
later frames. Some of that is because the parsing just gets big/long, but
it's also because the dictionary gets slower. The dictionary is built on
top of the parser symbol table mechanism and, while I've not dug in to be
certain, my bet is some of the chains hanging off the hash table entries
are getting long and slow.
I don't know enough about what is changing in the animation to suggest
it, but, if there are larger fixed portions on which we are building and
building, it would almost certainly be faster to keep those fixed parts
as larger include files we bring in all at once - as raw SDL.
I suppose the suggestion is neither here nor there with respect to the
problem - which is certainly real.
I'll keep playing with it as I have time - and as I feel like it - but
given that even starting at, say, frame 3000 isn't saving me all that much
time for each animation turn, this is a very painful bug to chase.
My plan at the moment is to kick off the full hour-plus-long animation
(tiny image sizes) with the added delay in place - as I think to do it.
At day's end perhaps.
It's not failed in a long while for me, but it's slow even at tiny image
sizes, so I haven't really run that many full passes.
If it doesn't fail again for me over time, the delay / race condition
idea perhaps holds enough water for me to start putting in random locks
where I can see places to insert them. The trouble has been that I
perturb things and the problem blinks in and out for I do not know
what cause - if any is really my doing. I cannot come up with a solid way
to come at this problem! It's too flaky - and it takes a painful amount
of time to try anything.
Yes, I'm whining now and should stop. :-)
Bill P.
hi,
will (have to) re-read the whole post later.
William F Pokorny <ano### [at] anonymousorg> wrote:
> ...
> The segfault comes and goes ...
> The really odd fails seen, ...
> thought for a while it was file I/O issue, but one fail on the macro
> string checking suggests it does have something to do with the
> dictionaries or the dictionary set up as you, jr, suspected.
wondering whether it's possible we're "out-running" the (C++) garbage collector,
and when the macro is called, it then sees "inappropriate" memory on occasion?
regards, jr.
From: William F Pokorny
Subject: Re: jr's large csv and dictionary segfault in the povr fork.
Date: 26 Mar 2023 09:47:08
Message: <64204cdc$1@news.povray.org>
On 3/26/23 05:55, jr wrote:
> wondering whether it's possible we're "out-running" the (C++) garbage collector,
> and when the macro is called, it then sees "inappropriate" memory on occasion?
There isn't a garbage collector in C++/C. Memory is allocated when
needed and released when no longer needed.
Your question, though, is still a good one. It takes considerable time to
walk through all the allocations and free them, more or less unwinding
all the allocations(a). Nothing new parser-related should proceed
until the memory is freed, but...
Today the memory free-up and re-allocation happens in a big way when we
move frame to frame in an animation, because one parsing thread goes
away(a) and another gets created for the next frame. Parsing itself is
always single threaded, unlike most other parts of POV-Ray, so we should
not see multi-threading issues per se.
What I too suspect is that we are perhaps sometimes seeing not-quite (or
perhaps incorrectly) initialized new parser memory that still contains
data from the previous parser thread. This could explain why, once we see
fail points, they sometimes repeat that fail signature for a while.
Aside: I've gotten another two complete povr animation passes through
with those changes to foreach.inc. Magic, but still real magic! FWIW. :-)
Bill P.
(a) - Back in my working years we were using a large, internally
developed, interactive tool. On its conversion to C++ we got frustrated
because it took forever to exit the application, as the memory was
painstakingly released bit by bit. The developers solved the problem by
intentionally crashing out of the application and letting the OS clean
up the process-related memory! ;-)
Anyhow. There is a performance cost to maintaining a, sort of, minimum
memory footprint over time (as there is too for garbage-collected
memory management when it kicks in). I've wondered how much time we are
burning doing memory management alone. Plus C++, because it tends to
allocate as needed, ends up with bits and pieces of things all over the
place in physical memory, where it would be much better for performance
if related memory were allocated (or re-allocated) in big contiguous
blocks. Newer C++ versions have features intended to help with this
memory fragmentation issue. Ah, whatever I guess. All still well down on
my todo / to-play-with list.
hi,
William F Pokorny <ano### [at] anonymousorg> wrote:
> ...
> Aside: I've gotten another two complete povr animation passes through
> with those changes to foreach.inc. Magic, but still real magic! FWIW. :-)
quick <blush> in public.. :-)
regards, jr.
From: William F Pokorny
Subject: Re: jr's large csv and dictionary segfault in the povr fork.
Date: 4 Oct 2023 06:50:02
Message: <651d435a$1@news.povray.org>
On 3/26/23 02:46, William F Pokorny wrote:
> A status update.
FWIW. Another status update ahead of some time away.
---
As I was running test cases ahead of another povr tarball release, I
started getting a parse error from inside the HF_Torus() macro on a test
case that had run cleanly for years.
The failing behavior, and the tendency to run cleanly again on most any
SDL change slowing down the parsing, is similar to the issue in this
thread. The new bit I see is an existing, macro-local, identifier
changing type from a 3D vector to a float - while the internal macro
loops run. This, of course, shouldn't happen, and I still don't know why
it does.
There's a good deal of expression parsing going on in the loops -
especially '.' vector element accesses. Most of that parsing is in
the parser code, but some of it is happening in the VM too, due to a
function call. Thinking aloud...
Bill P.
hi,
William F Pokorny <ano### [at] anonymousorg> wrote:
> On 3/26/23 02:46, William F Pokorny wrote:
> > A status update.
> ...
> The new bit I see is an existing, macro local, identifier changing type
> from a 3D vector to a float - while the internal macro loops run. This,
> of course, shouldn't happen and I still don't know why it is.
still, (much) more information than before. nice.
> There's a good deal of expression parsing going on in the loops -
> especially '.' vector element accesses. Most of that parsing being in
> the parser code, but some of it is happening in the VM too due a
> function call. Thinking aloud...
out of interest, do macros and their respective local storage form units
("objects"), or are they married up "on demand" ?
regards, jr.
From: William F Pokorny
Subject: Re: jr's large csv and dictionary segfault in the povr fork.
Date: 4 Oct 2023 14:12:02
Message: <651daaf2$1@news.povray.org>
On 10/4/23 11:36, jr wrote:
> out of interest, do macros and their respective local storage form units
> ("objects"), or are they married up "on demand" ?
Suppose more the latter. There isn't really local macro storage / a
local macro stack (excepting where VM functions are used with macros).
My current understanding; hopefully not too badly described.
There is a symbol table, local to each macro while it runs, for #local
declared things (*,**) and parameters (always true..?). The table
entries point to created / stored things which might or might not
persist beyond the macro call, depending on whether they are assigned to
an identifier in a calling level of the hierarchy(b).
---
Function calls, whether inside or outside macros, are different.
For inbuilt functions like f_sphere() there is a virtual machine (VM)
stack for passed and returned variables for each function call - and
another stack, used by the compiler, for C++ variables within the inbuilt
code.
For user functions (compiled at parse time) run on the VM, there is just
the (VM) stack (lies and more lies... I know) from the SDL user's
perspective.
Bill P.
(*) - Something I noticed on starting this recent debugging, and that
I've fixed in my povr copies of the HF* macros! These old HF* macros
switch from using #local to using #declare for some variables near the
bottom of each macro, for reasons unknown...

    ...
    - #declare PArr[J][K] = P + H*Dir*Depth;
    + #local PArr[J][K] = P + H*Dir*Depth;
    - #declare K = K+1;
    + #local K = K+1;
      #end
    - #declare J = J+1;
    + #local J = J+1;
      #end
      HFCreate_()
    ...

This makes for confusing code, but it doesn't break things, because the
identifier is seen as already defined locally... In other words, those
#declares don't create new 'global' identifiers; rather, they
redefine the locally defined identifier.
What this means more generally is that we cannot arbitrarily create an
identifier in the global name space with #declare where it has first
been defined (added to the symbol table) with #local in the local macro
scope.
(**) - Something still foggy for me. I 'believe' it is still true that
#local definitions sitting unwrapped by a macro in the top-level scene
file act as global #declares, but I've not tested this with the new
local and global dictionary access qualifiers in v3.8. Maybe at the
top-level scene file the local and global dictionaries become the same
thing?
William F Pokorny <ano### [at] anonymousorg> wrote:
> On 10/4/23 11:36, jr wrote:
> > out of interest, do macros and their respective local storage form units
> Suppose more the latter. There isn't really local macro storage / or a
> local macro stack (excepting where VM functions are used with macros).
hey, thanks.
makes one speculate that sometimes then perhaps it's something "silly", like
some pointer not updated (or updated incorrectly).
> ...
> identifier is seen as already defined locally... In other words, those
> #declares don't create new 'global' identifiers, but rather, they
> redefine the locally defined identifier.
> What this means more generally is we cannot arbitrarily create an
> identifier in the global name space with #declare where it has first
> been defined (added to the symbol table) with #local in the local macro
> space.
interesting, thanks. (I get much good info from your "musings" :-))
regards, jr.
From: William F Pokorny
Subject: Re: jr's large csv and dictionary segfault in the povr fork.
Date: 7 Oct 2023 08:22:55
Message: <65214d9f$1@news.povray.org>
On 10/4/23 06:50, William F Pokorny wrote:
> The new bit I see is an existing, macro local, identifier changing type
> from a 3D vector to a float - while the internal macro loops run. This,
> of course, shouldn't happen and I still don't know why it is.
OK. I finally ran this bug down!
For any POV-Ray fork, or official release, using clipka's newer
parser(*), change the line

    int Temp_Count=3000000;

to

    POV_LONG Temp_Count=9223372036854775807-1;

in the file parser.cpp, within Parser::Parse_RValue(...).
--- More detail for those interested.
There is code which counts the delta in the number of tokens found after
seeing callable identifiers / macro parameters. It involves the variable
Temp_Count, a variable I suspect was long ago initialized to a value
thought to be many times larger than probable configuration values for
TOKEN_OVERFLOW_RESET_COUNT.
The original coders made use of the token counter used for periodic
parser status messages, rather than the global token counter, when
calculating the delta in tokens parsed. This approach is a little ugly in
that it requires extra code to handle the reset/wrap cases, which happen
frequently when TOKEN_OVERFLOW_RESET_COUNT is relatively small.
Christoph, while otherwise nicely re-factoring / cleaning up the older
parser code, switched to using the global token counter straight up.
This global counter often runs over/past the default Temp_Count
initialization of three million.
So... Once in a thousand blue moons, we call the
Parser::Parse_RValue(...) code when the global token count is exactly
three million and the delta in tokens found happens to be one.
On stepping on that landmine, code which should not run, does. This
almost always results in an update to the wrong identifier type, and a
corruption of the identifier's associated data too.
A hideous bug. It's likely we didn't always know we'd tripped it. Only
when the parser core dumped or stopped on some parse error was it clear
something had gone wrong. I expect we all too often got weird parsing
behavior or an odd image result instead. On thinking it something we
did, we'd twiddle with the SDL, and with the updated SDL the bug would
go away - or, worse, perhaps move to another identifier with different
end effects.
Further, where the parser didn't stop, the issue would often self-heal
the next time the identifier was redefined / updated (as in a loop),
because the parser would realize the assigned value was indeed, say, a
vector, and not a float or whatever...
Animations - especially ones growing / changing frame to frame - are
more likely to trip this bug, simply for having more chances at it.
I was wrong that slowing down the parsing helped with this bug. I was
simply changing the SDL parser's token counts enough to avoid it.
Bill P.
(*) - The v3.8 beta code backed off to an older v3.7 / v3.8 version of
the parser which still used the token counter used for parser-message
updates on how many tokens have been parsed thus far. It should be OK
in default builds, excepting a couple VERY narrow exposures.
That said, I'd recommend all official code make the update above too!
The v3.8 beta 1&2 code (and I'd bet most official POV-Ray versions...)
are narrowly exposed:
- Should builders twiddle with the configuration variable
TOKEN_OVERFLOW_RESET_COUNT in unlucky ways.
- Should the token delta count align on an unfortunate harmonic with
the relatively low default value of 2500 for TOKEN_OVERFLOW_RESET_COUNT.
I think this not likely to happen in typical SDL.
- Should the type cast from the master POV_LONG (long long int) token
count to 'int' itself cause another problem - or still trigger this
one in some roundabout way.