POV-Ray : Newsgroups : povray.unofficial.patches : jr's large csv and dictionary segfault in the povr fork.
From: William F Pokorny
Subject: jr's large csv and dictionary segfault in the povr fork.
Date: 26 Mar 2023 02:46:56
Message: <641fea60$1@news.povray.org>
A status update.

---
First, let me put on the table that the segfault which happens here should
not be a segfault during a non-debug run! Someone had already added
comments to this effect in the cylinder code.

It's good fortune - of sorts - that we get the segfaults in povr, but it's 
only happening because no POV-Ray source coder got around to changing 
that throw into a warning or a possible error message. ;-)

---
I've had no luck finding the core problem.

The segfault comes and goes, seemingly depending on a lot of things - 
which makes me wonder if it isn't sitting there in official POV-Ray 
releases too and we've just not tripped it.

The really odd failures I've seen, I've been unable to reproduce even 
once. At the moment I'm chalking those up to my fatigue and potential 
configuration and setup issues as I tried to run through just parts of 
the animation.

---
I now have a changed foreach.inc file where I've added checks for 
duplicate macro-to-be-executed strings and duplicate i_ indexes; it 
has been working for me without fail for a while!

I suspect it's because I've slowed a whole animation down to almost 
exactly the time it takes p380 beta 2 to do it - and I've never seen a 
fail there either.

The code changed is now:

#macro fore_voidRun(a_)
   #for (i_, 0, dict_.ttl_ - 1)

     // After adding this check, I've not seen a fail myself.
     #ifndef (Last_i_)
         #declare Last_i_ = i_;
     #else
         #if (Last_i_=i_)
             #local DummyID = f_boom(9,8,7,6,5,4);
         #end
         #declare Last_i_ = i_;
     #end

     #if (i_)
       fore_cmdNext(dict_)
     #end
     #local cmd_ = fore_cmdStr();

     // Below code found a duplicate once - and only once.
     #ifndef (LastCmdStr)
         #declare LastCmdStr = cmd_;
     #else
         #if (strcmp(LastCmdStr,cmd_)=0)
             #local DummyID = f_boom(1,2,3,4,5,6);
         #end
         #declare LastCmdStr = cmd_;
     #end

     fore_exec(cmd_,"parse_fore_void.tmp")
     #if (fore_debug)
       #debug concat("called '",cmd_,"'.\n")
     #end
   #end
#end

So it feels like a race issue of some kind, but I cannot see how it can 
even come about... Each frame is indeed a new parser thread, but nothing 
from the prior frame's thread should carry forward. I thought for a 
while it was a file I/O issue, but one fail on the macro string checking 
suggests it does have something to do with the dictionaries or the 
dictionary set up, as you, jr, suspected.

---
On the animation itself. We are basically re-building the scene for each 
frame of the animation, with early frames having very few objects and the 
last ones almost 9000. A curiosity I don't understand is that the 
usaf__mkItems(4130,a_[4130],tmp_) package indexing sometimes runs well 
past 10000. These often peak at those levels and then are not seen 
again - maybe these all come from the partial runs where the animation 
re-starts? I've done so many of these things my head is mush on 
what's what.

Anyway, the whole code setup gets slower and slower as we get into the 
later frames. Some of that is because the parsing just gets big/long, but 
it's also because the dictionary gets slower. The dictionary is built on 
top of the parser symbol table mechanism and, while I've not dug in to be 
certain, my bet is some of the chains hanging off the hash table entries 
are getting long and slow.

I don't know enough about what is changing in the animation to suggest 
it, but, if there are larger fixed portions on which we are building and 
building, it would almost certainly be faster to keep those fixed parts 
as larger include files we bring in all at once - as raw SDL.

I suppose the suggestion is neither here nor there with respect to the 
problem - which is certainly real.

I'll keep playing with it as I have time - and feel like it - but 
given that even starting at, say, frame 3000 isn't saving me all that 
much time for each animation turn, this is a very painful bug to chase.

My plan at the moment is to kick off the full hour-plus-long animation 
(tiny image sizes) with the added delay in place - whenever I think to 
do it. At day's end, perhaps.

It's not failed in a long while for me, but it's slow even at tiny image 
sizes so I haven't really run that many full passes.

If it doesn't fail again for me over time, the delay / race condition 
idea perhaps holds enough water for me to start putting in random locks 
where I can see places to insert them. The trouble has been that I 
perturb things and the problem blinks in and out for reasons I cannot 
pin down - if they are even my doing. I cannot come up with a solid way 
to come at this problem! It's too flaky - and it takes a painful amount 
of time to try anything.

Yes, I'm whining now and should stop. :-)

Bill P.



From: jr
Subject: Re: jr's large csv and dictionary segfault in the povr fork.
Date: 26 Mar 2023 06:00:00
Message: <web.6420164299c558654301edef6cde94f1@news.povray.org>
hi,

will (have to) re-read the whole post later.

William F Pokorny <ano### [at] anonymousorg> wrote:
> ...
> The segfault comes and goes ...
> The really odd fails seen, ...
> thought for a while it was file I/O issue, but one fail on the macro
> string checking suggests it does have something to do with the
> dictionaries or the dictionary set up as you, jr, suspected.

wondering whether it's possible we're "out-running" the (C++) garbage collector,
and when the macro is called, it then sees "inappropriate" memory on occasion?


regards, jr.



From: William F Pokorny
Subject: Re: jr's large csv and dictionary segfault in the povr fork.
Date: 26 Mar 2023 09:47:08
Message: <64204cdc$1@news.povray.org>
On 3/26/23 05:55, jr wrote:
> wondering whether it's possible we're "out-running" the (C++) garbage collector,
> and when the macro is called, it then sees "inappropriate" memory on occasion?

There isn't a garbage collector in C++/C. Memory is allocated when 
needed and released when no longer needed.

Your question, though, is still a good one. It takes considerable time to 
walk through all the allocations and free them, more or less unwinding 
all the allocations(a). Nothing new parser-related should proceed until 
the memory is freed, but...

Today the memory free-up and re-allocation happens in a big way when we 
move frame to frame in an animation, because one parsing thread goes 
away(a) and another gets created for the next frame. Parsing itself is 
always single threaded, unlike most other parts of POV-Ray, so we should 
not see multi-threading issues per se.

What I too suspect is that we are perhaps sometimes seeing not quite (or 
perhaps incorrectly) initialized new parser memory that still contains 
data from the previous parser thread. This could explain why, once we see 
fail points, they sometimes repeat that fail signature for a while.

Aside: I've gotten another two complete povr animation passes through 
with those changes to foreach.inc. Magic, but still real magic! FWIW. :-)

Bill P.

(a) - Back in my working years we were using a large, internally 
developed, interactive tool. On its conversion to C++ we got frustrated 
because it took forever to exit the application as the memory was 
painstakingly released bit by bit. The developers solved the problem by 
intentionally crashing out of the application and letting the OS clean 
up the process-related memory! ;-)

Anyhow. There is a performance cost to maintaining a, sort of, minimum 
memory footprint over time (as there is too for garbage-collected 
memory management when it kicks in). I've wondered how much time we are 
burning doing memory management alone. Plus C++, because it tends to 
allocate as needed, ends up with bits and pieces of things all over the 
place in physical memory, where it would be much better for performance 
if related memory were allocated (or re-allocated) in big contiguous 
blocks. Newer C++ versions have features intended to help with this 
memory fragmentation issue. Ah, whatever I guess. All still well down on 
my todo / to-play-with list.



From: jr
Subject: Re: jr's large csv and dictionary segfault in the povr fork.
Date: 26 Mar 2023 13:40:00
Message: <web.6420829499c558654301edef6cde94f1@news.povray.org>
hi,

William F Pokorny <ano### [at] anonymousorg> wrote:
> ...
> Aside: I've gotten another two complete povr animation passes through
> with those changes to foreach.inc. Magic, but still real magic! FWIW. :-)

quick <blush> in public..  :-)


regards, jr.



Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.