POV-Ray : Newsgroups : povray.beta-test : crash in origin/master Server Time
29 Mar 2024 10:01:01 EDT (-0400)
  crash in origin/master (Message 11 to 17 of 17)  
<<< Previous 10 Messages Goto Initial 10 Messages
From: dick balaska
Subject: Re: crash in origin/master
Date: 13 Feb 2017 10:04:00
Message: <58a1cae0$1@news.povray.org>
Am 2017-02-12 19:14, also sprach clipka:
> Why do you ask? ;)

Mostly I'm irked at myself. This is the third reference to unix/readme 
in a week.  You'd think I'd figure out what that meant by now...


---
dik


Post a reply to this message

From: William F Pokorny
Subject: Re: crash in origin/master
Date: 13 Feb 2017 11:15:55
Message: <58a1dbbb$1@news.povray.org>
On 02/13/2017 06:35 AM, William F Pokorny wrote:
>
> FYI.
>
> Took an initial look and at, at least for the Ubuntu gnu debug compiles
> I have handy and current, I do not get the seg fault where the
> equivalent normal compile fails... Maybe an optimization tangled issue?
>
> I had a few existing normally compiled binaries handy from previous
> specific commits. The last that worked for me was commit 7d06e7f, Fri
> Sep 9 09:50:47 2016 +0200. The first that failed was commit c6a6e98, Dec
> 20 13:55:46 2016 +0100.
>
> I'll have to step out for a while, but I'll plan to pick this up later
> this morning after checking in here.
>
> Bill P.
>

OK. The trigger for this segmentation fault seems to be commit: c52c176 
"Implemented basic Wavefront OBJ import. (#90)" where we also made 
changes to:

source/core/bounding/boundingbox.cpp

and

source/core/bounding/boundingbox.h

The specific trigger is a change in a find_axis() for loop where the 
problem additions have been commented so things work:

     for(i = first; i < last; i++)
     {
         bbox = &(Finite[i]->BBox);

         if(bbox->lowerLeft[X] < mins[X])
             mins[X] = bbox->lowerLeft[X];

         if(bbox->lowerLeft[X] + bbox->size[X] > maxs[X])
             maxs[X] = bbox->lowerLeft[X]; // + bbox->size[X]; // <--

         if(bbox->lowerLeft[Y] < mins[Y])
             mins[Y] = bbox->lowerLeft[Y];

         if(bbox->lowerLeft[Y] + bbox->size[Y] > maxs[Y])
             maxs[Y] = bbox->lowerLeft[Y]; // + bbox->size[Y]; // <--

         if(bbox->lowerLeft[Z] < mins[Z])
             mins[Z] = bbox->lowerLeft[Z];

         if(bbox->lowerLeft[Z] + bbox->size[Z] > maxs[Z])
             maxs[Z] = bbox->lowerLeft[Z]; // + bbox->size[Z]; // <--
     }

I see no difference in results, but I do not know the bounding code well 
enough to know if just backing the above additions from c52c176 is safe.

The secondary bit of this tangle is needing -O2 or -O3 compiles with 
both the gnu compiler and clang to get the crash.

I found compiling everything normally with -03 then selectively 
re-compiling the files in just the source/backend/scene with -O makes 
the seg fault go away, but I am at a loss as to why.

The only code therein looking to be tangled with the bounding structure is:

view.cpp:bool View::CheckCameraHollowObject(const Vector3d& point, const 
BBOX_TREE *node)

which is in fact the code Dick saw in his fails.

Lastly, I've not been able to get a debug compile of any form to fail 
for me.

Where do we go from here with this one?

Regards, Bill P.


Post a reply to this message

From: dick balaska
Subject: Re: crash in origin/master
Date: 13 Feb 2017 11:47:43
Message: <58a1e32f$1@news.povray.org>
Am 2017-02-13 11:15, also sprach William F Pokorny:

>
> Lastly, I've not been able to get a debug compile of any form to fail
> for me.

Just --enable-debug does it for me. Without trying to disable -O3

-- 
dik


Post a reply to this message

From: clipka
Subject: Re: crash in origin/master
Date: 13 Feb 2017 14:03:14
Message: <58a202f2$1@news.povray.org>
Am 13.02.2017 um 17:15 schrieb William F Pokorny:

> OK. The trigger for this segmentation fault seems to be commit: c52c176
> "Implemented basic Wavefront OBJ import. (#90)" where we also made
> changes to:
> 
> source/core/bounding/boundingbox.cpp
> 
> and
> 
> source/core/bounding/boundingbox.h
> 
> The specific trigger is a change in a find_axis() for loop where the
> problem additions have been commented so things work:
> 
>     for(i = first; i < last; i++)
>     {
>         bbox = &(Finite[i]->BBox);
> 
>         if(bbox->lowerLeft[X] < mins[X])
>             mins[X] = bbox->lowerLeft[X];
> 
>         if(bbox->lowerLeft[X] + bbox->size[X] > maxs[X])
>             maxs[X] = bbox->lowerLeft[X]; // + bbox->size[X]; // <--
> 
>         if(bbox->lowerLeft[Y] < mins[Y])
>             mins[Y] = bbox->lowerLeft[Y];
> 
>         if(bbox->lowerLeft[Y] + bbox->size[Y] > maxs[Y])
>             maxs[Y] = bbox->lowerLeft[Y]; // + bbox->size[Y]; // <--
> 
>         if(bbox->lowerLeft[Z] < mins[Z])
>             mins[Z] = bbox->lowerLeft[Z];
> 
>         if(bbox->lowerLeft[Z] + bbox->size[Z] > maxs[Z])
>             maxs[Z] = bbox->lowerLeft[Z]; // + bbox->size[Z]; // <--
>     }

What exactly makes you think these changes have anything to do with the
segfault?

I very much suspect that instead of the changes causing the segfault,
the version prior to the commit just masked it somehow. One such
mechanism could be via simply shifting around code so that it reacts
differently to a highly unstable situation. Another mechanism (and the
one I'd currently put my money on) would be via creating a different
bounding box hierarchy.

If you have a closer look at the original `find_axis()` code, you'll
notice that it is very difficult to make heads or tails of the function
if you presume the implementation to be correct -- especially the lines
in question seem rather puzzling. If on the other hand you discard the
idea that the function is bug-free, and presume that the loop's original
intention was simply to set `mins` and `maxs` to the "bottom-left" and
"top-right" corner, respectively, of a box encompassing all the bounding
boxes iterated over, then things suddenly fall into place perfectly: The
function as a whole is then intended to compute the axis along which
that all-encompassing box extends the most. This also fits nicely with
the calling code.

If you go along with that interpretation, then the changes done in
c52c176 are a no-brainer bugfix; and in examining the calling code, it
must be concluded that although the original code would not have created
any truly broken bounding box hierarchies (in the sense that other code
would have gagged on them), it would in semi-rare cases have created
/different/ bounding box hierarchies than intended (and created by the
new version).


> I see no difference in results, but I do not know the bounding code well
> enough to know if just backing the above additions from c52c176 is safe.

The broken code has been undetected ever since its first implementation
in POV-Ray 2.0, so it sure doesn't cause artifacts. It might have an
impact on performance though.


> The only code therein looking to be tangled with the bounding structure is:
> 
> view.cpp:bool View::CheckCameraHollowObject(const Vector3d& point, const
> BBOX_TREE *node)
> 
> which is in fact the code Dick saw in his fails.

Indeed.

One interpretation would be that there is some other bug in the bounding
mechanism that only surfaces in rare circumstances, which happen to be
met in Dick's scene, but which used to be masked by the old bug in
`find_axis()`.

However, one thing that's seriously odd about this issue is that the
reported crash does /not/ happen during an attempt to access the
bounding hierarchy, but in an attempt to access
`viedData.GetSceneData()` -- which should work (or fail to do so)
totally independent of the bounding hierarchy. This is more indicative
of a totally wild pointer happening to trash `viewData`, or more
specifically the `sceneData` shared pointer therein. In that case, wish
us good luck trying to nail down the culprit, 'cause we'll need lots of it.

> Lastly, I've not been able to get a debug compile of any form to fail
> for me.
> 
> Where do we go from here with this one?

Good question. I haven't been able to trigger an error on Windows
either; neither with frame 4 (as suggested by the ini file provided by
Dick) nor with any frame from 1 to 153. And I don't think it makes much
sense to try the remaining 3000 frames of the animation.

One thing we might do is to go further back in the Git history, but use
the newer `find_axis()` code, in the hope that we might find the commit
that /really/ broke things.


Post a reply to this message

From: dick balaska
Subject: Re: crash in origin/master
Date: 13 Feb 2017 14:26:07
Message: <58a2084f@news.povray.org>
Am 2017-02-13 14:03, also sprach clipka:

> Good question. I haven't been able to trigger an error on Windows
> either; neither with frame 4 (as suggested by the ini file provided by
> Dick) nor with any frame from 1 to 153. And I don't think it makes much
> sense to try the remaining 3000 frames of the animation.

frame 4 was just the first frame that included those objects. I may have 
even ripped that logic out.  You won't get much else out of that code. 
I trimmed 175 source files down to 27 for this exercise.
I did strip the haus down to just the second floor (first floor!) and 
foundation; it might be funny to see what that actually looks like. ;)
I see no value in other frames. The tree is likely to parse out the same 
way for each frame.

I, too, have not been able to trigger an error on Windows.

-- 
dik


Post a reply to this message

From: William F Pokorny
Subject: Re: crash in origin/master
Date: 13 Feb 2017 16:40:57
Message: <58a227e9$1@news.povray.org>
On 02/13/2017 02:03 PM, clipka wrote:
>>
>> The specific trigger is a change in a find_axis() for loop where the
>> problem additions have been commented so things work:
>>
>>     for(i = first; i < last; i++)
>>     {
>>         bbox = &(Finite[i]->BBox);
>>
>>         if(bbox->lowerLeft[X] < mins[X])
>>             mins[X] = bbox->lowerLeft[X];
>>
>>         if(bbox->lowerLeft[X] + bbox->size[X] > maxs[X])
>>             maxs[X] = bbox->lowerLeft[X]; // + bbox->size[X]; // <--
>>
>>         if(bbox->lowerLeft[Y] < mins[Y])
>>             mins[Y] = bbox->lowerLeft[Y];
>>
>>         if(bbox->lowerLeft[Y] + bbox->size[Y] > maxs[Y])
>>             maxs[Y] = bbox->lowerLeft[Y]; // + bbox->size[Y]; // <--
>>
>>         if(bbox->lowerLeft[Z] < mins[Z])
>>             mins[Z] = bbox->lowerLeft[Z];
>>
>>         if(bbox->lowerLeft[Z] + bbox->size[Z] > maxs[Z])
>>             maxs[Z] = bbox->lowerLeft[Z]; // + bbox->size[Z]; // <--
>>     }
>
> What exactly makes you think these changes have anything to do with the
> segfault?
>

:-) Nothing except running the code as above makes the seg fault go 
away. It doesn't make sense to me either.

>
> Good question. I haven't been able to trigger an error on Windows
> either; neither with frame 4 (as suggested by the ini file provided by
> Dick) nor with any frame from 1 to 153. And I don't think it makes much
> sense to try the remaining 3000 frames of the animation.
>
> One thing we might do is to go further back in the Git history, but use
> the newer `find_axis()` code, in the hope that we might find the commit
> that /really/ broke things.
>

A good suggestion - thanks.

Let me try again to get some kind of debug version which seg faults - 
otherwise...

(Saw your comment Dick & thought I tried -O3 with debug symbols, but was 
running lots of various compiles this morning so maybe I slipped up.)

Bill P.


Post a reply to this message

From: William F Pokorny
Subject: Re: crash in origin/master
Date: 14 Feb 2017 08:27:15
Message: <58a305b3$1@news.povray.org>
On 02/13/2017 04:40 PM, William F Pokorny wrote:
>
> A good suggestion - thanks.
>
> Let me try again to get some kind of debug version which seg faults -
> otherwise...
>
> (Saw your comment Dick & thought I tried -O3 with debug symbols, but was
> running lots of various compiles this morning so maybe I slipped up.)
>
> Bill P.

After chasing a good many ghosts, this turns out to be another thread 
stack size issue. A stack overrun was causing the various strange results.

Posted a fix at: https://github.com/POV-Ray/povray/pull/235 against the 
3.7.1 release branch.

Note the AppVeyor build checks for the pull req are failing due a VS2015 
license expiring - ignore the red 'x'.

Bill P.


Post a reply to this message

<<< Previous 10 Messages Goto Initial 10 Messages

Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.