  Stack Size Testers Wanted (Message 1 to 4 of 4)  
From: clipka
Subject: Stack Size Testers Wanted
Date: 15 Feb 2019 19:40:46
Message: <5c675c0e$1@news.povray.org>
Hi folks,

I'm looking for guinea pigs for a particular test.


Here's the background:

A while ago we (well, some of you) were having issues with crashes due 
to insufficient thread stack size on Mac OS X, or in one case even on 
Linux; we solved these issues with a workaround to override the 
per-thread stack size (or, in the Linux case, increase that override).

In the meantime, some changes have been made to the code which should 
have reduced POV-Ray's stack requirements, so my hope is that the 
workaround is no longer necessary - which would be great news, because 
we could get rid of the entire boost thread library.
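
For readers unfamiliar with the kind of workaround meant here, the following is a minimal sketch of overriding a per-thread stack size. It is not POV-Ray's code: the actual workaround goes through the thread library mentioned above, while this sketch swaps in plain POSIX threads, and the names `kThreadStackSize`, `worker` and `runWithStackSize` are made up for illustration.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>

// Assumed value, mirroring the 512 KiB test size used in the steps below.
constexpr std::size_t kThreadStackSize = 512 * 1024;

// Trivial thread body: writes a marker value so the caller can see it ran.
static void* worker(void* arg)
{
    *static_cast<int*>(arg) = 42;
    return nullptr;
}

// Starts one thread with an explicit stack size and returns the size
// recorded in the attribute object (0 if any step failed).
std::size_t runWithStackSize(std::size_t requested, int& result)
{
    assert(requested > 0);
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0) return 0;

    std::size_t actual = 0;
    pthread_t thread;
    bool ok = pthread_attr_setstacksize(&attr, requested) == 0
           && pthread_attr_getstacksize(&attr, &actual) == 0
           && pthread_create(&thread, &attr, worker, &result) == 0;
    if (ok) pthread_join(thread, nullptr);

    pthread_attr_destroy(&attr);
    return ok ? actual : 0;
}
```

Calling `runWithStackSize(kThreadStackSize, result)` runs the worker on a thread whose stack was requested at 512 KiB. Without such an override, each thread simply gets the platform default.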


Here's where you come in:

If you have ever experienced problems related to the thread stack size, 
and still have a scene available that ran into these issues, then I'd 
like to enlist your help.

@dick balaska:

I'm not sure if you're aware, but I know you were affected; see your 
post on 2017-02-12 in povray.beta-test titled "crash in origin/master" 
(http://news.povray.org/povray.beta-test/thread/%3C58a1e32f%241%40news.povray.org%3E/)


Here's what I'd like you to do:


(1) Reproduce the old problem and workaround [optional]

- Grab the source code of a sufficiently OLD v3.8.0-alpha (BEFORE 
v3.8.0-alpha.9436902; anything built BEFORE December 2017 should do), 
v3.7.1-alpha/beta or even the latest v3.7.0.

- In the file `source/backend/configbackend.h`, place the following 
lines at the end of the file:

     #undef POV_THREAD_STACK_SIZE
     #define POV_THREAD_STACK_SIZE (512 * 1024) // 512 KiB

- Build POV-Ray.

- Run whatever scene you remember crashing on you.

- Verify that the scene does indeed crash. (If not, your scene does not 
seem to be a suitable test candidate.)

- If you want to go the extra mile, increase POV_THREAD_STACK_SIZE to 
see at which size the scene ceases to crash. (I recommend doubling the 
value; you shouldn't have to go any further than 8*1024*1024.)


(2) Test whether the workaround is still required

- Grab the source code of a sufficiently NEW v3.8.0-alpha (AT LEAST 
v3.8.0-alpha.9436902; anything built in 2018 or 2019 should do; 
I'd recommend the newest tagged alpha though).

- Apply the same changes to `source/backend/configbackend.h` as 
described above.

- Build POV-Ray.

- Run your test scene.

- Observe whether the scene crashes or not.

- If you want to go the extra mile, change POV_THREAD_STACK_SIZE to see 
at which size the behaviour changes: If the scene crashes, increase the 
value until it ceases to; if the scene does not crash, decrease the 
value until it does. (I recommend doubling / halving the value.)
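
The doubling / halving procedure can be sketched as a small search routine. This is only an illustration under assumptions: `crashes` is a hypothetical callback standing in for "rebuild with this POV_THREAD_STACK_SIZE, render the scene, and report whether it crashed", the function name and the 1 KiB lower bound are invented, and the search assumes that a scene which runs at one size also runs at every larger size.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Returns the smallest stack size tried that did not crash,
// or 0 if the scene still crashed at `maxSize`.
std::size_t findWorkingStackSize(std::size_t start,
                                 const std::function<bool(std::size_t)>& crashes,
                                 std::size_t minSize = 1024,
                                 std::size_t maxSize = 8 * 1024 * 1024)
{
    assert(start > 0 && minSize > 0);
    std::size_t size = start;

    if (crashes(size)) {
        // Crashing: double until the scene renders or the limit is reached.
        while (size < maxSize) {
            size *= 2;
            if (!crashes(size)) return size;
        }
        return 0;
    }

    // Not crashing: halve until it does, remembering the last good size.
    std::size_t lastGood = size;
    while (size / 2 >= minSize) {
        size /= 2;
        if (crashes(size)) break;
        lastGood = size;
    }
    return lastGood;
}
```

Starting from 512 * 1024, each step corresponds to one rebuild and one render, so the whole search takes only a handful of builds in either direction.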


(3) Report your observations.


Your help is very much appreciated!



From: William F Pokorny
Subject: Re: Fwd: Stack Size Testers Wanted
Date: 16 Feb 2019 10:17:34
Message: <5c68298e$1@news.povray.org>
On 2/15/19 7:44 PM, clipka wrote:
> Forgot to cross-post to `povray.beta-test`:
> 
> -------- Forwarded Message --------
> Subject: Stack Size Testers Wanted
> Date: Sat, 16 Feb 2019 01:40:46 +0100
> From: clipka <ano### [at] anonymousorg>
> Newsgroups: povray.macintosh
> 
> Hi folks,
> 
> I'm looking for guinea pigs for a particular test.
> 
> 
> Here's the background:
> 
> A while ago we (well, some of you) were having issues with crashes due 
> to insufficient thread stack size on Mac OS X, or in one case even on 
> Linux; we solved these issues with a workaround to override the 
> per-thread stack size (or, in the Linux case, increase that override).
> 
...

I've attached an attractors.zip file to a comment to the github issue:

https://github.com/POV-Ray/povray/issues/239

which is a related open issue. One which should be closed if recent 
updates have better addressed the problem.

The zip contains a set of test cases from ThH (Thorsten) at the bottom, 
should others want to test in their environment(s). Two or more of these 
still failed after the updates in early 2017, the third one down being 
one of them.

These now all run cleanly for me on Ubuntu 18.04 at master (v3.8) at 
commit 054e75c.

The third still fails for me when going back to the v3.7.1 release 
branch at commit 9808f53 (Jun 24 2017 with updates).

Christoph, remember that at some point we ran across some information 
suggesting Windows runs with stack monitoring and splits/grows the 
stack as needed. That is perhaps why Windows could always run with a 
smaller stack and why we never saw such failures there. Stack splitting 
on growth could also be turned on with the GNU compiler, at a 
performance hit. This suggested we might want to turn off such splitting 
on Windows for a performance gain. All of this comes from two-year-old 
looks though, and nothing was ever tried as far as I know, so who knows 
how things stand today. If it is just a Windows compiler setting for you 
though, it might be worth trying a Windows compile with splitting off, 
if we are in good shape thread-stack-size wise.

This leads me to close with being somewhat worried about the suggested 
test size of 512KB. We still have some largish stack allocations - the 
sturm solver, when we increased the max order from 15 to 35 (why?) in 
v3.7, now causes a 600KB plus (??? don't recall exactly) allocation on 
the stack - at a considerable performance penalty I'll add(1) - no 
matter the actual incoming equation order. This alone might cause your 
suggested size of 512KB to fail. Well, except perhaps on Windows, where 
presumably splitting is still in place.

I did not run master with 512KB as you suggested or a clang compile with 
the current larger default. I'm focused elsewhere at the moment and 
changes to the more common headers are painfully slow for me to compile 
and try.

Bill P.

(1) - My C++ ish attempts to address this have all been really slow or 
not worked at all when I try to get fancy/fast with raw memory 
allocations and pointers. C and some C++ compilers (IBM's XLC being one) 
support the needed dynamic structure array-size allocation mechanism 
with high performance. This is one reason - among others - why I'm 
toying with taking the common solver code back to straight C. We will 
soon, I think, move to a mixed C++/C mode in picking up FreeType 
which, as I remember, is actually a C library - so perhaps not so crazy.



From: clipka
Subject: Re: Fwd: Stack Size Testers Wanted
Date: 16 Feb 2019 16:39:11
Message: <5c6882ff$1@news.povray.org>
On 16.02.2019 at 16:17, William F Pokorny wrote:

> This leads me to close with being somewhat worried about the suggested 
> test size of 512KB.

The "target" size of 512 KiB is fixed - that's the per-thread stack 
size we would get on OS X by default.

> We still have some largish stack allocations - the 
> sturm solver, when we increased the max order from 15 to 35 (why?) in 
> v3.7, now causes a 600KB plus (??? don't recall exactly) allocation on 
> the stack - at a considerable performance penalty I'll add(1) - no 
> matter the actual incoming equation order. This alone might cause your 
> suggested size of 512KB to fail. Well, except perhaps on Windows, where 
> presumably splitting is still in place.

Where would those 600 kB be allocated?

The only out-of-the-ordinary local variable I see in all of the 
polynomial solver code is `sseq` in `polysolve()`, which is 36 elements 
of type `polynomial`, which in turn consists of 1 integer and 36 
doubles. That's about 10 kB, not 600.
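
That estimate can be checked with a few lines of C++. The struct below is an assumed layout reconstructed only from the description above (one integer plus 36 doubles); the names `Polynomial`, `kMaxOrder`, `kSseqBytes` and `fitsOnStack` are illustrative and need not match the actual source.

```cpp
#include <cstddef>

constexpr int kMaxOrder = 35;  // maximum polynomial order mentioned in this thread

// Assumed layout: one integer plus (kMaxOrder + 1) = 36 doubles.
struct Polynomial {
    int ord;
    double coef[kMaxOrder + 1];
};

// Size in bytes of a local array of 36 such elements, like `sseq`.
constexpr std::size_t kSseqBytes = sizeof(Polynomial) * (kMaxOrder + 1);

// True if the array alone fits into a stack of the given size.
constexpr bool fitsOnStack(std::size_t stackBytes)
{
    return kSseqBytes < stackBytes;
}

static_assert(fitsOnStack(512 * 1024), "sseq should fit easily into 512 KiB");
```

With 8-byte doubles, `kSseqBytes` comes out at roughly 10 KiB, padding included, which is nowhere near 600 kB.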

> (1) - My C++ ish attempts to address this have all been really slow or 
> not worked at all when I try to get fancy/fast with raw memory 
> allocations and pointers.

If the variable is local to a function that is guaranteed to never be 
called recursively, the easiest solution to guarantee speedy allocation 
and not hogging the stack is to change the variable declaration to 
`thread_local`. This is essentially the same as `static`, but with each 
thread having its own copy of the variable.
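
A minimal sketch of that idea, with made-up names (`Scratch`, `scratchForThisThread`, `buffersDifferAcrossThreads`) and an arbitrary buffer size rather than anything taken from the solver code:

```cpp
#include <cassert>
#include <thread>

// Hypothetical stand-in for a large scratch buffer that used to be a plain local.
struct Scratch {
    double data[36 * 37];
};

// Returns the address of the calling thread's private scratch buffer.
// Must not be called recursively: all calls on one thread share one buffer.
Scratch* scratchForThisThread()
{
    thread_local Scratch scratch;  // one instance per thread, not on the call stack
    return &scratch;
}

// Checks that a second thread gets a buffer distinct from the caller's.
bool buffersDifferAcrossThreads()
{
    Scratch* mine = scratchForThisThread();
    assert(mine != nullptr);
    Scratch* theirs = nullptr;
    std::thread t([&] { theirs = scratchForThisThread(); });
    t.join();
    return theirs != nullptr && theirs != mine;
}
```

The buffer is set up once per thread instead of once per call, and it no longer counts against the thread's stack budget.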

> C and some C++ compilers (IBM's XLC being one) 
> support the needed dynamic structure array-size allocation mechanism 
> with high performance. This is one reason - among others - why I'm 
> toying with taking the common solver code back to straight C. We will 
> soon, I think, move to a mixed C++/C mode in picking up FreeType 
> which, as I remember, is actually a C library - so perhaps not so crazy.

Veto to that: All of POV-Ray should be valid C++11 code, so that POV-Ray 
can be compiled with only a single C++ compiler. 3rd party libraries are 
another matter; we expect those to be present in binary form (except on 
Windows, where we expect their source code to be compatible with MSVC's 
dialect of C/C++), and the headers of modern C libs are almost 
invariably designed as hybrid C/C++ headers, in the sense that they use 
pre-processor features to behave as either C or C++ code, depending on 
the compiler used.
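
The usual shape of such a hybrid header is sketched below. `hybrid_add` and `checkHybrid` are hypothetical names, not part of any real library, and the definition is placed inline only to keep the example self-contained.

```cpp
#include <cassert>

/* Declarations as a hybrid header would carry them: plain C when compiled
   as C, C linkage when compiled as C++. */
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical library function. */
int hybrid_add(int a, int b);

#ifdef __cplusplus
}  /* extern "C" */
#endif

/* Definition; in a real library this would live in the library's own
   source file and reach us in binary form. */
int hybrid_add(int a, int b)
{
    return a + b;
}

/* Quick self-check that the function is callable from this translation unit. */
void checkHybrid()
{
    assert(hybrid_add(2, 3) == 5);
}
```

The `__cplusplus` macro is defined only by C++ compilers, so a C compiler never sees the `extern "C"` wrapper.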

There is currently only one genuine C source file in POV-Ray proper, 
namely `povms.c`, and even that one is designed as a C/C++ hybrid and 
compiled by including it from a `.cpp` file. (The rationale for this C 
file's existence being that it - in theory - allows 3rd party genuine C 
programs to drive the POV-Ray back-end via POVMS.)



From: clipka
Subject: Re: Fwd: Stack Size Testers Wanted
Date: 16 Feb 2019 20:53:51
Message: <5c68beaf$1@news.povray.org>
On 16.02.2019 at 16:17, William F Pokorny wrote:

> I've attached an attractors.zip file to a comment to the github issue:
> 
> https://github.com/POV-Ray/povray/issues/239

Thanks - yes, that seems to be the right stuff for testing this.

To reduce parsing and rendering time, I'm testing a slightly modified 
version with 100 instead of 1000 iterations, i.e.:

clifford (1.6,1.9, 0.5,-1.4, -0.4,-0.1, -1.6,1.8, 100, 0.0005, 0.00375, 
cliff_mat, cliff_trans)


On my Windows machine and the x64 Debug version (which should be a bit 
more memory-hungry than the others), I've just compared commit accc779d 
(which did away with the `FixedSimpleVector`, back in 2017-11-10) with 
the commit immediately before, with the following results:

- Before: Required a POV_THREAD_STACK_SIZE of 2 MiB.
- After: Requires a lousy 16 KiB.

I also ran the last test with 1000 iterations just for giggles, with the 
same result: 16 KiB was still sufficient.


I also see no trouble with the Sturmian root solver at 16 KiB (tested 
with `objects/torus1.pov` sample scene, with `sturm` added to torus 
definitions).


So, bottom line: Judging from this first test, even OS X's 512 KiB 
should be enough for everybody now ;)



Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.