POV-Ray: Newsgroups: povray.beta-test: v3.8+ crackle instability (facets?) with >1 uses per thread.: Re: v3.8+ crackle instability (facets?) with >1 uses per thread.

POV-Ray : Newsgroups : povray.beta-test : v3.8+ crackle instability (facets?) with >1 uses per thread. : Re: v3.8+ crackle instability (facets?) with >1 uses per thread.		Server Time 3 Jul 2025 06:55:39 EDT (-0400)

From: William F Pokorny
Date: 11 Jun 2024 11:39:12
Message: <66686fa0$1@news.povray.org>

On 6/11/24 03:43, Thorsten wrote:
> Hi Bill,
> 
> the other issue to consider is that while there is no user interface for 
> it, in theory multiple renders of the same scene can run in parallel. 
> The actual solution to the whole problem is to keep the data needed not 
> only thread-local but look carefully at what is actually cached and then 
> ideally have it block local (also meaning, as with thread-local storage, 
> that the pattern changes with render block size) or even better pixel 
> local (no change with block size). To avoid the access to thread-local 
> storage, the whole rendering actually could be overhauled (which would 
> be good anyway) to move from a recursive to a stack based approach. That 
> way the needed local data could be (more easily) passed as argument down 
> to patterns ... but expect half a year full time to implement something 
> like this.
> 
> Thorsten

Hi Thorsten,

Thank you for your thoughts about the situation.

One thing I've not done is think about all patterns / perturbations / 
shape, thread caching with respect to overlapping in-thread storage use. 
In other words, what other problems like this might be sitting in the 
code today...

On the blocking, you got me thinking one nearer term option with crackle 
and facets might be to track the pattern / perturbation pointers 
themselves alongside the usual cube centers. In cases where we get a 
hit, but the pointers themselves don't match, we'd act like we missed 
and create a new cache entry. Rather than stick that 'overlapping hit' 
entry in the cache, we'd do the distance measures locally and discard 
the entry. Not optimal, more storage for the cache, but it would be 
better than just turning the cache off.

On storage block or pixel local/thread storage. Better I'd say, but so 
long as the patterns might share the storage, I think it still leaves us 
exposed given how the crackle / facets patterns work today.

Overhauling the rendering approach. Yeah, likely due and good, but not 
at all trivial as you say. I'm not myself sure how such a restructuring 
should look in total.

With the solver work I did now 5-6 years ago, I came to the conclusion a 
fused shape/solver approach would be far better given we are 
ray-tracing. See:

https://news.povray.org/povray.programming/thread/%3C5d0f64ff%241%40news.povray.org%3E/

When I think about really implementing that approach, I also start to 
think about how a different approach to parallelism than our block based 
approach could be good. One where we spin up the combined 
shape/solver(s) as processes to which we'd send batches of rays at a 
time and get back batches of intersections... Yeah, I'm practically 
dreaming, but pretty sure that sort of set up would be best for the 
merged uni-variate, polynomial solver/shape approach. How it well that 
structure would work overall - I'm not at all sure. :-)

As a practical near term solution, one thing I want to try is similar to 
what I did with the four ripple/wave value-pattern/normal-perturbation 
re-writes. I dumped already calculated locations for ones always 
calculated on the fly. At the default source location count of 10, the 
hit for not storing the locations was 20% give or take - IIRC.

If I can figure out a way to re-write the crackle and facets at that 
sort of performance hit, I'll probably just dump all the caching / 
thread local storage in total for local stack based storage.

Whether I can accomplish such a re-write - at a performance hit not too 
bad -is an open question at the moment. Not the least for the reason 
it's a chunk of work which well might not work out as a solution in the 
end - so I'm procrastinating.

Bill P.

Post a reply to this message