  povray vs uberpov am3 (Message 21 to 30 of 33)  
From: jr
Subject: Re: povray vs uberpov am3
Date: 18 Sep 2020 20:25:01
Message: <web.5f654f09aa461a8c4d00143e0@news.povray.org>
"jr" <cre### [at] gmailcom> wrote:
> Ash Holsenback <no### [at] spamcom> wrote:
> > On 9/18/20 7:22 AM, William F Pokorny wrote:
> > > On 9/18/20 6:36 AM, William F Pokorny wrote:
> > > ...
> > > and the cpu time jumps.
> > > ...
> fwiw, ...

no sleep.  :-(  so, if you (WFP)/anyone is interested, I can add to the ..
mystery.

I have the same alpha.10064268 installed on two machines, and I mean the same.
both built from the same tarball using nearly identical package build scripts.
on one machine, as posted, I see a decrease in time by (just) better than 50%
for the coloured background.  on the other, the alpha behaves as you describe,
time up from seconds to over a quarter of an hour.  wow.  I also tried two older
3.8.0-alphas (9893777 and 9945627) on a third box, they too run faster by around
50% when colour is used.  one thing I noticed is that where time decreases, on
the older boxes, the "CPU detected" reads "Intel,SSE2,AVX", whereas for the ..
misbehaving alpha (newer box) it's "Intel,SSE,AVX,AVX2,FMA3".  no idea whether
that is relevant, still, it'd be interesting to know what CPU flags others see.
anyway, as Kenneth intimated, computing, eh? -- all good fun.


regards, jr.


From: William F Pokorny
Subject: Re: povray vs uberpov am3
Date: 19 Sep 2020 08:22:46
Message: <5f65f816$1@news.povray.org>
On 9/18/20 8:21 PM, jr wrote:
> "jr" <cre### [at] gmailcom> wrote:
...
> 
> no sleep.  :-(  so, if you (WFP)/anyone is interested, I can add to the ..
> mystery.
> 
> I have the same alpha.10064268 installed on two machines, and I mean the same.
> both built from the same tarball using nearly identical package build scripts.
> on one machine, as posted, I see a decrease in time by (just) better than 50%
> for the coloured background.  on the other, the alpha behaves as you describe,
> time up from seconds to over a quarter of an hour.  wow.  I also tried two older
> 3.8.0-alphas (9893777 and 9945627) on a third box, they too run faster by around
> 50% when colour is used.  one thing I noticed is that where time decreases, on
> the older boxes, the "CPU detected" reads "Intel,SSE2,AVX", whereas for the ..
> misbehaving alpha (newer box) it's "Intel,SSE,AVX,AVX2,FMA3".  no idea whether
> that is relevant, still, it'd be interesting to know what CPU flags others see.
> anyway, as Kenneth intimated, computing, eh? -- all good fun.
> 
> 

Jim, and all, thanks for the data. And jr, especially, for picking up the
"Intel,SSE,AVX,AVX2,FMA3" bit. It is what I see here for run time
optimization. I've hit a few other cases where - for whatever reason -
that run time CPU optimization isn't faster for me, though usually it
is. Nothing previously was anywhere near as dramatic a slowdown.

I've never looked over Christoph's am3 code. Guess I'll go take a peek.

Bill P.


From: William F Pokorny
Subject: Re: povray vs uberpov am3
Date: 19 Sep 2020 10:08:46
Message: <5f6610ee$1@news.povray.org>
On 9/19/20 8:22 AM, William F Pokorny wrote:
...
> 
> I've never looked over Christoph's am3 code. Guess I'll go take a peek.
> 

Hey jr, given you have a machine which shows the slowdown and one which
does not, would you be willing to add the following test in compiles for
both machines? Then try white/black and white/color renders on both?

In tracetask.cpp, at about line 764, insert code so things look like:

double cf = confidenceFactor[neighborSamples-1];
PreciseRGBTColour sqrtvar = Sqrt(variance);

if (variance.red()<0)
     throw POV_EXCEPTION_STRING("kaboom");

PreciseRGBTColour confidenceDelta = sqrtvar * cf;

I think in povr I've fixed the stats - so your line counts might be a 
little different.

I hacked the up front confidence array (as vector) to something simpler 
so my b/w runs in about 2 seconds and my color one in 34s. Stats now 
show for the two renders:

Pixels: 360000   Samples:  609399 Smpls/Pxl: 1.69
and
Pixels: 360000   Samples: 9511337 Smpls/Pxl: 26.42

Bill P.


From: jr
Subject: Re: povray vs uberpov am3
Date: 19 Sep 2020 10:55:01
Message: <web.5f661b45aa461a8c4d00143e0@news.povray.org>
hi,

William F Pokorny <ano### [at] anonymousorg> wrote:
> On 9/19/20 8:22 AM, William F Pokorny wrote:
> ...
> Hey jr, Given you have a machine which shows the slow down and one which
> does not, would you be willing to add the following test in compiles for
> both machines? Then try white/black and white/color renders on both?
>
> In tracetask.cpp, at about line 764, insert code so things look like:
>
> double cf = confidenceFactor[neighborSamples-1];
> PreciseRGBTColour sqrtvar = Sqrt(variance);
>
> if (variance.red()<0)
>      throw POV_EXCEPTION_STRING("kaboom");
>
> PreciseRGBTColour confidenceDelta = sqrtvar * cf;

sure, will find time in the next few days.  (will I need line above and below
for context or is it "self-evident"?  :-))

a specific "white/black and white/colour" test scene or are you referring to
Ton's code modified?

> ...


regards, jr.


From: William F Pokorny
Subject: Re: povray vs uberpov am3
Date: 19 Sep 2020 13:14:29
Message: <5f663c75$1@news.povray.org>
On 9/19/20 10:52 AM, jr wrote:
...
>> In tracetask.cpp, at about line 764, insert code so things look like:
>>
>> double cf = confidenceFactor[neighborSamples-1];
>> PreciseRGBTColour sqrtvar = Sqrt(variance);
>>
>> if (variance.red()<0)
>>       throw POV_EXCEPTION_STRING("kaboom");
>>
>> PreciseRGBTColour confidenceDelta = sqrtvar * cf;
> 
> sure, will find time in the next few days.  (will I need line above and below
> for context or is it "self-evident"?  :-))

The lines above and below the test and throw are provided for context.

> a specific "white/black and white/colour" test scene or are you referring to
> Ton's code modified?

Ton's code modified for color and not. What you were running on both 
machines.

> 
>> ...
> 

Thanks. It's less a priority 4 hours later :-), but I would like to see 
the results.

--------
Both implementations have the same basic problem: the occasional square
root of a negative value (a domain error). Both would kinda / sorta work
regardless. Depending on compiler version, flag settings or maybe machine
type, it seems that on a domain error you might not break on the color
channel threshold and instead only break on hitting the max samples check.

To fix this, those of you compiling at home can change the code in
tracetask.cpp to look like:

if (samples >= minSamples)
{
     if (samples >= maxSamples)
         break;

     PreciseRGBTColour variance =
         (neighborSumSqr - Sqr(neighborSum)/neighborSamples)
         / (neighborSamples-1); // Sometimes very slightly neg
     variance = PreciseRGBTColour(           // Fix
                 max(0.0,variance.red()),    // Fix
                 max(0.0,variance.green()),  // Fix
                 max(0.0,variance.blue()),   // Fix
                 max(0.0,variance.transm())  // Fix
             );                              // Fix
     double cf = confidenceFactor[neighborSamples-1];
     PreciseRGBTColour sqrtvar = Sqrt(variance);
     PreciseRGBTColour confidenceDelta = sqrtvar * cf;
     if (confidenceDelta.red() +
         confidenceDelta.green() +
         confidenceDelta.blue() +
         confidenceDelta.transm() <= threshold)
         break;
}
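
For anyone wanting to see the failure mode in isolation, here is a minimal standalone sketch. It assumes plain doubles in place of PreciseRGBTColour, and the helper names onePassVariance and stopsOnThreshold are made up for illustration; they are not POV-Ray functions. The point is only that rounding in the one-pass formula can leave the variance a hair below zero, std::sqrt then returns NaN, and any comparison against NaN is false, so the threshold break never fires.

```cpp
#include <cmath>

// Hypothetical one-channel version of the variance expression above:
// (sum of squares - square of sum / n) / (n - 1).
// With near-identical samples the subtraction can round to a tiny negative value.
double onePassVariance(double sum, double sumSqr, int n)
{
    return (sumSqr - sum * sum / n) / (n - 1);
}

// Hypothetical one-channel version of the stop test.
// Returns true when sampling would stop on the confidence threshold.
bool stopsOnThreshold(double variance, double cf, double threshold)
{
    double confidenceDelta = std::sqrt(variance) * cf; // NaN when variance < 0
    return confidenceDelta <= threshold;               // false whenever the delta is NaN
}
```

With a variance of, say, -1e-17 the second function returns false no matter how large the threshold is, which leaves the max samples check as the only way out of the loop (assuming IEEE floating point and no fast-math style flags).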

Ton, The reason you need smaller thresholds in v3.8 for equivalent 
result in uberpov is the uberpov threshold test was:

if ((confidenceDelta.red()    <= threshold) &&
     (confidenceDelta.green()  <= threshold) &&
     (confidenceDelta.blue()   <= threshold) &&
     (confidenceDelta.transm() <= threshold))
     break;
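
To make the difference between the two stop tests concrete, here is a small sketch assuming a plain four-element array in place of the colour class. Delta, stopSummed and stopPerChannel are hypothetical names used only for this illustration.

```cpp
#include <array>

// Hypothetical stand-in for the four channels: red, green, blue, transm.
using Delta = std::array<double, 4>;

// Summed form, as in the fixed code above: the channel deltas are added
// together and the total is compared against the threshold.
bool stopSummed(const Delta& d, double threshold)
{
    return d[0] + d[1] + d[2] + d[3] <= threshold;
}

// Per-channel form, as in the uberpov test above: every channel is compared
// against the threshold on its own.
bool stopPerChannel(const Delta& d, double threshold)
{
    return d[0] <= threshold && d[1] <= threshold &&
           d[2] <= threshold && d[3] <= threshold;
}
```

For deltas of 0.02, 0.02, 0.02 and 0.0 with a threshold of 0.05, the per-channel test stops sampling while the summed test keeps going, so the same threshold value does not mean the same thing in the two code bases.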

With the fix in povr I can get pretty good white on black, or white on
Acajou results in roughly 16 seconds and 18 seconds, respectively. 
Similar quality white on black results took me nearly 20 minutes without 
the fix above.

The command lines I used were:

povr2 +d +p +q9 +am3 +a0.05 +ac0.9 +ss123456 +r6 +w600 +h600 ...
povr2 +d +p +q9 +am3 +a0.025 +ac0.9 +ss123456 +r6 +w600 +h600 ...

The color threshold does need to be smaller for a similar method3 result
and, on thinking about it, this makes sense. The absolute color
difference white to the background is reduced. I'll post an image of the
two results to povray.binaries.

Aside: There are comments in the code related to allowing a min sampling 
of other than 1, which I think could be helpful, but I don't immediately 
see how to implement it.

---------
So, all good right? Nope... We have an underlying problem at which I've
yet to look. We have a color template class with a Sqrt method that
doesn't handle negative color values. POV-Ray allows negative color
values. Do we update it to return negative square roots, i.e. take
sqrt(abs()) and flip the result negative on negative inputs? What might
changing this method mean to the code base as a whole? I don't know.
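
One possible reading of the sqrt(abs()) idea, sketched on a single double channel rather than the real color template class. signedSqrt is a hypothetical name, and whether this behavior is actually wanted across the code base is exactly the open question.

```cpp
#include <cmath>

// Hypothetical helper, not part of POV-Ray: square root that keeps the sign
// of its input, so a negative channel value yields a negative root instead
// of NaN.
double signedSqrt(double x)
{
    return std::copysign(std::sqrt(std::fabs(x)), x);
}
```

Under this sketch signedSqrt(4.0) is 2.0 and signedSqrt(-4.0) is -2.0.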

Suppose we might want a maxZero color vector method too - if that is a
common need.
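
A rough sketch of what such a method could look like, assuming a plain four-element array instead of the actual color class. Colour4 and this free-function form of maxZero are hypothetical; it does the same clamping as the max(0.0, ...) lines in the fix above.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical stand-in for a four-channel colour: red, green, blue, transm.
using Colour4 = std::array<double, 4>;

// Clamp every channel to be no smaller than zero.
Colour4 maxZero(const Colour4& c)
{
    Colour4 out{};
    for (std::size_t i = 0; i < c.size(); ++i)
        out[i] = std::max(0.0, c[i]);
    return out;
}
```

Applied to the variance before the Sqrt call, this would replace the five "Fix" lines with a single call.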

Bill P.


From: jr
Subject: Re: povray vs uberpov am3
Date: 19 Sep 2020 14:55:00
Message: <web.5f665310aa461a8c4d00143e0@news.povray.org>
hi,

William F Pokorny <ano### [at] anonymousorg> wrote:
> On 9/19/20 10:52 AM, jr wrote:
> ...
> > sure, will find time in the next few days.  (will I need line above and below
> > for context or is it "self-evident"?  :-))
>
> The lines above and below the test and throw are provided for context.
>
> > a specific "white/black and white/colour" test scene or are you referring to
> > Ton's code modified?
>
> Ton's code modified for color and not. What you were running on both
> machines.

yeah, "white" being the line colour sunk in just after posting.  :-)

>
> Thanks. It's less a priority 4 hours later :-), but I would like to see
> the results.

excellent.  of course.

> ...


regards, jr.


From: Ton
Subject: Re: povray vs uberpov am3
Date: 19 Sep 2020 21:20:01
Message: <web.5f66ad8daa461a8c6c5bf0280@news.povray.org>
Thanks Bill.

I modified the uberpov and povray sources with your modification, and especially
povray renders a lot faster, from 235 seconds to 16. The resulting image, to me,
looks the same.

Once again thanks for your effort.

Cheers
Ton.


From: jr
Subject: Re: povray vs uberpov am3
Date: 21 Sep 2020 17:30:05
Message: <web.5f691abaaa461a8c4d00143e0@news.povray.org>
"jr" <cre### [at] gmailcom> wrote:
> William F Pokorny <ano### [at] anonymousorg> wrote:
> > On 9/19/20 10:52 AM, jr wrote:
> > ...
> > Thanks. It's less a priority 4 hours later :-), but I would like to see
> > the results.
>
> excellent.  of course.

applied your fix on the newer machine, and now see (there, too) a decrease in
time of better than 50%; b/w - 16.105 cpu-seconds, c/w - 6.877 (res 960x720).
the other machine needs cleaning out (physically, runs hotter than I like)
before I get to patching POV-Ray, can post edited transcript of respective runs
in a few days, if you still want "full details".


regards, jr.


From: William F Pokorny
Subject: Re: povray vs uberpov am3
Date: 22 Sep 2020 05:50:09
Message: <5f69c8d1$1@news.povray.org>
On 9/21/20 5:27 PM, jr wrote:
> "jr" <cre### [at] gmailcom> wrote:
...
> before I get to patching POV-Ray, can post edited transcript of respective runs
> in a few days, if you still want "full details".
> 

We can let it go.

Good to see the performance improvements(1). I was still curious about 
the color channel < 0 throw results on both machines. But, the 
performance improvements from you and Ton I think pretty clearly 
indicate the domain error is the start of the problem on all 
machine/compiler/os combos and what happens after we can take as 
implementation differences for undefined behavior.

(1) - After posting, I played with many renders at different settings
using Ton's test case. It's hard to get rid of the last little open bits
without taking more initial samples. Changes in the seed move them
around and you could then merge multiple images, but that's ugly. The
only somewhat reliable way I could get there in POV-Ray was by rendering
a larger image. Adding the capability for min samples >1 to am3 is
necessary, I think, for a fully fleshed out implementation. Someday...

Bill P.


From: jr
Subject: Re: povray vs uberpov am3
Date: 22 Sep 2020 08:35:09
Message: <web.5f69ef05aa461a8c4d00143e0@news.povray.org>
hi,

William F Pokorny <ano### [at] anonymousorg> wrote:
> On 9/21/20 5:27 PM, jr wrote:
> > "jr" <cre### [at] gmailcom> wrote:
> ...
> > before I get to patching POV-Ray, can post edited transcript of respective runs
> > in a few days, if you still want "full details".
>
> We can let it go.
>
> Good to see the performance improvements(1). I was still curious about
> the color channel < 0 throw results on both machines.

:-)  I'll post post-patch results for both machines in the coming days.  agree
though, a very worthwhile performance boost (and adding that single line was
easy to boot), thank you.

> .... The
> only somewhat reliable way I could get there in POV-Ray was by rendering
> a larger image. ...

yeah, learned that from NK's comments.  :-)


regards, jr.



Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.