POV-Ray : Newsgroups : povray.beta-test : thread count rendertimes unpredictable?
  thread count rendertimes unpredictable? (Message 1 to 10 of 18)  
From: Bob Hughes
Subject: thread count rendertimes unpredictable?
Date: 18 Jan 2009 19:27:35
Message: <4973c8f7$1@news.povray.org>
I was comparing beta 30 and the radiosity-changed exe for any differences 
when using thread count 1 or 2 and found that render time varies depending on 
scene file complexity or animations. I was wondering if thread count was being 
checked by others.

Getting a 160% slower render sometimes when using Work_Threads=2 versus 1, 
and other times 60% (much faster). In the past I had been making sure to 
keep the threads at only 1 so renders wouldn't slow down on larger sized 
images or animations. I think there might have been a change but couldn't be 
sure about that.

Right now it seems to be unpredictable instead of guaranteeing a speed-up by 
going with more than 1 thread. Thought maybe this hasn't been discussed 
recently so wanted to bring it up again.

POV for Windows (SSE2's and non-SSE2's), Vista and AMD X2 dual core

Bob



From: Chambers
Subject: Re: thread count rendertimes unpredictable?
Date: 18 Jan 2009 19:42:24
Message: <9E363925719942B18BA14BF08FF37968@HomePC>
Could you post what times you get with specific scenes?

...Ben Chambers
www.pacificwebguy.com



From: Bob Hughes
Subject: Re: thread count rendertimes unpredictable?
Date: 19 Jan 2009 04:36:31
Message: <4974499f$1@news.povray.org>
"Chambers" <ben### [at] pacificwebguycom> wrote in message 
news:9E363925719942B18BA14BF08FF37968@HomePC...
> Could you post what times you get with specific scenes?


I hadn't checked any of the sample files before posting until now so I gave 
the scenes\advanced\benchmark.pov a try and found the render times to be 
like I would expect. Using 1 thread was slower at 26 minutes, while 2 
threads improves that to about 14 minutes; both SSE2's I tested it in, while 
non-SSE2 is similar yet slower overall of course.

One curiosity was how photons are reported: the stats say 5 threads. I 
understand this isn't finished yet, but since the file had photons on I left 
it as-is. I didn't use radiosity at all. Here's one of the render time stats:

  Photon Time:      0 hours 11 minutes 11 seconds (671.876 seconds)
              using 5 thread(s) with 1028.810 CPU-seconds total
  Radiosity Time:   No radiosity
  Trace Time:       0 hours 13 minutes 31 seconds (811.406 seconds)
              using 2 thread(s) with 1578.745 CPU-seconds total

Looking more at the stats, I've realized something is probably wrong with 
Max Level: it was showing the output image resolution (the same width and 
height numbers). I don't know if that's just misreported or really happening 
in the render, but it occurs only with Work_Threads=2, not =1, and only the 
*radiosity-changed* executable does this, not the original beta 30.

If I find an included scene file which causes the huge render-time 
difference I will post back about it.

For at least one of my own files (actually a pov with collection of .inc's) 
the render-time differences can be incredible, at the same output image 
resolution just over 12 minutes using 1 thread in pvengine-sse2.exe, and 
about 96 minutes using 2 threads in pvengine32-sse2.exe (pvengine-sse2.exe 
beta 30 was 6 minutes faster, or 90 minutes, for whatever reason).

My guess was perhaps the splines causing trouble but since I haven't a clue 
how thread count can affect object types being rendered (rather vice versa) 
I can't really say what's going on. I don't understand the processes 
involved.

Here's stats when rendering one of my own pov files:
_____________________________________________________________________________
begin stats for pvengine-sse2 beta 30 1 thread
----------------------------------------------------------------------------
Finite Objects:         7474
Infinite Objects:          0
Light Sources:             2
Total:                  7476

Parser Time
  Parse Time:       0 hours  0 minutes  0 seconds (0.000 seconds)
              using 1 thread(s) with 1.934 CPU-seconds total
  Bounding Time:    0 hours  0 minutes  0 seconds (0.047 seconds)
              using 1 thread(s) with 0.046 CPU-seconds total

Render Options
  Quality:  9
  Bounding boxes.......On   Bounding threshold: 3
  Antialiasing.........On  (Method 1, Threshold 0.400, Depth 2, Jitter Off)

Render Statistics
Image Resolution 1600 x 1200

Pixels:          2040800   Samples:         4629652   Smpls/Pxl: 2.27
Rays:           12772602   Saved:           1587744   Max Level: 9/9

Ray->Shape Intersection          Tests       Succeeded  Percentage

Blob                            200144           35757     17.87
Blob Component                  167655           84980     50.69
Blob Bound                      800576          167655     20.94
Box                             895707          888096     99.15
Cone/Cylinder                 10696292         9385215     87.74
CSG Intersection                 10395             132      1.27
CSG Union                       462067          174964     37.87
Polygon                         143850           62685     43.58
Sphere                          505290          382893     75.78
Sphere Sweep                  18753062          540509      2.88
Surface of Revolution           355109          187896     52.91
Surface of Rev. Bound           355109          291122     81.98
Torus                           190213          103812     54.58
Torus Bound                     190213          104002     54.68
True Type Font                  817186          213488     26.12
Clipping Object                 540887          478611     88.49
Bounding Box                2413335789       531789506     22.04

Function VM calls:          9498297

Roots tested:              18806406   eliminated:                 1119
Shadow Ray Tests:          10683037   Succeeded:                 47102
Shadow Cache Hits:            40562
Reflected Rays:              168931   Total Internal:             8718
Refracted Rays:              160895
Transmitted Rays:           5772324

Smallest Alloc:                  12 bytes
Largest  Alloc:                  12 bytes
Total Alloc calls:                1         Free calls:       56312792

Render Time:
  Photon Time:      No photons
  Radiosity Time:   No radiosity
  Trace Time:       0 hours 12 minutes 21 seconds (741.718 seconds)
              using 1 thread(s) with 739.631 CPU-seconds total
POV-Ray finished
-
CPU time used: kernel 1.40 seconds, user 744.23 seconds, total 745.64 
seconds.
Elapsed time 746.60 seconds.
Render averaged 2571.65 PPS (2574.98 PPS CPU time) over 1920000 pixels.
----------------------------------------------------------------------------
end stats for pvengine-sse2 beta 30 1 thread
_____________________________________________________________________________
begin stats for pvengine32-sse2 (radiosity-changed) beta 30 2 threads
----------------------------------------------------------------------------

Finite Objects:         7474
Infinite Objects:          0
Light Sources:             2
Total:                  7476

Parser Time
  Parse Time:       0 hours  0 minutes  0 seconds (0.000 seconds)
              using 1 thread(s) with 1.591 CPU-seconds total
  Bounding Time:    0 hours  0 minutes  0 seconds (0.047 seconds)
              using 1 thread(s) with 0.046 CPU-seconds total

Render Options
  Quality:  9
  Bounding boxes.......On   Bounding threshold: 3
  Antialiasing.........On  (Method 1, Threshold 0.400, Depth 2, Jitter Off)

Render Statistics
Image Resolution 1600 x 1200

Pixels:          2040800   Samples:         4629652   Smpls/Pxl: 2.27
Rays:           12772602   Saved:           1587744   Max Level: 1600/1200

Ray->Shape Intersection          Tests       Succeeded  Percentage

Blob                            200144           35757     17.87
Blob Component                  167655           84980     50.69
Blob Bound                      800576          167655     20.94
Box                             895707          888096     99.15
Cone/Cylinder                 10696292         9385215     87.74
CSG Intersection                 10395             132      1.27
CSG Union                       462067          174964     37.87
Polygon                         143850           62685     43.58
Sphere                          505290          382893     75.78
Sphere Sweep                  18753062          540509      2.88
Surface of Revolution           355109          187896     52.91
Surface of Rev. Bound           355109          291122     81.98
Torus                           190213          103812     54.58
Torus Bound                     190213          104002     54.68
True Type Font                  817132          213488     26.13
Clipping Object                 540887          478611     88.49
Bounding Box                2413335438       531789436     22.04

Function VM calls:          9498297

Roots tested:              18806406   eliminated:                 1119
Shadow Ray Tests:          10683037   Succeeded:                 47102
Shadow Cache Hits:            40563
Reflected Rays:              168931   Total Internal:             8718
Refracted Rays:              160895
Transmitted Rays:           5772324

Smallest Alloc:                  12 bytes
Largest  Alloc:                  12 bytes
Total Alloc calls:                2         Free calls:       56314983

Render Time:
  Photon Time:      No photons
  Radiosity Time:   No radiosity
  Trace Time:       1 hours 36 minutes 33 seconds (5793.391 seconds)
              using 2 thread(s) with 7003.306 CPU-seconds total
POV-Ray finished
-
CPU time used: kernel 267.56 seconds, user 6745.00 seconds, total 7012.56 
seconds.
Elapsed time 5799.85 seconds, CPU vs elapsed time ratio 1.21.
Render averaged 331.04 PPS (273.79 PPS CPU time) over 1920000 pixels.
----------------------------------------------------------------------------
end stats for pvengine32-sse2 (radiosity-changed) beta 30 2 threads
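
As a rough cross-check of the two stats blocks above, the sketch below pulls the "Trace Time" entry out of each and compares them. It is only an illustration: `TRACE_RE` and `parse_trace_time` are made-up names, not part of POV-Ray, and the regular expression assumes the two-line layout exactly as pasted in this message.

```python
import re

# Matches the two-line "Trace Time" entry of the statistics pasted above.
# Assumption: the layout is exactly as shown in this thread.
TRACE_RE = re.compile(
    r"Trace Time:.*?\(([\d.]+) seconds\)\s+"
    r"using (\d+) thread\(s\) with ([\d.]+) CPU-seconds total",
    re.DOTALL,
)

def parse_trace_time(stats_text):
    """Return (elapsed_seconds, threads, cpu_seconds) from a stats block."""
    match = TRACE_RE.search(stats_text)
    if match is None:
        raise ValueError("no Trace Time entry found")
    elapsed, threads, cpu = match.groups()
    return float(elapsed), int(threads), float(cpu)

# Excerpts copied from the two runs above.
one_thread = """
  Trace Time:       0 hours 12 minutes 21 seconds (741.718 seconds)
              using 1 thread(s) with 739.631 CPU-seconds total
"""
two_threads = """
  Trace Time:       1 hours 36 minutes 33 seconds (5793.391 seconds)
              using 2 thread(s) with 7003.306 CPU-seconds total
"""

t1, n1, cpu1 = parse_trace_time(one_thread)
t2, n2, cpu2 = parse_trace_time(two_threads)

slowdown = t2 / t1        # how many times longer the 2-thread run took
busy_cores = cpu2 / t2    # average number of cores kept busy during tracing

print(f"{n2}-thread run took {slowdown:.2f}x as long as the {n1}-thread run")
print(f"CPU/elapsed ratio with {n2} threads: {busy_cores:.2f}")
```

With these figures the 2-thread trace takes roughly 7.8 times as long, while keeping only about 1.2 cores busy on average, which is in line with the "CPU vs elapsed time ratio 1.21" line POV-Ray printed for that run.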



From: Chambers
Subject: Re: thread count rendertimes unpredictable?
Date: 19 Jan 2009 14:47:45
Message: <147C633062104D78B2FA2555A7C26198@HomePC>
> -----Original Message-----
> From: Bob Hughes [mailto:omniverse charter net]
> I hadn't checked any of the sample files before posting until now so I
> gave the scenes\advanced\benchmark.pov a try and found the render times
> to be like I would expect. Using 1 thread was slower at 26 minutes,
> while 2 threads improves that to about 14 minutes; both SSE2's I tested
> it in, while non-SSE2 is similar yet slower overall of course.

That's about what I would expect going from 1 thread to 2.  The Trace
portion of POV scales extremely well (almost ideally) with adding cores.

> For at least one of my own files (actually a pov with collection of
> .inc's) the render-time differences can be incredible, at the same
> output image resolution just over 12 minutes using 1 thread in
> pvengine-sse2.exe, and about 96 minutes using 2 threads in
> pvengine32-sse2.exe (pvengine-sse2.exe beta 30 was 6 minutes faster,
> or 90 minutes, for whatever reason).

Is there any way you could create a minimal scene that demonstrates this
discrepancy?  I'm rather interested in testing it myself, as I haven't
had any such problems.

If you can't minimize the scene, would you consider posting your file,
or a link to it?  I can't guarantee how useful it would be without
knowing how large it is, of course.

...Ben Chambers
www.pacificwebguy.com

A render isn't slow unless it won't finish until after your next
birthday.



From: Bob Hughes
Subject: Re: thread count rendertimes unpredictable?
Date: 20 Jan 2009 04:37:08
Message: <49759b44$1@news.povray.org>
"Chambers" <ben### [at] pacificwebguycom> wrote in message 
news:147C633062104D78B2FA2555A7C26198@HomePC...
>> -----Original Message-----
>> For at least one of my own files (actually a pov with collection of
>> .inc's) the render-time differences can be incredible, at the same
>> output image resolution just over 12 minutes using 1 thread in
>> pvengine-sse2.exe, and about 96 minutes using 2 threads in
>> pvengine32-sse2.exe (pvengine-sse2.exe beta 30 was 6 minutes faster,
>> or 90 minutes, for whatever reason).
>
> Is there any way you could create a minimal scene that demonstrates this
> discrepancy?  I'm rather interested in testing it myself, as I haven't
> had any such problems.
>
> If you can't minimize the scene, would you consider posting your file,
> or a link to it?  I can't guarantee how useful it would be without
> knowing how large it is, of course.


Hey Ben, thanks for wanting to check on this but I haven't been able to get 
anywhere with it yet.
I misspoke before: I said "splines" when I meant sphere_sweep (using 
cubic_spline).

I tried a separate test scene file with some objects (sor, blob, 
sphere_sweep, text, sphere, cylinder) taken from the original problem 
file(s), always renders with a speed-up using 2 threads. I can't seem to 
narrow it down to anything but the original render slows considerably at 1/4 
to 1/3 into it then remains slow, while the Work_Threads=1 helps it speed 
along by comparison.

I still haven't found another file able to cause the same kind of slowdown 
so it could be this one particular rendering I kept thinking of as being the 
problem when changing thread count, except if one does it I'm sure others 
must.

I ran a histogram in 3.6 to see what things would be using the most time 
(not able to in 3.7 beta?). The sphere_sweep and the overlapping cylinders 
with gradient pigment (some transparency) look the slowest. Not unexpected 
at all.

Vista search indexing was all I could think of outside of POV being at 
fault. I switched that off and nothing changed.

If I don't get anywhere soon I might email a link, if you wouldn't mind 
that. It's a time-line data chart I don't really want out loose on the 'net 
because of info used in it that would become outdated. But maybe it doesn't 
matter... guess I just can't see posting it due to the specific nature of 
it.

Bob



From: Chambers
Subject: Re: thread count rendertimes unpredictable?
Date: 20 Jan 2009 10:10:26
Message: <5AA99CEE980A486397B6F8F3AE9F92A6@HomePC>
> -----Original Message-----
> From: Bob Hughes [mailto:omniverse charter net]
> I tried a separate test scene file with some objects (sor, blob,
> sphere_sweep, text, sphere, cylinder) taken from the original problem
> file(s), always renders with a speed-up using 2 threads. I can't seem
> to narrow it down to anything but the original render slows
> considerably at 1/4 to 1/3 into it then remains slow, while the
> Work_Threads=1 helps it speed along by comparison.

One method I've used is to take a copy of the scene, and try removing
objects / simplifying textures one at a time, until the problem
disappears.  If you can't recreate the problem any other way, this
usually works.
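
The one-at-a-time removal described above can be sketched roughly as follows. This is an illustration only: `minimize` and `still_slow` are hypothetical names, and in practice `still_slow` would have to render the reduced scene with Work_Threads=1 and Work_Threads=2 and compare the times. Here it is replaced by a toy stand-in so the sketch runs on its own.

```python
def minimize(objects, still_slow):
    """Drop objects one at a time, keeping each removal only if the
    slowdown still reproduces without that object."""
    kept = list(objects)
    i = 0
    while i < len(kept):
        candidate = kept[:i] + kept[i + 1:]
        if still_slow(candidate):
            kept = candidate   # object i was not needed to reproduce it
        else:
            i += 1             # object i matters; keep it and move on
    return kept


# Toy stand-in (pure assumption): pretend the slowdown only shows up when
# the placeholder objects "C" and "E" are both present in the scene.
def still_slow(objects):
    return "C" in objects and "E" in objects


scene_objects = ["A", "B", "C", "D", "E", "F"]
minimal = minimize(scene_objects, still_slow)
print(minimal)
```

Each call to `still_slow` costs a pair of renders, so for a scene with thousands of objects it would make sense to remove whole groups (an include file, one data set) before going object by object.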

> If I don't get anywhere soon I might email a link, if you wouldn't
> mind that. It's a time-line data chart I don't really want out loose
> on the 'net because of info used in it that would become outdated. But
> maybe it doesn't matter... guess I just can't see posting it due to
> the specific nature of it.

That's fine, whatever is appropriate for the data you have.

...Ben Chambers
www.pacificwebguy.com



From: Bob Hughes
Subject: Re: thread count rendertimes unpredictable?
Date: 21 Jan 2009 07:23:28
Message: <497713c0$1@news.povray.org>
I put the files at http://0mniverse.com/public/37threads2slow.zip so if 
anyone else wants to check whether it renders much slower with 
Work_Threads=2 in the 3.7 beta they could do so.

When emailing it to Ben I forgot about the clockmod.inc being used and you 
would need to get that from http://www.geocities.com/ccolefax/clockmod.html 
if you don't have it already. Also not sure about the fonts... everyone 
might not have the ones I do and I didn't check which ones I used. If any 
errors occur when attempting to render I think switching the fonts for 
others shouldn't change rendertime.

You're welcome to use the files in part yourselves if you wish, if sense can 
be made of my SDL, I just don't want the whole thing used for what I am 
using it for since I'm putting the resultant graph online already. Doubt 
that needed to be said...

Bob



From: Chambers
Subject: Re: thread count rendertimes unpredictable?
Date: 21 Jan 2009 22:13:29
Message: <ABFA7CDD4CD84294B7B1B9E3C8A84A19@HomePC>
By the way, I also had to create a dummy file "holiday symbols.inc" for
it to render.

...Ben Chambers
www.pacificwebguy.com



From: Chambers
Subject: Re: thread count rendertimes unpredictable?
Date: 21 Jan 2009 22:28:51
Message: <3E8BA04547084C94A0AF378EADFB4C29@HomePC>
Using Beta 30 x64, I got 70 seconds with one thread and 34 with two.

Using the sse2 version of the modified executable, I got 72 seconds with
one thread and 76 seconds with two!
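
To put those four timings on a common footing, here is a small sketch that turns them into a speedup and a parallel efficiency (speedup divided by thread count). The function names and the labels in `runs` are illustrative; only the numbers come from the message above.

```python
def speedup(t_one, t_n):
    """Single-thread time divided by multi-thread time."""
    return t_one / t_n


def efficiency(t_one, t_n, threads):
    """Speedup per thread; 1.0 would be ideal scaling."""
    return speedup(t_one, t_n) / threads


# (seconds with 1 thread, seconds with 2 threads), as reported above.
runs = {
    "beta 30 x64": (70.0, 34.0),
    "modified sse2": (72.0, 76.0),
}

for name, (t1, t2) in runs.items():
    s = speedup(t1, t2)
    e = efficiency(t1, t2, 2)
    print(f"{name}: speedup {s:.2f}x, efficiency {e:.0%}")
```

A speedup below 1.0, as with the modified executable here, means the second thread made the render slower rather than faster.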

Something is definitely going on here... I'll be playing with the scene
file...

...Ben Chambers
www.pacificwebguy.com



From: Chambers
Subject: Re: thread count rendertimes unpredictable?
Date: 21 Jan 2009 23:16:01
Message: <20CECB82D05043A6B6A700156C442C4B@HomePC>
I don't have more time to work with it today, but I can definitely say
that the effect seen is related to the overlapping graphs and the
background.  Eliminating either sets of data or the background reduces
the effect.

...Ben Chambers
www.pacificwebguy.com



