I was comparing beta 30 and the radiosity-changed exe for any differences
when using a thread count of 1 or 2, and found that the result varies
depending on scene file complexity or animations. I was wondering if others
have been checking thread count.
Sometimes I get a 160% slower render when using Work_Threads=2 versus 1,
and other times 60% (much faster). In the past I made sure to keep the
threads at only 1 so renders wouldn't slow down on larger images or
animations. I think there might have been a change since then, but I can't
be sure about that.
Right now it seems unpredictable rather than a guaranteed speed-up when
going with more than 1 thread. I thought maybe this hasn't been discussed
recently, so I wanted to bring it up again.
POV for Windows (SSE2 and non-SSE2 builds), Vista, AMD X2 dual core
Bob
Could you post what times you get with specific scenes?
...Ben Chambers
www.pacificwebguy.com
"Chambers" <ben### [at] pacificwebguycom> wrote in message
news:9E363925719942B18BA14BF08FF37968@HomePC...
> Could you post what times you get with specific scenes?
I hadn't checked any of the sample files before posting, so I just gave
scenes\advanced\benchmark.pov a try and found the render times to be
what I would expect. Using 1 thread was slower at 26 minutes, while 2
threads improved that to about 14 minutes. I tested it in both SSE2 builds;
the non-SSE2 build is similar, though slower overall of course.
One curiosity was how photons are reported, saying 5 threads. I understand
this isn't finished yet, but since the file had photons on I left it
as-is. I didn't use radiosity at all. Here's one of the render time stats:
Photon Time: 0 hours 11 minutes 11 seconds (671.876 seconds)
using 5 thread(s) with 1028.810 CPU-seconds total
Radiosity Time: No radiosity
Trace Time: 0 hours 13 minutes 31 seconds (811.406 seconds)
using 2 thread(s) with 1578.745 CPU-seconds total
Looking more at the stats, I've realized something is probably wrong with
Max Level: it was showing the output image resolution (the same width and
height numbers). I don't know if that's just reported incorrectly or really
happening in the render, but it occurs only when setting Work_Threads=2,
not =1, and only the *radiosity-changed* executable does this, not the
original beta 30.
If I find an included scene file which causes the huge render-time
difference I will post back about it.
For at least one of my own files (actually a pov with a collection of
.inc's) the render-time differences can be incredible: at the same output
image resolution, just over 12 minutes using 1 thread in pvengine-sse2.exe,
and about 96 minutes using 2 threads in pvengine32-sse2.exe
(pvengine-sse2.exe beta 30 was 6 minutes faster, at 90 minutes, for
whatever reason).
My guess was that the splines might be causing trouble, but since I haven't
a clue how thread count can affect the object types being rendered (rather
than vice versa), I can't really say what's going on. I don't understand
the processes involved.
Here's stats when rendering one of my own pov files:
_____________________________________________________________________________
begin stats for pvengine-sse2 beta 30 1 thread
----------------------------------------------------------------------------
Finite Objects: 7474
Infinite Objects: 0
Light Sources: 2
Total: 7476
Parser Time
Parse Time: 0 hours 0 minutes 0 seconds (0.000 seconds)
using 1 thread(s) with 1.934 CPU-seconds total
Bounding Time: 0 hours 0 minutes 0 seconds (0.047 seconds)
using 1 thread(s) with 0.046 CPU-seconds total
Render Options
Quality: 9
Bounding boxes.......On Bounding threshold: 3
Antialiasing.........On (Method 1, Threshold 0.400, Depth 2, Jitter Off)
Render Statistics
Image Resolution 1600 x 1200
Pixels: 2040800 Samples: 4629652 Smpls/Pxl: 2.27
Rays: 12772602 Saved: 1587744 Max Level: 9/9
Ray->Shape Intersection Tests Succeeded Percentage
Blob 200144 35757 17.87
Blob Component 167655 84980 50.69
Blob Bound 800576 167655 20.94
Box 895707 888096 99.15
Cone/Cylinder 10696292 9385215 87.74
CSG Intersection 10395 132 1.27
CSG Union 462067 174964 37.87
Polygon 143850 62685 43.58
Sphere 505290 382893 75.78
Sphere Sweep 18753062 540509 2.88
Surface of Revolution 355109 187896 52.91
Surface of Rev. Bound 355109 291122 81.98
Torus 190213 103812 54.58
Torus Bound 190213 104002 54.68
True Type Font 817186 213488 26.12
Clipping Object 540887 478611 88.49
Bounding Box 2413335789 531789506 22.04
Function VM calls: 9498297
Roots tested: 18806406 eliminated: 1119
Shadow Ray Tests: 10683037 Succeeded: 47102
Shadow Cache Hits: 40562
Reflected Rays: 168931 Total Internal: 8718
Refracted Rays: 160895
Transmitted Rays: 5772324
Smallest Alloc: 12 bytes
Largest Alloc: 12 bytes
Total Alloc calls: 1 Free calls: 56312792
Render Time:
Photon Time: No photons
Radiosity Time: No radiosity
Trace Time: 0 hours 12 minutes 21 seconds (741.718 seconds)
using 1 thread(s) with 739.631 CPU-seconds total
POV-Ray finished
-
CPU time used: kernel 1.40 seconds, user 744.23 seconds, total 745.64 seconds.
Elapsed time 746.60 seconds.
Render averaged 2571.65 PPS (2574.98 PPS CPU time) over 1920000 pixels.
----------------------------------------------------------------------------
end stats for pvengine-sse2 beta 30 1 thread
_____________________________________________________________________________
begin stats for pvengine32-sse2 (radiosity-changed) beta 30 2 threads
----------------------------------------------------------------------------
Finite Objects: 7474
Infinite Objects: 0
Light Sources: 2
Total: 7476
Parser Time
Parse Time: 0 hours 0 minutes 0 seconds (0.000 seconds)
using 1 thread(s) with 1.591 CPU-seconds total
Bounding Time: 0 hours 0 minutes 0 seconds (0.047 seconds)
using 1 thread(s) with 0.046 CPU-seconds total
Render Options
Quality: 9
Bounding boxes.......On Bounding threshold: 3
Antialiasing.........On (Method 1, Threshold 0.400, Depth 2, Jitter Off)
Render Statistics
Image Resolution 1600 x 1200
Pixels: 2040800 Samples: 4629652 Smpls/Pxl: 2.27
Rays: 12772602 Saved: 1587744 Max Level: 1600/1200
Ray->Shape Intersection Tests Succeeded Percentage
Blob 200144 35757 17.87
Blob Component 167655 84980 50.69
Blob Bound 800576 167655 20.94
Box 895707 888096 99.15
Cone/Cylinder 10696292 9385215 87.74
CSG Intersection 10395 132 1.27
CSG Union 462067 174964 37.87
Polygon 143850 62685 43.58
Sphere 505290 382893 75.78
Sphere Sweep 18753062 540509 2.88
Surface of Revolution 355109 187896 52.91
Surface of Rev. Bound 355109 291122 81.98
Torus 190213 103812 54.58
Torus Bound 190213 104002 54.68
True Type Font 817132 213488 26.13
Clipping Object 540887 478611 88.49
Bounding Box 2413335438 531789436 22.04
Function VM calls: 9498297
Roots tested: 18806406 eliminated: 1119
Shadow Ray Tests: 10683037 Succeeded: 47102
Shadow Cache Hits: 40563
Reflected Rays: 168931 Total Internal: 8718
Refracted Rays: 160895
Transmitted Rays: 5772324
Smallest Alloc: 12 bytes
Largest Alloc: 12 bytes
Total Alloc calls: 2 Free calls: 56314983
Render Time:
Photon Time: No photons
Radiosity Time: No radiosity
Trace Time: 1 hours 36 minutes 33 seconds (5793.391 seconds)
using 2 thread(s) with 7003.306 CPU-seconds total
POV-Ray finished
-
CPU time used: kernel 267.56 seconds, user 6745.00 seconds, total 7012.56 seconds.
Elapsed time 5799.85 seconds, CPU vs elapsed time ratio 1.21.
Render averaged 331.04 PPS (273.79 PPS CPU time) over 1920000 pixels.
----------------------------------------------------------------------------
end stats for pvengine32-sse2 (radiosity-changed) beta 30 2 threads
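For anyone wanting to compare two such logs without reading them by eye, here is a minimal Python sketch. It assumes the "Trace Time" entry always has the two-line shape shown in the stats above; the function name `parse_trace` and the dictionary keys are made up for illustration and are not part of POV-Ray.

```python
import re

# Matches the two-line "Trace Time" entry as it appears in the stats above.
# Assumption: the wording and layout of this entry do not vary between runs.
TRACE_RE = re.compile(
    r"Trace Time:\s+(\d+) hours\s+(\d+) minutes\s+(\d+) seconds "
    r"\(([\d.]+) seconds\)\s+"
    r"using (\d+) thread\(s\) with ([\d.]+) CPU-seconds total"
)


def parse_trace(stats_text):
    """Return wall-clock seconds, thread count and CPU-seconds for the trace."""
    m = TRACE_RE.search(stats_text)
    if m is None:
        raise ValueError("no Trace Time entry found")
    return {
        "wall": float(m.group(4)),
        "threads": int(m.group(5)),
        "cpu": float(m.group(6)),
    }


# The two Trace Time entries copied from the stats above.
one_thread = """Trace Time: 0 hours 12 minutes 21 seconds (741.718 seconds)
using 1 thread(s) with 739.631 CPU-seconds total"""
two_threads = """Trace Time: 1 hours 36 minutes 33 seconds (5793.391 seconds)
using 2 thread(s) with 7003.306 CPU-seconds total"""

a = parse_trace(one_thread)
b = parse_trace(two_threads)

slowdown = b["wall"] / a["wall"]    # wall-clock time, 2 threads vs 1
cpu_ratio = b["cpu"] / b["wall"]    # how many cores were busy on average
cpu_growth = b["cpu"] / a["cpu"]    # total CPU work, 2 threads vs 1

print(f"2-thread run took {slowdown:.2f}x the wall-clock time")
print(f"CPU vs elapsed ratio with 2 threads: {cpu_ratio:.2f}")
print(f"total CPU-seconds grew by {cpu_growth:.2f}x")
```

The CPU vs elapsed ratio comes out near the 1.21 that the log itself reports. Since the ray and intersection counts are nearly identical between the two runs, the point of the comparison is that total CPU-seconds grew a lot for about the same amount of tracing work.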
> -----Original Message-----
> From: Bob Hughes [mailto:omniverse charter net]
> I hadn't checked any of the sample files before posting until now so I
> gave the scenes\advanced\benchmark.pov a try and found the render times
> to be like I would expect. Using 1 thread was slower at 26 minutes, while 2
> threads improves that to about 14 minutes; both SSE2's I tested it in,
> while non-SSE2 is similar yet slower overall of course.
That's about what I would expect going from 1 thread to 2. The Trace
portion of POV scales extremely well (almost ideally) with adding cores.
> For at least one of my own files (actually a pov with collection of
> .inc's) the render-time differences can be incredible, at the same output
> image resolution just over 12 minutes using 1 thread in pvengine-sse2.exe,
> and about 96 minutes using 2 threads in pvengine32-sse2.exe
> (pvengine-sse2.exe beta 30 was 6 minutes faster, or 90 minutes, for
> whatever reason).
Is there any way you could create a minimal scene that demonstrates this
discrepancy? I'm rather interested in testing it myself, as I haven't
had any such problems.
If you can't minimize the scene, would you consider posting your file,
or a link to it? I can't guarantee how useful it would be without
knowing how large it is, of course.
...Ben Chambers
www.pacificwebguy.com
A render isn't slow unless it won't finish until after your next
birthday.
"Chambers" <ben### [at] pacificwebguycom> wrote in message
news:147C633062104D78B2FA2555A7C26198@HomePC...
>> -----Original Message-----
>> For at least one of my own files (actually a pov with collection of
>> .inc's) the render-time differences can be incredible, at the same output
>> image resolution just over 12 minutes using 1 thread in pvengine-sse2.exe,
>> and about 96 minutes using 2 threads in pvengine32-sse2.exe
>> (pvengine-sse2.exe beta 30 was 6 minutes faster, or 90 minutes, for
>> whatever reason).
>
> Is there any way you could create a minimal scene that demonstrates this
> discrepancy? I'm rather interested in testing it myself, as I haven't
> had any such problems.
>
> If you can't minimize the scene, would you consider posting your file,
> or a link to it? I can't guarantee how useful it would be without
> knowing how large it is, of course.
Hey Ben, thanks for wanting to check on this but I haven't been able to get
anywhere with it yet.
One thing I misstated before: I said "splines" when I meant sphere_sweep
(using cubic_spline).
I tried a separate test scene file with some objects (sor, blob,
sphere_sweep, text, sphere, cylinder) taken from the original problem
file(s), and it always renders with a speed-up using 2 threads. I can't seem
to narrow it down to anything, except that the original render slows
considerably about 1/4 to 1/3 of the way in and then remains slow, while
Work_Threads=1 lets it speed along by comparison.
I still haven't found another file able to cause the same kind of slowdown,
so it could be that this one particular rendering is what I kept thinking of
as the problem when changing thread count. Still, if one scene does it, I'm
sure others must.
I ran a histogram in 3.6 to see what would be using the most time (is that
not possible in the 3.7 beta?); the sphere_sweep and the overlapping
cylinders with gradient pigment (some transparency) look the slowest. Not
unexpected at all.
Vista search indexing was all I could think of outside of POV that might be
at fault. I switched that off and nothing changed.
If I don't get anywhere soon I might email a link, if you wouldn't mind
that. It's a time-line data chart I don't really want out loose on the 'net
because of info used in it that would become outdated. But maybe it doesn't
matter... guess I just can't see posting it due to the specific nature of
it.
Bob
> -----Original Message-----
> From: Bob Hughes [mailto:omniverse charter net]
> I tried a separate test scene file with some objects (sor, blob,
> sphere_sweep, text, sphere, cylinder) taken from the original problem
> file(s), always renders with a speed-up using 2 threads. I can't seem
> to narrow it down to anything but the original render slows considerably
> at 1/4 to 1/3 into it then remains slow, while the Work_Threads=1 helps it
> speed along by comparison.
One method I've used is to take a copy of the scene, and try removing
objects / simplifying textures one at a time, until the problem
disappears. If you can't recreate the problem any other way, this
usually works.
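In case it helps, that remove-one-at-a-time approach can be sketched in Python roughly as follows. This is only an illustration under assumptions: `minimize` and `problem_persists` are made-up names, and in practice the check would mean re-rendering the trimmed scene with Work_Threads=1 and =2 and comparing times by hand or by script.

```python
def minimize(items, problem_persists):
    """Drop items one at a time, keeping each removal only if the
    problem still shows up without that item."""
    kept = list(items)
    i = 0
    while i < len(kept):
        trial = kept[:i] + kept[i + 1:]
        if problem_persists(trial):
            # Item i was not needed to trigger the problem; leave it out.
            # The same index now points at the next candidate.
            kept = trial
        else:
            # Removing item i made the problem vanish, so it stays in.
            i += 1
    return kept


# Purely illustrative stand-in for "render it and see if 2 threads is slower":
# here the pretend problem needs both "c" and "f" to be present.
def fake_check(scene_parts):
    return {"c", "f"} <= set(scene_parts)


parts = ["a", "b", "c", "d", "e", "f"]
print(minimize(parts, fake_check))
```

Each check costs a full render, so for a scene with thousands of objects it is probably worth removing whole groups (an entire .inc, or all objects of one type) before going item by item.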
> If I don't get anywhere soon I might email a link, if you wouldn't mind
> that. It's a time-line data chart I don't really want out loose on the
> 'net because of info used in it that would become outdated. But maybe it
> doesn't matter... guess I just can't see posting it due to the specific
> nature of it.
That's fine, whatever is appropriate for the data you have.
...Ben Chambers
www.pacificwebguy.com
I put the files at http://0mniverse.com/public/37threads2slow.zip so if
anyone else wants to check whether it renders much slower with
Work_Threads=2 in the 3.7 beta they could do so.
When emailing it to Ben I forgot that clockmod.inc is used, so you would
need to get that from http://www.geocities.com/ccolefax/clockmod.html
if you don't have it already. I'm also not sure about the fonts... not
everyone will have the ones I do, and I didn't check which ones I used. If
any errors occur when attempting to render, I think switching the fonts for
others shouldn't change render time.
You're welcome to use the files in part yourselves if you wish, if sense can
be made of my SDL, I just don't want the whole thing used for what I am
using it for since I'm putting the resultant graph online already. Doubt
that needed to be said...
Bob
By the way, I also had to create a dummy file "holiday symbols.inc" for
it to render.
...Ben Chambers
www.pacificwebguy.com
Using Beta 30 x64, I got 70 seconds with one thread and 34 with two.
Using the sse2 version of the modified executable, I got 72 seconds with
one thread and 76 seconds with two!
Something is definitely going on here... I'll be playing with the scene
file...
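To put numbers on those four timings, here is a quick Python sketch of speed-up and per-thread efficiency. The labels and helper names are my own, and the timings are whole seconds as quoted above, so small differences should not be read into.

```python
def speedup(t_one, t_many):
    """How many times faster the multi-thread run is than the 1-thread run."""
    return t_one / t_many


def efficiency(t_one, t_many, threads):
    """Speed-up divided by thread count; 1.0 would be ideal scaling."""
    return speedup(t_one, t_many) / threads


# (1-thread seconds, 2-thread seconds) as quoted in this post.
runs = {
    "beta 30 x64": (70.0, 34.0),
    "modified sse2": (72.0, 76.0),
}

for name, (t1, t2) in runs.items():
    print(f"{name}: speed-up {speedup(t1, t2):.2f}x, "
          f"efficiency {efficiency(t1, t2, 2):.2f}")
```

A speed-up below 1.0, as with the modified executable here, means the second thread made the render slower overall rather than merely failing to help.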
...Ben Chambers
www.pacificwebguy.com
I don't have more time to work with it today, but I can definitely say
that the slowdown is related to the overlapping graphs and the
background. Eliminating either sets of data or the background reduces
the slowdown.
...Ben Chambers
www.pacificwebguy.com