On 07/03/2013 09:42 AM, Fractracer wrote:
> I re-render the scene without medias and the trace time are also very different
> (37 secondes with 3.6 and 99 sec with 3.7). Maybe medias are not in cause, but
> objects with hollow. And the pictures are different in shining.
>
I've not seen differences as extreme as what you are reporting above,
but I have seen increases more like your first numbers for CPU time
consumed.
With multi-core, multi-thread machines it is hard to determine actual
code performance unless you can keep thread counts < CPU core counts
for >>everything<< running significant cycles on your machine.
I took a look using the shipped scene hollow.pov to try something
similar to, but not the same as, your scene, on Ubuntu 12.04 using the
Unix time command. With the time command, the user time is very close to
the time povray itself reports. In the commands below, "povray" is 3.7 RC7.
time povray361 hollow2.pov +W1200 +H1200 +A -D
real 0m23.470s (Elapsed time)
user 0m23.401s (This plus sys ~= CPU time)
sys 0m0.004s
time povray hollow2.pov +W1200 +H1200 +A -D -WT1
real 0m21.775s
user 0m21.297s (1 thread 3.7 is actually faster than 3.61)
sys 0m0.044s
time povray hollow2.pov +W1200 +H1200 +A -D -WT8
real 0m5.632s
user 0m33.806s (8 threads, more like the CPU increase 1st reported)
sys 0m0.048s
------------------
If I remove hollow & interior from spheres, I get the following times
looking first at 3.61 then 3.7rc7 from 1 to 8 threads running on a 4
core, 8 thread i7 920:
time povray361 hollow2_A.pov +W1200 +H1200 +A -D
real 0m6.803s
user 0m6.780s
sys 0m0.000s
time povray hollow2_A.pov +W1200 +H1200 +A -D -WT1
real 0m8.596s
user 0m8.129s (1 thread here a little slower)
sys 0m0.052s
time povray hollow2_A.pov +W1200 +H1200 +A -D -WT2
real 0m4.985s
user 0m8.309s
sys 0m0.052s
time povray hollow2_A.pov +W1200 +H1200 +A -D -WT3
real 0m3.733s
user 0m8.281s
sys 0m0.064s
time povray hollow2_A.pov +W1200 +H1200 +A -D -WT4
real 0m3.087s
user 0m8.345s
sys 0m0.048s
time povray hollow2_A.pov +W1200 +H1200 +A -D -WT5
real 0m2.986s
user 0m9.773s (Once thread > core count, CPU jumps)
sys 0m0.060s
time povray hollow2_A.pov +W1200 +H1200 +A -D -WT6
real 0m2.937s
user 0m11.069s (jumps some more)
sys 0m0.048s
time povray hollow2_A.pov +W1200 +H1200 +A -D -WT7
real 0m2.890s
user 0m12.309s (and more)
sys 0m0.048s
time povray hollow2_A.pov +W1200 +H1200 +A -D -WT8
real 0m2.793s
user 0m13.241s (...)
sys 0m0.052s
-------------------
The % increase in total CPU consumed with 1 vs 8 threads is about the
same with and without hollow/interior/media for the spheres.
1 -> 8 threads with hollow/media +58.74%
1 -> 8 threads no hollow/media +62.89%
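
For anyone wanting to check those two percentages: they match the "user" lines of the time output above, with sys ignored. A minimal sketch of the arithmetic follows; the helper name pct_increase is only for illustration.

```python
def pct_increase(base, new):
    """Percentage change going from base to new."""
    return (new - base) / base * 100.0

# User CPU seconds copied from the `time` output above (3.7 RC7),
# as (1 thread, 8 threads).
with_hollow = (21.297, 33.806)   # hollow2.pov
no_hollow = (8.129, 13.241)      # hollow2_A.pov

print(f"with hollow/media: {pct_increase(*with_hollow):+.2f}%")
print(f"no hollow/media:   {pct_increase(*no_hollow):+.2f}%")
```
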
If I instead run an isosurface scene:
1 vs 4 threads -1.26% CPU consumed.
1 vs 8 threads +30.87% CPU consumed.
Despite CPU consumed increasing when threads > CPU cores, the 8 thread
cases provide the best elapsed time in 3.7.
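
To put both sides of that trade-off in numbers, here is a small sketch that takes the hollow2_A.pov timings for 3.7 RC7 from earlier in this post and compares each run against the 1-thread run. The names RUNS and summarize are made up for this example; only the timing values come from the measurements.

```python
# (threads, real seconds, user seconds) for hollow2_A.pov under 3.7 RC7,
# copied from the `time` output earlier in this post.
RUNS = [
    (1, 8.596, 8.129),
    (2, 4.985, 8.309),
    (3, 3.733, 8.281),
    (4, 3.087, 8.345),
    (5, 2.986, 9.773),
    (6, 2.937, 11.069),
    (7, 2.890, 12.309),
    (8, 2.793, 13.241),
]

def summarize(runs):
    """Return (threads, speedup, cpu_ratio) relative to the first run."""
    _, base_real, base_user = runs[0]
    return [(n, base_real / real, user / base_user) for n, real, user in runs]

for n, speedup, cpu_ratio in summarize(RUNS):
    print(f"{n} threads: {speedup:.2f}x faster elapsed, {cpu_ratio:.2f}x user CPU")
```

With these numbers the 8-thread run comes out roughly 3.1x faster in elapsed time while using about 1.6x the user CPU of the 1-thread run.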
Many other factors can affect CPU time consumed in addition to thread
count relative to core count: caches, compilers, memory bandwidth, CPU
architecture, code changes and so on. I expect one or more of those is
why the % increase in consumed CPU differs from scene to scene once
threads > core count.
Sorry I got long winded. Mostly I wanted to say that if you are wondering
about 3.6 to 3.7 relative performance, my recommendation would be to
always use one thread in 3.7, on a machine where you are sure the thread
count < core count for all active processes over the elapsed time povray
is running. If you have that, everything else that can be the same is the
same, and you still see a significant and repeatable performance
difference, then perhaps the difference is meaningful (1).
Hope this is of some help.
Bill P.
(1) - Yes, it is possible to have performance issues related to the SMP
of 3.7, but I'd bet they are not that common. Even when present, they
should show up using thread counts > 1 and <= the CPU core count on the
machine. In other words, it would still make sense to take the CPU
increase seen when threads > cores out of any comparison.