  Re: medias with pov3.7  
From: William F Pokorny
Date: 3 Jul 2013 14:33:47
Message: <51d46e8b$1@news.povray.org>
On 07/03/2013 09:42 AM, Fractracer wrote:

> I re-render the scene without medias and the trace time are also very different
> (37 secondes with 3.6 and 99 sec with 3.7). Maybe medias are not in cause, but
> objects with hollow. And the pictures are different in shining.
>

I've not seen differences as extreme as what you are reporting above, 
but I have seen increases more like your first numbers for CPU time 
consumed.

With multi-core, multi-threaded machines it is hard to determine actual 
code performance unless you can keep thread counts < CPU core counts for 
>>everything<< running significant cycles on your machine.

I took a look using the shipped scene hollow.pov to try something 
similar to, but different from, your scene on Ubuntu 12.04 using the 
Unix time command. With the time command, the user time is very close to 
the time povray itself reports. Also, below "povray" is 3.7 RC7.

time povray361 hollow2.pov +W1200 +H1200 +A -D
real    0m23.470s (Elapsed time)
user    0m23.401s (This plus sys ~= CPU time)
sys     0m0.004s

time povray hollow2.pov +W1200 +H1200 +A -D -WT1
real    0m21.775s
user    0m21.297s (1 thread 3.7 is actually faster than 3.61)
sys     0m0.044s

time povray hollow2.pov +W1200 +H1200 +A -D -WT8
real    0m5.632s
user    0m33.806s (8 threads, more like the CPU increase 1st reported)
sys     0m0.048s
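
As a rough sketch of the "user plus sys ~= CPU time" note, the snippet below converts the fields printed by the time command into seconds and adds them up. The function names are made up for illustration, and the figures are simply copied from the three hollow2.pov runs above.

```python
import re

def parse_time_field(text):
    """Convert a field such as '0m23.470s' (as printed by time) to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized time field: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

def cpu_seconds(user, system):
    """Approximate CPU time consumed as user time plus sys time."""
    return parse_time_field(user) + parse_time_field(system)

# (user, sys) pairs copied from the hollow2.pov runs above.
runs = {
    "3.61": ("0m23.401s", "0m0.004s"),
    "3.7 RC7, 1 thread": ("0m21.297s", "0m0.044s"),
    "3.7 RC7, 8 threads": ("0m33.806s", "0m0.048s"),
}

for label, (user, system) in runs.items():
    print(f"{label}: {cpu_seconds(user, system):.3f} s CPU")
```

The sys component is tiny in all three runs, which is why user time alone is already a good stand-in for CPU time here.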

------------------
If I remove hollow & interior from the spheres, I get the following 
times, looking first at 3.61 then 3.7 RC7 from 1 to 8 threads, running 
on a 4-core, 8-thread i7 920:

time povray361 hollow2_A.pov +W1200 +H1200 +A -D
real    0m6.803s
user    0m6.780s
sys     0m0.000s

time povray hollow2_A.pov +W1200 +H1200 +A -D -WT1
real    0m8.596s
user    0m8.129s (1 thread here a little slower)
sys     0m0.052s

time povray hollow2_A.pov +W1200 +H1200 +A -D -WT2
real    0m4.985s
user    0m8.309s
sys     0m0.052s

time povray hollow2_A.pov +W1200 +H1200 +A -D -WT3
real    0m3.733s
user    0m8.281s
sys     0m0.064s

time povray hollow2_A.pov +W1200 +H1200 +A -D -WT4
real    0m3.087s
user    0m8.345s
sys     0m0.048s

time povray hollow2_A.pov +W1200 +H1200 +A -D -WT5
real    0m2.986s
user    0m9.773s (Once thread > core count, CPU jumps)
sys     0m0.060s

time povray hollow2_A.pov +W1200 +H1200 +A -D -WT6
real    0m2.937s
user    0m11.069s (jumps some more)
sys     0m0.048s

time povray hollow2_A.pov +W1200 +H1200 +A -D -WT7
real    0m2.890s
user    0m12.309s (and more)
sys     0m0.048s

time povray hollow2_A.pov +W1200 +H1200 +A -D -WT8
real    0m2.793s
user    0m13.241s  (...)
sys     0m0.052s

-------------------
The % increase in total CPU consumed with 1 vs 8 threads is about the 
same with and without hollow/interior/media for the spheres.

1 -> 8 threads with hollow/media +58.74%
1 -> 8 threads no   hollow/media +62.89%
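
For anyone wanting to check the arithmetic, here is a minimal sketch of the calculation. It assumes the percentages were taken from the user times alone, which does appear to reproduce the two figures; the function name is just for illustration.

```python
def percent_increase(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0

# User times in seconds for 3.7 RC7, 1 thread vs 8 threads, from the runs above.
with_media = percent_increase(21.297, 33.806)     # hollow2.pov
without_media = percent_increase(8.129, 13.241)   # hollow2_A.pov

print(f"1 -> 8 threads with hollow/media {with_media:+.2f}%")
print(f"1 -> 8 threads no   hollow/media {without_media:+.2f}%")
```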

If I instead run an isosurface scene:
1 vs 4 threads  -1.26% CPU consumed.
1 vs 8 threads +30.87% CPU consumed.

Despite CPU consumed increasing when threads > CPU cores, the 8 thread 
cases provide the best elapsed time in 3.7.
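
To put a number on that, the sketch below computes the elapsed-time speedup and the per-thread efficiency from the real times of the hollow2_A.pov runs above. "Speedup" and "efficiency" are the usual textbook ratios, not anything povray reports, and the names are only illustrative.

```python
def speedup(real_one_thread, real_n_threads):
    """How many times faster the n-thread run finished than the 1-thread run."""
    return real_one_thread / real_n_threads

def efficiency(real_one_thread, real_n_threads, n_threads):
    """Speedup per thread; 1.0 would be perfect scaling."""
    return speedup(real_one_thread, real_n_threads) / n_threads

# Elapsed (real) seconds for 3.7 RC7 on hollow2_A.pov, keyed by thread count.
real = {1: 8.596, 2: 4.985, 3: 3.733, 4: 3.087,
        5: 2.986, 6: 2.937, 7: 2.890, 8: 2.793}

for n, seconds in real.items():
    print(f"{n} threads: speedup {speedup(real[1], seconds):.2f}x, "
          f"efficiency {efficiency(real[1], seconds, n):.2f}")

best = min(real, key=real.get)
print(f"best elapsed time at {best} threads")
```

Elapsed time keeps dropping all the way to 8 threads, but the gain per extra thread gets small once the thread count passes the 4 physical cores.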

Many other factors besides thread count relative to core count can 
affect CPU time consumed: caches, compilers, memory bandwidth, CPU 
architecture, code changes, etc. I expect one or more of those is why 
the % increase in consumed CPU differs from scene to scene once 
threads > core count.

Sorry I got long-winded. Mostly I wanted to say that if you are 
wondering about 3.6 to 3.7 relative performance, my recommendation would 
be to always use one thread in 3.7, on a machine where you are sure the 
thread count < core count for all active processes over the elapsed time 
povray is running. If you have that, everything else that can be the 
same is the same, and you still see a significant and repeatable 
performance difference, then perhaps the difference is meaningful (1).

Hope this helps.
Bill P.

(1) - Yes, it is possible to have performance issues related to the SMP 
support in 3.7, but I'd bet they are not that common. Even when present, 
they should show up using thread counts > 1 and <= the CPU core count on 
the machine. In other words, it would still make sense to take the 
thread > core CPU increase out of any comparison.


