Warp <war### [at] tag povray org> wrote:
> Mienai <Mienai> wrote:
> > Actual benchmarks have shown that if you run 2 instances of POVRay on a
> > hyperthreading machine (one instance on each logical processor, each
> > rendering half your image) you will have the completed image in around half
> > the time (assuming the two halfs take about the same time to render).
>
> I find that hard to believe. Either that is not true, or the claim
> that two processes can't use the FPU at the same time is not true.
>
> If I'm not mistaken, one POV-Ray thread could perform integer math
> at the same time the other POV-Ray thread is performing FPU math. But
> when the first one needs the FPU it has to wait for the second. Since
> POV-Ray uses the FPU quite heavily, I find it quite hard to believe
> that running it in two threads would drop the rendering time to half
> (unless the P4 *really* can run two FPU threads at the same time).
> I am ready to believe that the total rendering time drops by
> some percentage (because POV-Ray naturally performs other operations
> than just FPU opcodes, naturally), but I would be surprised if this
> percentage would be anything close to 50%.
> If it really is close to 50%, then someone has to explain me how
> is that possible.
>
> --
> #macro M(A,N,D,L)plane{-z,-9pigment{mandel L*9translate N color_map{[0rgb x]
> [1rgb 9]}scale<D,D*3D>*1e3}rotate y*A*8}#end M(-3<1.206434.28623>70,7)M(
> -1<.7438.1795>1,20)M(1<.77595.13699>30,20)M(3<.75923.07145>80,99)// - Warp -
So I ran the POV-Ray benchmark today on my P4 system and here are the results:
single thread running entire benchmark:
average 71 PPS in 0d 00h 34m 35s
two simultaneous threads, each running half the benchmark (vertical split):
thread 1: average 52 PPS in 0d 00h 09m 45s
thread 2: average 48 PPS in 0d 00h 10m 49s
two simultaneous threads, each running half the benchmark (vertical split),
photons precalculated:
thread 1: average 60 PPS in 0d 00h 08m 27s
thread 2: average 54 PPS in 0d 00h 09m 30s
single thread, hyperthreading disabled:
average 74 PPS in 0d 00h 33m 05s
So looking at those results it would appear that I was wrong: the split
render actually finishes in closer to a third of the time. Speaking from
experience, though, I generally find that it's closer to 50% on most of the
larger files (a 16hr render taking closer to 8hr than 5). I took a graduate
class on superscalar architecture last year, but we didn't talk much about
FPUs specifically. If I had to guess, it has to do with the way the FPU is
pipelined; plus, if I remember right, the FPU is operated twice as fast as
the CPU core (it can do a calculation every half tick). I don't know how
much you know about the subject, but pipelining increases efficiency: it
sucks for doing small operations, but when you're doing a whole series of
operations it kicks ass, because you can start the next operation before
the first is complete. I hope that answers your questions; if you have
more, feel free to ask.
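
For anyone who wants to check the "closer to a third" figure, here is a
quick sketch of the arithmetic using the times posted above. It assumes the
split render counts as finished when the slower of the two threads
finishes; the helper name to_seconds is just something made up for this
example.

```python
def to_seconds(hours, minutes, seconds):
    """Convert an h/m/s render time to seconds."""
    return hours * 3600 + minutes * 60 + seconds

# Times taken from the benchmark results above.
single = to_seconds(0, 34, 35)                     # one thread, whole image

# Assumption: the image is complete only when the slower half is done.
split = max(to_seconds(0, 9, 45), to_seconds(0, 10, 49))
split_precalc = max(to_seconds(0, 8, 27), to_seconds(0, 9, 30))

ratio = split / single                  # fraction of the single-thread time
ratio_precalc = split_precalc / single  # same, with photons precalculated

print(f"split: {ratio:.2f} of single-thread time")
print(f"split, photons precalculated: {ratio_precalc:.2f}")
```

Both ratios come out below one half, which is where the "closer to a
third" comes from.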
If you do something like this, I highly recommend precalculating the
photons and loading the map file, so you don't have to calculate them again
for each thread you run (it decreases the overhead).
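
As a rough sketch of the setup, the snippet below only builds the two
command lines for a vertical split; it does not run anything. The binary
name "povray", the scene name "benchmark.pov" and the output names are
placeholders, not what I actually used. It relies on the +SC / +EC
(start/end column) options accepting a fraction of the image width, so
check the documentation for your version. It also assumes the scene already
loads a previously saved photon map (load_file in the photons block).

```python
def half_render_commands(scene, binary="povray"):
    """Build command lines for two POV-Ray instances, one per image half.

    The binary and file names are placeholders for illustration only.
    """
    left = [binary, f"+I{scene}", "+EC0.5", "+FN", "+Oleft.png"]    # left half: stop at mid column
    right = [binary, f"+I{scene}", "+SC0.5", "+FN", "+Oright.png"]  # right half: start at mid column
    return [left, right]

commands = half_render_commands("benchmark.pov")
for cmd in commands:
    print(" ".join(cmd))
```

Each command would be started as its own process so the two logical
processors each get one, and the two halves still have to be combined
afterwards.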