Go here:
http://www.geocities.com/bdchambers79/pov_test/beta11c.html
I thought this scene would be useful for developers, as it demonstrates
a couple of things.
1) The scene renders much too brightly in version 3.7
2) Speed in 3.7 is abysmal. 3.6 renders it at 800x450 in just under 2
minutes, and 3.7 takes just over 15.
In both cases, I'm using identical ini-file settings.
The 3.7 version is set to use /threads 4 (I've tried /threads 1, and no
setting {16 threads default?}, to no avail). I'm on a Pentium 4 2.8 GHz
with no Hyperthreading support, so I don't expect the threading to help
too much. However, I would expect the sse2 build to increase performance.
The main problems seem to be the refracting chess pieces, where
performance slows to a crawl. I'm going to go out on a limb here,
because I don't know the internals much, except for what I've read on
these newsgroups, but here's my guess:
I read previously that 3.7 now implements a document / view model, where
the render is essentially a view of the document (the scene). When a
translucent object is traced, it spawns many more rays which must query
the scene, thus a large amount of interprocess communication. In 3.6,
this isn't a problem. 3.7, however, must spawn messages for each ray,
creating a huge amount of overhead when many reflective / refractive
objects are being traced.
Although, I've been wrong many times before, and I wouldn't be surprised
if I were wrong again right now :)
...Chambers
Chambers wrote:
> The main problems seem to be the refracting chess pieces, where
> performance slows to a crawl. I'm going to go out on a limb here,
> because I don't know the internals much, except for what I've read on
> these newsgroups, but here's my guess:
The guess is not correct because _your_ "model" of what POV-Ray does is not
correct.
Unfortunately, you did not provide an INI file, nor information on whether
your data is for the whole animation or just one frame. There could also be
numerous other things wrong, and it would also not be the first time 3.7 just
assumes slightly different default settings (or uses one incorrectly). The
major problem though is that your scene is just too complex to be of much use
for saying what could be wrong by just looking at it.
Thorsten
Thorsten Froehlich wrote:
> Chambers wrote:
>> The main problems seem to be the refracting chess pieces, where
>> performance slows to a crawl. I'm going to go out on a limb here,
>> because I don't know the internals much, except for what I've read on
>> these newsgroups, but here's my guess:
>
> The guess is not correct because _your_ "model" of what POV-Ray does is
> not correct.
Good to know.
> Unfortunately, you did not provide an INI file,
Use quickres.ini, it comes standard with POV-Ray.
> nor information if your
> data is for the whole animation or just one frame.
Sorry, I assumed that when you read "Command-line settings are: +w800
+h450", you would realize that I was giving *all* command-line settings.
In other words, just the first frame (clock=0).
> There could also be
> numerous other things wrong, and it would also not be the first time 3.7
> just assumes slightly different default settings (or uses one
> incorrectly). The major problem though is that your scene is just too
> complex to be of much use for saying what could be wrong by just looking
> at it.
>
> Thorsten
Well, I didn't expect it to be a "smoking gun" or anything, I just
thought it would be an interesting example of 3.7 not behaving as expected.
Again, what really surprised me is that the sse2 build, running with a
single thread, was still about 7-8 times slower than 3.6.
...Chambers
Thorsten Froehlich wrote:
> The major problem though is that your scene is just too
> complex to be of much use for saying what could be wrong by just looking
> at it.
>
> Thorsten
OK, here is a much simpler scene which demonstrates a similar speed
difference between 3.6 and 3.7. Both again rendered on a P4 2.8 GHz, no
antialiasing, this time at 512x384.
This scene constructs a number of concentric spheres in the middle of
the scene, with full transmittance, and varying IOR values. Surrounding
the scene is a box with the default checker colors of blue and green.
Ambient light is set to one, to avoid light computations.
This scene demonstrates the devastating effect that refraction has on
the speed of 3.7 versus 3.6. In fact, it seems to be around 8 times
slower! I would venture another guess that reflection could have the
same effect on a scene's rendering speed, but I have not tested for this
yet.
In the beginning of the scene, I list times for various recursion
levels. The first time is for 3.6, the second time for 3.7, and
following them is the ratio of 3.7 to 3.6.
The actual times aren't important; what matters is the ratio. I fully
expected multithreading to involve some overhead, making it slightly
slower on a single-processor machine (w/o hyperthreading), but 8x
slower? Something is obviously wrong here.
(I also tested w/ light buffers and vista buffers turned off in 3.6, but
the effect on the ratio was negligible).
...Chambers
--- Cut here ---
#declare recursion = 5;
// Timings per recursion level: 3.6; 3.7; ratio of 3.7 to 3.6
// Recursion 1: 1s; 4s; 4x
// Recursion 5: 2s; 15s; 7.5x
// Recursion 10: 4s; 29s; 7.25x
// Recursion 20: 9s; 1m 15s (75s); 8.3x
// Recursion 50: 37s; 5m 9s (309s); 8.35x
global_settings {
  max_trace_level 256
  ambient_light 1.0
}
camera {
  location -z*19
  look_at 0
}
// Surrounding box with the default checker pigment
box {
  -20, 20
  pigment {checker}
  finish {ambient 1}
}
// Fully transmitting sphere; each copy below gets its own IOR and scale
#declare sp = sphere {
  0, 8
  texture {
    pigment {color transmit 1}
  }
}
// Build the concentric spheres, from the largest to the smallest
#declare c = 0;
#while (c < recursion)
  object {
    sp
    interior {ior (recursion+c+1)/(recursion+c)}
    scale (recursion-c)/(recursion)
  }
  #declare c = c + 1;
#end
--- Cut here ---
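To see what the #while loop in the scene generates, here is a minimal Python sketch that evaluates the same two expressions for each sphere. The function name shell_parameters is made up for this illustration; the base radius of 8 and the formulas are taken from the scene.

```python
def shell_parameters(recursion):
    """Return a list of (ior, scale) pairs, one per sphere.

    Mirrors the scene's loop: c runs from 0 to recursion-1,
    ior = (recursion+c+1)/(recursion+c), scale = (recursion-c)/recursion.
    """
    shells = []
    for c in range(recursion):
        ior = (recursion + c + 1) / (recursion + c)
        scale = (recursion - c) / recursion
        shells.append((ior, scale))
    return shells


if __name__ == "__main__":
    base_radius = 8  # radius of the declared sphere "sp" in the scene
    for c, (ior, scale) in enumerate(shell_parameters(5)):
        print(f"sphere {c}: ior = {ior:.4f}, radius = {base_radius * scale:.2f}")
```

Every IOR is slightly above 1 and every sphere has a different value, so a ray heading toward the center crosses a refracting surface at each sphere on the way in and again on the way out.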
Chambers <bdc### [at] yahoocom> wrote:
> I fully
> expected multithreading to involve some overhead, making it slightly
> slower on a single-processor machine
A bit off-topic to this particular thread, but multithreading on a
single-processor computer does not necessarily mean slower execution
(even if by a really small margin). In fact, it may even mean slightly
faster execution.
Multi-threaded applications which perform calculations and lots of
disk I/O might actually considerably benefit from multithreading even
in a single-processor computer. This is because while one thread is
waiting for disk I/O to complete, another thread can continue using
the CPU. With one single thread the process would be stuck during I/O.
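A rough sketch of that overlap in Python, under loud assumptions: time.sleep stands in for a blocking disk write, and all function names here are invented for the illustration. It says nothing about how POV-Ray itself is structured.

```python
import threading
import time


def fake_disk_write(seconds):
    """Stand-in for blocking disk I/O (assumption: sleep models the wait)."""
    time.sleep(seconds)


def cpu_work(n):
    """Pure computation that keeps the CPU busy."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_sequential(n, io_seconds):
    """Compute first, then wait for the I/O: the waits add up."""
    start = time.perf_counter()
    result = cpu_work(n)
    fake_disk_write(io_seconds)
    return result, time.perf_counter() - start


def run_overlapped(n, io_seconds):
    """Start the I/O in a second thread and compute while it waits."""
    start = time.perf_counter()
    writer = threading.Thread(target=fake_disk_write, args=(io_seconds,))
    writer.start()
    result = cpu_work(n)  # runs while the writer thread is blocked
    writer.join()
    return result, time.perf_counter() - start


if __name__ == "__main__":
    _, t_seq = run_sequential(2_000_000, 0.3)
    _, t_ovl = run_overlapped(2_000_000, 0.3)
    print(f"sequential: {t_seq:.2f}s, overlapped: {t_ovl:.2f}s")
```

Even on one CPU, the overlapped version should take roughly the longer of the two steps rather than their sum, because the blocked thread uses no CPU time while it waits.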
In a heavily CPU-oriented application such as POV-Ray you probably
won't get any real speedup. The only case where there might be
a measurable difference is when rendering big animation frames which
render at a very high speed and which are saved in an uncompressed
file format (such as bmp).
I'm not saying this is so (without actual measurements it's
impossible to say), but in theory it could be possible.
So the problem you are experiencing is quite clearly related to
something else. Perhaps the new code has some big overhead in
recursive tracing calls? I can't say.
However, since it's (most probably) nothing related to multithreading,
it can probably be fixed.
--
- Warp
Chambers wrote:
> OK, here is a much simpler scene which demonstrates a similar speed
> difference between 3.6 and 3.7. Both again rendered on a P4 2.8 GHz, no
> antialiasing, this time at 512x384.
thanks for the report, problems like this are one of the issues we still
need to chase down - fortunately it's just a bug and nothing fundamental
(there is no fundamental reason for 3.7 to be significantly* slower than
3.6, so in those cases where it is it's almost always either a bug or an
area that we haven't finished working on).
-- Chris
* there may be some areas that will show a slight slowdown on single-CPU
non-hyperthreaded systems even once we've finished, as there were e.g.
a number of places in the old code where shortcuts such as static caches
were used that are incompatible with SMP operation, and in some cases
the replacement may have more overhead. however we have some other plans
to speed up the code elsewhere that ought to more than compensate for
any such cases.
> A bit off-topic to this particular thread, but multithreading on a
> single-processor computer does not necessarily mean slower execution
> (even if by a really small margin). In fact, it may even mean slightly
> faster execution.
>
> Multi-threaded applications which perform calculations and lots of
> disk I/O might actually considerably benefit from multithreading even
> in a single-processor computer. This is because while one thread is
> waiting for disk I/O to complete, another thread can continue using
> the CPU. With one single thread the process would be stuck during I/O.
...which is, of course, the entire reason why multitasking operating
systems were originally invented. :-)
Chris Cason wrote:
> Chambers wrote:
>> OK, here is a much simpler scene which demonstrates a similar speed
>> difference between 3.6 and 3.7. Both again rendered on a P4 2.8 GHz, no
>> antialiasing, this time at 512x384.
>
> thanks for the report, problems like this are one of the issues we still
> need to chase down - fortunately it's just a bug and nothing fundamental
> (there is no fundamental reason for 3.7 to be significantly* slower than
> 3.6, so in those cases where it is it's almost always either a bug or an
> area that we haven't finished working on).
BTW, here is another scene which attempts to replicate the problem using
reflection. Not only does it show that reflection is not significantly
slower in 3.7, in some cases reflection is even faster in 3.7 than 3.6!
So the bug is not directly related to recursive calls, but only to
transmittance.
...Chambers
Attachments:
Download 'us-ascii' (1 KB)
Chambers wrote:
> BTW, here is another scene which attempts to replicate the problem using
> reflection. Not only does it show that reflection is not significantly
> slower in 3.7, in some cases reflection is even faster in 3.7 than 3.6!
> So the bug is not directly related to recursive calls, but only to
> transmittance.
>
> ...Chambers
It gets even better. I went back to the scene with the concentric
spheres, and removed the IOR (so it's a bunch of transparent spheres,
and nothing else).
Here are the old and new timings:
// Values using IOR
// Recursion 1: 1s; 4s; 4x
// Recursion 5: 2s; 15s; 7.5x
// Recursion 10: 4s; 29s; 7.25x
// Recursion 20: 9s; 1m 15s (75s); 8.3x
// Recursion 50: 37s; 5m 9s (309s); 8.35x
// Values NOT using IOR
// Recursion 1: 0s; 1s; -
// Recursion 5: 2s; 2s; 1x
// Recursion 10: 4s; 5s; 1.2x
// Recursion 20: 9s; 10s; 1.1x
// Recursion 50: 44s; 44s; 1x
As you can see, the 8x speed decrease is only observed when refraction
(IOR other than 1.0) is used. When no IOR is used, the speed is
virtually identical.
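For anyone who wants to recheck the ratios, here is a small Python sketch that recomputes them from the two timing tables above. The names WITH_IOR, WITHOUT_IOR and slowdown are just labels chosen for this example; the numbers are copied from the tables, in seconds.

```python
# (recursion level, seconds in 3.6, seconds in 3.7), copied from the tables above
WITH_IOR = [(1, 1, 4), (5, 2, 15), (10, 4, 29), (20, 9, 75), (50, 37, 309)]
WITHOUT_IOR = [(1, 0, 1), (5, 2, 2), (10, 4, 5), (20, 9, 10), (50, 44, 44)]


def slowdown(rows):
    """Map each recursion level to t(3.7) / t(3.6).

    The ratio is None where the 3.6 time was reported as 0s,
    since no meaningful ratio can be formed from it.
    """
    return {level: (t37 / t36 if t36 > 0 else None) for level, t36, t37 in rows}


if __name__ == "__main__":
    for label, rows in (("with IOR", WITH_IOR), ("without IOR", WITHOUT_IOR)):
        for level, ratio in slowdown(rows).items():
            shown = "-" if ratio is None else f"{ratio:.2f}x"
            print(f"{label}, recursion {level}: {shown}")
```

The times are only reported to whole seconds, so the ratios for the short renders are coarse; the longer renders give the more trustworthy figures.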
...Chambers
Chambers wrote:
> As you can see, the 8x speed decrease is only observed when refraction
> (IOR other than 1.0) is used. When no IOR is used, the speed is
> virtually identical.
Ah, interesting. This is really useful information and will make tracking
this down a lot easier. Thank you very much!!!
Thorsten