Isn't it time for a new pov benchmark? (Message 1 to 10 of 47)
From: Warp
Subject: Isn't it time for a new pov benchmark?
Date: 20 Aug 2001 06:06:24
Message: <3b80e120@news.povray.org>
  The PovBench (http://www.haveland.com/povbench/) has been obsolete for a
long time now (several years, I would say). The skyvase scene is not slow
enough to measure the fastest current computers accurately, and besides, the
list is full of bogus entries. Even if the entries are genuine, two entries
with a rendering time of 3 seconds just don't tell which one is better or
whether they are equal.
  Moreover, using one simple scene to benchmark POV-Ray is too restrictive:
it measures just a very small percentage of POV-Ray features (e.g. it doesn't
measure heavy memory usage or parsing speed).

  Wouldn't it be time to make a renewed POV-Ray benchmark? A much better
benchmark, scaled to current computer speeds?
  This benchmark could have the following features:

  1. It has more than one scene file, each one benchmarking an important
area of rendering on its own: for example, one for raw raytracing speed, one
which takes a long time to parse, one which uses lots of memory, etc.

  2. These scene files should take a reasonable amount of time on current
computers. This amount should be chosen so that it will not drop to the order
of a couple of seconds in a few years. For example, they could take about 10
minutes each on a 1.2GHz Athlon.

  3. The submission of entries should be controlled. Of course it's difficult
to tell whether someone is making a bogus entry when the numbers are credible,
but there could be, for example, a "trusted entry system": entries from
trusted sources would get a mark showing that the entry is trusted. Entries
without this mark may or may not be true (and this should be clearly
stated on the page).

-- 
#macro N(D,I)#if(I<6)cylinder{M()#local D[I]=div(D[I],104);M().5,2pigment{
rgb M()}}N(D,(D[I]>99?I:I+1))#end#end#macro M()<mod(D[I],13)-6,mod(div(D[I
],13),8)-3,10>#end blob{N(array[6]{11117333955,
7382340,3358,3900569407,970,4254934330},0)}//                     - Warp -



From: Jérôme Grimbert
Subject: Re: Isn't it time for a new pov benchmark?
Date: 20 Aug 2001 06:57:21
Message: <3B80EDA5.409004B0@atosorigin.com>
Warp wrote:
> 
>   The PovBench (http://www.haveland.com/povbench/) has been obsolete for a
> long time now (several years, I would say). The skyvase scene is not slow
> enough to measure the fastest current computers accurately, and besides, the
> list is full of bogus entries. Even if the entries are genuine, two entries
> with a rendering time of 3 seconds just don't tell which one is better or
> whether they are equal.
>   Moreover, using one simple scene to benchmark POV-Ray is too restrictive:
> it measures just a very small percentage of POV-Ray features (e.g. it doesn't
> measure heavy memory usage or parsing speed).
> 
>   Wouldn't it be time to make a renewed POV-Ray benchmark? A much better
> benchmark, scaled to current computer speeds?
>   This benchmark could have the following features:
> 
>   1. It has more than one scene file, each one benchmarking an important
> area of rendering on its own: for example, one for raw raytracing speed, one
> which takes a long time to parse, one which uses lots of memory, etc.

Then you would have multiple metrics. IMNSHO, that's a bad thing, because
soon everyone will want their own test scene.
You should rather stick to a single scene, which should nevertheless
be a complex-looking and lovely one.
Rendering a set of animation frames may be an option, but it could easily
open Pandora's box with the #if/#switch statements.
Anyway, a full INI file should be provided with the scene,
and every possible setting should be explicitly set in that INI file.

BTW, the rendered size should be huge (a minimum of 1600x1200), with AA,
and the scene should not be biased toward any colour.
(Skyvase is far too blue!)
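
For instance, a settings file along these lines could accompany the scene (a
sketch only: the file name and the values are made up, and the real benchmark
would of course fix its own):

; bench.ini -- hypothetical INI file for the proposed benchmark scene;
; every option is spelled out so local povray.ini defaults cannot change it
Input_File_Name=bench.pov
Width=1600
Height=1200
Antialias=On
Antialias_Threshold=0.3
Antialias_Depth=3
Jitter=Off
Quality=9
Output_to_File=Off
Display=Off
Verbose=On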

> 
>   2. These scene files should take a reasonable amount of time on current
> computers. This amount should be chosen so that it will not drop to the order
> of a couple of seconds in a few years. For example, they could take about 10
> minutes each on a 1.2GHz Athlon.

That's way too short.
It should take at least two hours on a 1.4GHz Athlon with enough memory.
Provision should be made in the test protocol to change the scene again
when the reported times drop below 3 minutes on commonly available
hardware.

And it should reflect the particularities of POV (so that producing
the same picture in other, triangle-based software would be
very difficult with equally short code and the same smoothness).
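
For example, a few lines of pure CSG with a procedural texture (a rough sketch
only, not a proposal for the actual scene) already give something that is hard
to reproduce with equally short code and the same smoothness in a
triangle-based renderer:

// csg_sketch.pov -- hypothetical illustration of POV-specific strengths
camera { location <0, 1, -3> look_at <0, 0, 0> }
light_source { <10, 10, -10>, rgb 1 }
difference {
  sphere { <0, 0, 0>, 1 }                        // smooth analytic sphere
  box { <-1.1, 0.2, -1.1>, <1.1, 1.1, 1.1> }     // slice off the top
  cylinder { <-1.2, 0, 0>, <1.2, 0, 0>, 0.35 }   // drill a hole through it
  pigment { agate scale 0.4 }                    // 3D procedural texture
  finish { reflection 0.2 phong 0.7 }
}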


> 
>   3. The submission of entries should be controlled. Of course it's difficult
> to tell whether someone is making a bogus entry when the numbers are credible,
> but there could be, for example, a "trusted entry system": entries from
> trusted sources would get a mark showing that the entry is trusted. Entries
> without this mark may or may not be true (and this should be clearly
> stated on the page).

Easier, unless someone has decided to really forge it: have pov output
a CRC as well as the statistics, and only accept captures of
the pov output :-> If the CRC does not match, reject the entry :-<

Really, faked entries are of no concern for commonly available systems:
when a lot of P2/400MHz machines give a render time of x, an x/30 entry is
obviously faked.

Maybe the benchmark could summarize results per kind of processor, with
the min, max and average, so readers know what to expect from
a given kind of processor. (If the Celeron 333 figures were
1s/79s/119s, there would obviously be something strange, because a
ratio of more than 100 for the same processor is difficult to explain
by a change of the OS alone.)

Personally, I wouldn't bother with trusted/untrusted entries, nor
with any checking of values.



From: Christoph Hormann
Subject: Re: Isn't it time for a new pov benchmark?
Date: 20 Aug 2001 08:06:10
Message: <3B80FDBA.D4B09876@gmx.de>
That's a good idea, especially the idea of a whole benchmark suite, but
quite tricky on the other hand.  For example, radiosity is still considered
experimental and differs between 3.1 and 3.5.  A benchmark would not be
complete without radiosity (since the intensive use of memory in a
radiosity scene is very important for performance), but you would not be
able to compare results from Povray 3.1 and 3.5.  Porting Povray 3.5 to
all platforms 3.1 currently runs on will take quite some time, if it's
possible at all (since a C++ compiler is needed).

So what I would suggest is a general Povray 3.1 compatible benchmark scene
that's slower than skyvase and uses a reasonable amount of memory.

For an additional benchmark suite I would also use new features (and
radiosity), but only record results achieved with Povray 3.5 based compiles,
since even if it's possible to render the scenes with MegaPov, the results
won't be comparable.

Some benchmark scene concepts I would suggest:

- reflection scene (low memory use, but slow because of highly complicated
reflections)
- complex csg scene (high memory use)
- radiosity scene (high memory use and very random access to memory)
- isosurface scene (testing performance in function evaluation; see the
sketch after this list)
- a large mesh scene (preferably macro generated, otherwise a large scene
file would be needed)
- a texture scene (basic geometry but slow textures, it would probably be
difficult to get this slow enough)
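
As a rough illustration of the isosurface idea (a sketch only, in POV-Ray 3.5
syntax; the function, container and max_gradient are made up, and a real
benchmark scene would use a much more expensive function):

// isosurface_sketch.pov -- hypothetical example, not a proposed benchmark scene
camera { location <0, 1.5, -3> look_at <0, 0, 0> }
light_source { <5, 10, -10>, rgb 1 }
isosurface {
  // a sphere perturbed by a product of sines; the root solver evaluates
  // this function over and over while marching along each ray
  function { x*x + y*y + z*z - 1 + 0.3*sin(10*x)*sin(10*y)*sin(10*z) }
  contained_by { box { <-1.5, -1.5, -1.5>, <1.5, 1.5, 1.5> } }
  max_gradient 10
  pigment { rgb <0.8, 0.7, 0.5> }
  finish { phong 0.5 }
}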

Christoph

-- 
Christoph Hormann <chr### [at] gmxde>
IsoWood include, radiosity tutorial, TransSkin and other 
things on: http://www.schunter.etc.tu-bs.de/~chris/



From: Warp
Subject: Re: Isn't it time for a new pov benchmark?
Date: 20 Aug 2001 08:44:38
Message: <3b810636@news.povray.org>

: Then you would have multiple metrics. IMNSHO, that's a bad thing, because
: soon everyone will want their own test scene.
: You should rather stick to a single scene, which should nevertheless
: be a complex-looking and lovely one.

  I'm not sure I understand what you mean.

  The idea with several scenes is that they measure different aspects of
the raytracing process. Of course the results of each scene should be
shown separately, although the total score could be the total sum of times.
  Of course the rendering time of all the scenes should be approximately
equal so that the results don't get biased towards one of them (i.e. the one
which takes longest to render).
  There could be at least 4 different test files:

  1) A test for parsing speed: A pov-file which takes most of the time in
the parsing stage (doing something useful, not just idle loops) and a minimal
amount of time in the rendering part. This could be, for example, a scene
which places a few thousand spheres according to some very complicated
algorithm. The scene could take some memory at parsing time, but not at
render time (a minimal sketch of this idea follows after this list).

  2) A test for heavy memory usage: The scene should take a considerable
amount of memory which is actively used while rendering. This could be
achieved with lots of light sources, a huge number of objects, and so on.

  3) A test for raw raytracing speed: Parsing time and memory usage should
be negligible (i.e. parsing takes just a few seconds at most and the scene
uses hardly any memory), but the raytracing part takes most of the time.

  4) A scene which combines all three.
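
  For illustration only, here is a minimal sketch of the kind of scene meant
in point 1 (the loop count and placement formula are made up; a real test file
would need a far more expensive algorithm to keep the parser busy for minutes
rather than seconds):

// parse_sketch.pov -- hypothetical example: nearly all the time goes into
// the #while loop below, while the render (small plain spheres) stays cheap
#declare R1 = seed(1234);
#declare I = 0;
#while (I < 5000)
  #declare Ang = I * 137.5;              // spiral-like placement angle
  #declare Rad = sqrt(I) * 0.08;
  sphere {
    < Rad*cos(radians(Ang)),
      0.5*sin(I*0.01) + 0.2*rand(R1),
      Rad*sin(radians(Ang)) >, 0.05
    pigment { rgb < mod(I,7)/6, mod(I,11)/10, mod(I,13)/12 > }
  }
  #declare I = I + 1;
#end
camera { location <0, 8, -14> look_at <0, 0, 0> }
light_source { <20, 30, -20>, rgb 1 }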

  As I said, the total times of the four scenes should be approximately
equal so that the results don't get biased.

  The idea behind this is that some computers are better at one of those tasks
than at others, and this kind of test gives a good idea of what a given
computer is good at.
  One single scene can't test all of those things, and even if it does
(as in the 4th example), one can't see how well the computer performs in
the individual tasks.

: That's way too short. 
: It should take at least two hours on a 1.4GHz Athlon with enough memory.

  I think two hours is a bit overkill. Granted, 10 minutes may be too short,
but I think 2 hours is too much. I don't think people want to wait for hours
for this.
  Or perhaps if 4 pov-files are used, the total time could be about 2 hours,
which would mean a half-hour per file.

: Easier, unless someone has decided to really forge it: have pov output
: a CRC as well as the statistics, and only accept captures of
: the pov output :-> If the CRC does not match, reject the entry :-<

  That kind of CRC can be easily faked.

: Really, faked entries are of no concern for commonly available systems:
: when a lot of P2/400MHz machines give a render time of x, an x/30 entry is
: obviously faked.

  Of course, but I suppose that many people are tempted to shave a small but
unnoticeable amount off the real rendering times (e.g. if the real rendering
time was 35 minutes, they could just report 31 minutes and no one would
notice). Unfortunately many people are like this; even if they
don't get anything from that (not even their name anywhere), they still tend
to "exaggerate" a bit to look better.

: Personally, I wouldn't bother with trusted/untrusted entries, nor
: with any checking of values.

  Then there will just be a lot of bogus entries as in the current pov-bench,
which effectively destroys its usefulness.
  Even with the "average" thinking, absolutely no checking will still cause
problems. If the real average for a certain CPU is 1 hour and someone reports
a rendering time of 1 second 20 times, that produces a substantial change in
the average (with 80 honest one-hour entries and 20 bogus one-second entries,
the reported average drops to roughly 48 minutes).
  So there should definitely be some kind of checking. And I still think that
the "trusted source" method is good.

-- 
#macro N(D,I)#if(I<6)cylinder{M()#local D[I]=div(D[I],104);M().5,2pigment{
rgb M()}}N(D,(D[I]>99?I:I+1))#end#end#macro M()<mod(D[I],13)-6,mod(div(D[I
],13),8)-3,10>#end blob{N(array[6]{11117333955,
7382340,3358,3900569407,970,4254934330},0)}//                     - Warp -



From: Ken
Subject: Re: Isn't it time for a new pov benchmark?
Date: 20 Aug 2001 08:53:59
Message: <3B810900.989186FB@pacbell.net>
Christoph Hormann wrote:

> - radiosity scene (high memory use and very random access to memory)

High memory use would be bad. You can't know how much memory every
computer has installed, and if someone runs out of memory and has
to resort to disk swapping, their render time will no longer be
meaningful. There are still people running machines with much less
than 128 megs, and even those with 128 megs installed have
the OS using up much of that.

I suggest the scene not use much more than 30 megs or so of memory.

-- 
Ken Tyler



From: Ken
Subject: Re: Isn't it time for a new pov benchmark?
Date: 20 Aug 2001 08:56:58
Message: <3B8109B5.41EE184@pacbell.net>
Warp wrote:
> 

> : Then you would have multiple metrics. IMNSHO, that's a bad thing, because
> : soon everyone will want their own test scene.
> : You should rather stick to a single scene, which should nevertheless
> : be a complex-looking and lovely one.
> 
>   I'm not sure I understand what you mean.

There should be only one scene file. You are making it much too
complicated.

-- 
Ken Tyler



From: Christoph Hormann
Subject: Re: Isn't it time for a new pov benchmark?
Date: 20 Aug 2001 09:18:42
Message: <3B810EBC.746ACEBF@gmx.de>
Ken wrote:
> 
> High memory use would be bad. You can't know how much memory every
> computer has installed, and if someone runs out of memory and has
> to resort to disk swapping, their render time will no longer be
> meaningful. There are still people running machines with much less
> than 128 megs, and even those with 128 megs installed have
> the OS using up much of that.
> 
> I suggest the scene not use much more than 30 megs or so of memory.

That's about what I thought; the main emphasis should lie on testing the
speed of memory access, and a scene of only 400k will not fulfil this
purpose (since most of the data will fit in the cache on many systems).

Christoph

-- 
Christoph Hormann <chr### [at] gmxde>
IsoWood include, radiosity tutorial, TransSkin and other 
things on: http://www.schunter.etc.tu-bs.de/~chris/



From: Jérôme Grimbert
Subject: Re: Isn't it time for a new pov benchmark?
Date: 20 Aug 2001 09:32:27
Message: <3B8111FD.BA95FF6D@atosorigin.com>
Ken wrote:
> 
> Warp wrote:
> >

> > : Then you would have multiple metrics. IMNSHO, that's a bad thing, because
> > : soon everyone will want their own test scene.
> > : You should rather stick to a single scene, which should nevertheless
> > : be a complex-looking and lovely one.
> >
> >   I'm not sure I understand what you mean.
> 
> There should be only one scene file. You are making it much too
> complicated.

Exactly, Ken :-)

The scene file should produce a beautiful image
          (so as to astonish the new users),
And it should be long enough to parse & render (I stick to my 2 hours on 
 a 1.4GHz Athlon (*))
  (at least, it should start to teach patience** to the new users,
 hence, it must be a beautiful image which looks complicated)

*: remember, it's 2 hours for a huge size of at least 1600x1200 with AA.
*, also: when a dual Athlon with 2GHz per CPU becomes available
  (dual Athlons exist already, 2GHz probably in a few months/years), and hoping
  a port of povray 3.5+ (4.0 ??) will be multiprocessor capable
 (using the full power, with threads and so on), the LONG 2 hours would be
 reduced to only about 40 minutes,
and on a server with a quad mobo it would only take about 20 minutes...
Try a Beowulf cluster of 64 inexpensive 1GHz machines (by that time),
and the theoretical rendering time drops to less than 3 minutes!
If you do the computation with a shorter scene, you will find yourself
with a precision problem sooner than you would have expected.

** I must confess, I started with DKBtrace 2.12 a long time ago,
and it took more than one week to render ntreal on a poor CPU (without an FPU).
(No hard disk either, everything was on floppy!) It was a time when one had
better think the scene through rather than try multiple previews.
Mosaic preview was a great thing then!



From: Ron Parker
Subject: Re: Isn't it time for a new pov benchmark?
Date: 20 Aug 2001 09:51:03
Message: <slrn9o25e8.ft8.ron.parker@fwi.com>

>And it should be long enough to parse & render (I stick to my 2 hours on 
> a 1.4GHz Athlon (*))

But that would prevent people with real-world processors from participating
at all.  I hate to think how long it would take this P200 to run the 
benchmark, for example, but even a Duron 650 would have a tough time of it,
and that's a current processor.  I'm sure I'm not the only one who doesn't
expect to break the gigahertz barrier for at least another year or two.

-- 
#macro R(L P)sphere{L F}cylinder{L P F}#end#macro P(V)merge{R(z+a z)R(-z a-z)R(a
-z-z-z a+z)torus{1F clipped_by{plane{a 0}}}translate V}#end#macro Z(a F T)merge{
P(z+a)P(z-a)R(-z-z-x a)pigment{rgbf 1}hollow interior{media{emission 3-T}}}#end 
Z(-x-x.2x)camera{location z*-10rotate x*90normal{bumps.02scale.05}}



From: Adrien Beau
Subject: Re: Isn't it time for a new pov benchmark?
Date: 20 Aug 2001 10:05:29
Message: <3B81192F.E55C2036@sycomore.fr>
Warp wrote:
> 
>   2. These scene files should take a reasonable amount of time on current
> computers. This amount should be chosen so that it will not drop to the order
> of a couple of seconds in a few years. For example, they could take about 10
> minutes each on a 1.2GHz Athlon.

Not enough, as others noted. I remember seeing entries of
several minutes for early Pentiums (that's the last time I
checked the bench pages) and several hours for a 386.

I think nowadays it would be half an hour or more for around
1 GHz processors.

-- 
Adrien Beau - adr### [at] freefr - http://adrien.beau.free.fr
 My statements are my own and in no way those of my employers


