POV-Ray: Newsgroups: povray.unix: povbench

From: Thomas Willhalm
Subject: Re: povbench
Date: 12 Jun 2002 10:30:50
Message: <3d075b1a@news.povray.org>

Thorsten Froehlich wrote:

> In article <3d05de17@news.povray.org> , Thomas Willhalm
> <tho### [at] uni-konstanzde>  wrote:
> 
>> icc 6 IV    -O3 -tpp7  -xW -unroll -ip
> 
> How about adding any one of these: "-ipo", "-wp_ipo", "-prefetch", "-rcd"?

For some strange reason, icc (version 6) doesn't recognize the option 
"-prefetch" although it is listed in the documentation.

If I compile with "-rcd", I get a segmentation fault when I try to run the 
program. This doesn't necessarily mean that icc is buggy; it may just as well 
be that some of the experimental code in megapovplus is not clean.

So, I've compiled it with -wp_ipo and should be able to post the result 
tomorrow morning.

By the way, I tried to avoid dependencies on the OS-provided libraries by 
switching off the display and file output.

Thomas

From: Thomas Willhalm
Subject: Re: povbench
Date: 12 Jun 2002 10:36:05
Message: <3d075c55@news.povray.org>

Ole Laursen wrote:

> "Thorsten Froehlich" <tho### [at] trfde> writes:
>> In article <87l### [at] bachcomposers> , Ole Laursen
>> 
>> This is under Linux, of course.  It may just be a library issue that
>> doesn't exist under Windows.  gcc surely is not such a good compiler...
> 
> This is povray.unix, so who cares how ICC performs on Windows? Get a
> real operating system. :-)

Didn't Thorsten work on a Mac, anyway? :-)
 
> Anyway, AFAIK ICC doesn't include a C library, so they use the same
> libraries.

There are some dependencies:
$ ldd megapovplus
        libvgagl.so.1 => /usr/lib/libvgagl.so.1 (0x40026000)
        libvga.so.1 => /usr/lib/libvga.so.1 (0x40035000)
        libz.so.1 => /lib/libz.so.1 (0x4008d000)
        libpng.so.2 => /usr/lib/libpng.so.2 (0x4009c000)
        libm.so.6 => /lib/libm.so.6 (0x400ce000)
        libX11.so.6 => /usr/X11R6/lib/libX11.so.6 (0x400f0000)
        libcxa.so.1 => /opt/intel/compiler60/ia32/lib/libcxa.so.1 (0x401b1000)
        libc.so.6 => /lib/libc.so.6 (0x4021f000)
        libdl.so.2 => /lib/libdl.so.2 (0x40345000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
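
To pick out the libraries that come from the compiler rather than from the OS, the ldd output can be filtered by path. This is only a minimal sketch: the helper name `non_system_libs` and the assumption that "system" means anything under /lib or /usr are mine, not something the distribution defines.

```sh
#!/bin/sh
# non_system_libs: read `ldd` output on stdin and print the paths of
# libraries that do NOT live under /lib or /usr.
# Hypothetical helper; the two "system" prefixes are an assumption and
# may need adjusting for other distributions.
non_system_libs() {
    awk '$2 == "=>" && $3 !~ "^/(lib|usr)/" { print $3 }'
}

# With a real binary:  ldd ./megapovplus | non_system_libs
# Sample input copied from the listing above:
non_system_libs <<'EOF'
        libm.so.6 => /lib/libm.so.6 (0x400ce000)
        libcxa.so.1 => /opt/intel/compiler60/ia32/lib/libcxa.so.1 (0x401b1000)
        libc.so.6 => /lib/libc.so.6 (0x4021f000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
EOF
```

For the sample input only the libcxa.so.1 path is printed, i.e. libc and libm are still the ones shipped with the OS.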

Thomas

From: Alessandro Coppo
Subject: Re: povbench
Date: 12 Jun 2002 16:12:18
Message: <3d07ab22@news.povray.org>

Ole Laursen wrote:

> It was rendering time in seconds - and you probably want shorter
> rendering times, right? :-)

I am currently doing a lot of benchmarking work and everything is starting 
to look like "loops per second"... even my food!

-- 
Alessandro Coppo
a.coppo@<REMOVE_ME>iol.it
www.geocities.com/alexcoppo

From: Thomas Willhalm
Subject: Re: povbench
Date: 13 Jun 2002 03:14:27
Message: <3d084652@news.povray.org>

Thomas Willhalm wrote:
>
> So, I've compiled it with -wp_ipo and should be able to post the result
> tomorrow morning.

The rendering times were almost the same - even slightly longer. 
(IMO a variation of one or two percent should be accepted.)
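
As a rough illustration of that tolerance, two timings can be compared like this. It is a sketch only: the function names are made up for the example, the 2% default is simply the upper end of "one or two percent", and the numbers at the bottom are invented, not measurements from this thread.

```sh
#!/bin/sh
# pct_change OLD NEW: signed change of NEW relative to OLD, in percent,
# with two decimals (positive means NEW took longer).
pct_change() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.2f\n", (new - old) * 100 / old }'
}

# within_noise OLD NEW [TOL]: succeed if the absolute change is at most
# TOL percent. TOL defaults to 2 (an assumption, see the text above).
within_noise() {
    awk -v old="$1" -v new="$2" -v tol="${3:-2}" 'BEGIN {
        d = (new - old) * 100 / old
        if (d < 0) d = -d
        exit !(d <= tol)
    }'
}

# Invented timings, for illustration only:
pct_change 1000 1015          # prints 1.50
if within_noise 1000 1015; then
    echo "difference is within run-to-run noise"
fi
```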

Thomas

From: Vadim Sytnikov
Subject: Re: povbench
Date: 13 Jun 2002 04:37:49
Message: <3d0859dd@news.povray.org>

"Thorsten Froehlich" <tho### [at] trfde> wrote:
> This is under Linux, of course.  It may just be a library issue that
> doesn't exist under Windows.  gcc surely is not such a good compiler...

... but not such a bad compiler, either. In general, VC surpasses GCC in
terms of code quality (especially fp), of course, but consider this sample
code (I stumble upon such things, at times, as I actively use both gcc and
VC++), gcc 2.95.3-5 vs. MSC/C++ 12.00.8168 (VS 6.0):

----- test.c:

typedef struct { short a; char b, c; } D;

D func( void )
{
  D d = { 1, 2, 3 };

  return d;
}

----- test.bat:

@echo off
 gcc -c -S -O2 -fomit-frame-pointer test.c
 cl -c -FA -Ox -nologo test.c

----- test.s:

_func:
 movl $50462721,%eax
 ret

----- test.asm (_d$ = -4):

_func PROC NEAR
 push ecx
 mov WORD PTR _d$[esp+4], 1
 mov BYTE PTR _d$[esp+6], 2
 mov BYTE PTR _d$[esp+7], 3
 mov eax, DWORD PTR _d$[esp+4]
 pop ecx
 ret 0
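
For reference, the constant in the gcc output is just the three initialisers packed into one 32-bit word in little-endian order, so gcc folds the whole struct into a single immediate load. A small sketch to check the arithmetic (the helper name `pack_d` is made up for this example):

```sh
#!/bin/sh
# pack_d A B C: pack a 16-bit field `a` and two 8-bit fields `b` and `c`
# the way a little-endian x86 compiler lays out
#   struct { short a; char b, c; }
# i.e. a in bits 0-15, b in bits 16-23, c in bits 24-31.
pack_d() {
    echo $(( ($1 & 0xFFFF) | (($2 & 0xFF) << 16) | (($3 & 0xFF) << 24) ))
}

pack_d 1 2 3                          # prints 50462721
printf '0x%08x\n' "$(pack_d 1 2 3)"   # prints 0x03020001
```

The decimal value matches the `movl $50462721,%eax` in test.s above.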

From: Kenneth Johansson
Subject: Re: povbench
Date: 14 Jun 2002 17:42:18
Message: <pan.2002.06.14.21.42.18.431811.1955@canit.se>

On Wed, 12 Jun 2002 10:40:12 +0200, Thomas Willhalm wrote:

> Thorsten Froehlich wrote:
> 
>> In article <3d05de17@news.povray.org> , Thomas Willhalm
>> <tho### [at] uni-konstanzde>  wrote:
>> 
>>> finally, I've found the time to compare the different compilations of
>>> povray on a Pentium IV.  I used megapovplus and modified povbench.pov
>>> from povray 3.5 beta to run on it.
>>>
>>> Running time in seconds:
>>>                P-IV   Athlon
>>> gcc 2.95.3    13354    7035
>>> gcc 3.0.1     11319    6555
>>> gcc 3.1        8971    5901
>>> icc 6         15907    5679
>>> icc 6 IV      10589
>> 
>> What are the results of the Windows version on the same system?
> 
> I'm sorry. Windows isn't installed on any of these computers. (Well, to
> be completely honest, there is Vmware running Windows NT4 on the Athlon,
> but IMHO a benchmark of this won't tell us anything.)
> 
> Thomas
 
VMware does have very low overhead for some types of programs.
If you render without output, the only overhead you get is from page faults,
and that should not cost more than 1-4 percent of native speed. 

Once you put things on the screen or use system functions like writing to
or reading from disk, you lose.

From: Spider
Subject: Re: povbench
Date: 26 Jun 2002 00:05:23
Message: <pan.2002.06.26.04.03.56.855358.1106@gentoo.org>

These numbers were quite interesting; could you please run one more? 
Try CFLAGS="-march=athlon-xp -O3 -finline-functions -ffast-math 
-foptimize-sibling-calls  -ansi -march=i686 -DCPU=686" for the athlon-xp
and see how much it differs from the output with -march=i686.

gcc 3.0.x had -march=athlon.
gcc 3.1 had -march=athlon-tbird, athlon-xp, athlon-mp and athlon-4. I
haven't dug any deeper to see just what is different between them, except
for some submodel changes.


Another thing to test would be -mfpmath="sse,387", which will attempt
to use both the SSE and the i387 FP units at the same time, thus
doubling(!) the number of registers accessible.  I've got a slight feeling
this may do some interesting things for applications like POV.

also, since we're not using debugging here, it should be considered to use
-fomit-frame-pointer on gcc, thus freeing up another register, not always
desirable or noticeable in desktop applications, but this is a "special
case" so it should be ok :)
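
Since the SSE half of -mfpmath="sse,387" only makes sense on a CPU that actually has SSE, the choice could be made from the CPU feature list. A minimal sketch, assuming a Linux-style "flags" line as found in /proc/cpuinfo; the helper name `fpmath_flag` is made up, and it checks nothing beyond the presence of the "sse" feature.

```sh
#!/bin/sh
# fpmath_flag CPUFLAGS: choose a gcc -mfpmath setting from a
# space-separated list of CPU feature flags.
# Hypothetical helper: it only looks for the exact word "sse".
fpmath_flag() {
    case " $1 " in
        *" sse "*) echo '-mfpmath=sse,387' ;;
        *)         echo '-mfpmath=387' ;;
    esac
}

# On Linux the real feature list can be read with:
#   awk -F: '/^flags/ { print $2; exit }' /proc/cpuinfo
# Literal examples (feature lists shortened for illustration):
fpmath_flag 'fpu mmx sse 3dnow'   # prints -mfpmath=sse,387
fpmath_flag 'fpu mmx'             # prints -mfpmath=387
```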


Regards,
  Spider
    ...a memory...



On Tue, 11 Jun 2002 13:26:34 +0200, Thomas Willhalm wrote:

> Hello,
> 
> finally, I've found the time to compare the different compilations of 
> povray on a Pentium IV.  I used megapovplus and modified povbench.pov from 
> povray 3.5 beta to run on it.
> 
> Running time in seconds:
>                P-IV   Athlon
> gcc 2.95.3    13354    7035
> gcc 3.0.1     11319    6555
> gcc 3.1        8971    5901
> icc 6         15907    5679
> icc 6 IV      10589
> 
> "P-IV" is a Intel(R) Pentium(R) 4 CPU 1.60GHz
> running SuSE Linux with kernel 2.4.16-4GB
> 
> "Athlon" is a AMD Athlon(TM) XP 1500+ (1343.051 MHz)
> running SuSE Linux with kernel 2.4.10-4GB
> 
> Compiling options were:
> gcc 2.95.3  -O3 -finline-functions -ffast-math -ansi -march=i686 -DCPU=686
> gcc 3.0.1   -O3 -finline-functions -ffast-math -foptimize-sibling-calls 
> -ansi -march=i686 -DCPU=686
> gcc 3.1     -O3 -finline-functions -ffast-math -foptimize-sibling-calls 
> -ansi -march=i686 -DCPU=686
> icc 6       -O3 -tpp6  -xK -unroll -ip
> icc 6 IV    -O3 -tpp7  -xW -unroll -ip
> 
> The last version is optimized for Pentium-IV. That's why the binary doesn't 
> run on the Athlon.
> 
> Best regards
> Thomas

-- 
begin  .signature
This is a .signature virus! Please copy me into your .signature!
See Microsoft KB Article Q265230 for more information.
end

From: Thomas Willhalm
Subject: Re: povbench
Date: 1 Jul 2002 03:59:54
Message: <3d200bf9@news.povray.org>

Spider wrote:

> these numbers were quite interesting, could you please run one more?
> Try CFLAGS="-march=athlon-xp -O3 -finline-functions -ffast-math
> -foptimize-sibling-calls  -ansi -march=i686 -DCPU=686" for the  athlon-xp
>  and see if it differs some or more from the output with march=i686
 
> another thing to be tested would be -mfpmath="sse,387" which will attempt

> also, since we're not using debugging here, it should be considered to use
> -fomit-frame-pointer on gcc, thus freeing up another register, not always
> desirable or noticeable in desktop applications, but this is a "special
> case" so it should be ok :)

Good points, at least the running time says so:
Running time in seconds:

gcc 2.95.3 7048s
gcc 3.0.1  6574s
gcc 3.1    5908s
gcc 3.1    5749s (new options)
icc 6      5699s

For the record: it's an AMD Athlon(TM) XP 1500+ (1343.051 MHz)
running SuSE Linux with kernel 2.4.10-4GB

Compiling options were:
gcc 2.95.3  
 -O3 -finline-functions -ffast-math -ansi -march=i686 -DCPU=686
gcc 3.0.1   
 -O3 -finline-functions -ffast-math -foptimize-sibling-calls -ansi 
 -march=i686 -DCPU=686
gcc 3.1     
 -O3 -finline-functions -ffast-math -foptimize-sibling-calls
 -ansi -march=i686 -DCPU=686
gcc 3.1 (new options)
 -march=athlon-xp -O3 -finline-functions -ffast-math
 -foptimize-sibling-calls -DCPU=686 -mfpmath="sse,387" -fomit-frame-pointer
icc 6       
 -O3 -tpp6  -xK -unroll -ip
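
To put the running times listed above in relation to each other, each one can be expressed as a slowdown relative to the fastest build. A small sketch; the helper name `relative_to_best` and the tab-separated input format are choices made for this example only.

```sh
#!/bin/sh
# relative_to_best: read "label<TAB>seconds" lines on stdin and print each
# label with its time and its slowdown relative to the fastest entry.
relative_to_best() {
    awk -F'\t' '
        {
            label[NR] = $1
            secs[NR]  = $2 + 0
            if (NR == 1 || secs[NR] < best) best = secs[NR]
        }
        END {
            for (i = 1; i <= NR; i++)
                printf "%-22s %5d s  +%.1f%%\n", label[i], secs[i], (secs[i] - best) * 100 / best
        }
    '
}

# Athlon XP timings from the list above.
printf '%s\t%s\n' \
    'gcc 2.95.3' 7048 \
    'gcc 3.0.1' 6574 \
    'gcc 3.1' 5908 \
    'gcc 3.1 (new options)' 5749 \
    'icc 6' 5699 | relative_to_best
```

With these numbers, gcc 3.1 with the new options ends up about 0.9% behind icc 6, compared with about 3.7% for gcc 3.1 with the old options.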

Best regards 
Thomas

From: Spider
Subject: Re: povbench
Date: 1 Jul 2002 05:51:38
Message: <20020701114927.295a9ccb.spider@gentoo.org>

begin  quote
On Mon, 01 Jul 2002 10:01:17 +0200
Thomas Willhalm <tho### [at] uni-konstanzde> wrote:

>  
> > also, since we're not using debugging here, it should be considered
> > to use -fomit-frame-pointer on gcc, thus freeing up another register,
> > not always desirable or noticeable in desktop applications, but this
> > is a "special case" so it should be ok :)
> 
> Good points, at least the running time says so:
> Running time in seconds:
> 
> gcc 2.95.3 7048s
> gcc 3.0.1  6574s
> gcc 3.1    5908s
> gcc 3.1    5749s (new options)
> icc 6      5699s
> 

Hmm, that's an interesting change in runtime. Still not down at icc's
level, which may not be possible either, but it's definitely closing in
here :)

since both gcc 3.1 and ICC support Profile Guided Optimization, that
could be another interesting thing to do tests on, although this may
border on doing it merely to get the most possible instead of doing it
for the usability of it ;)

Here's an excerpt from our ebuild where we use icc PGO and normal icc.
I'm not editing this since the comments may be nice for others who
follow this thread. Please note, this is not copyrighted by me, but
GPL'ed (cute, isn't it), where the copyright belongs to Gentoo Technologies Inc.

if [ "`use icc`" ]; then
    # ICC CFLAGS
    echo "s/gcc/icc/" >> makefile.sed

    # Should pull from /etc/make.conf
    # If you have a P4 add -tpp7 after the -O3
    # If you want lean/mean replace -axiMKW with -x? (see icc docs for -x)
    # Note: -ipo breaks povray
    # Note: -ip breaks povray on a P3
    echo "s/^CFLAGS =/CFLAGS = -O3 -axiMKW /" >> makefile.sed
    # This is optimized for my Pentium 2:
    #echo "s/^CFLAGS =/CFLAGS = -O3 -xM -ip /" >> makefile.sed
    # This is optimized for Pentium 3 (semi-untested, I don't own one):
    #echo "s/^CFLAGS =/CFLAGS = -O3 -xK /" >> makefile.sed
    # This is optimized for Pentium 4 (untested, I don't own one):
    #echo "s/^CFLAGS =/CFLAGS = -O3 -xW -ip -tpp7 /" >> makefile.sed

    if [ "`use icc-pgo`" ]; then
        IPD=${BUILDDIR}/icc-pgo
        echo "s:^CFLAGS =:CFLAGS = -prof_dir ${IPD} :" >> makefile.sed
        if [ ! -d "${IPD}" ]; then
            mkdir -m 777 -p ${IPD}
            echo "s/^CFLAGS =/CFLAGS = -prof_gen /" >> makefile.sed
        else
            echo "s/^CFLAGS =/CFLAGS = -prof_use /" >> makefile.sed
        fi
    fi
else
    # GCC CFLAGS
    echo "s/^CFLAGS =/CFLAGS = -finline-functions -ffast-math /" >> makefile.sed
    echo "s/^CFLAGS =/CFLAGS = ${CFLAGS} /" >> makefile.sed
fi
sed -f makefile.sed makefile.orig > makefile

Well, this should be pretty much self-explanatory. BUILDDIR is where the
profile data is stored, along with our compile-time data.
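
The two-pass idea behind the icc-pgo part can be shown in isolation: the first build instruments the binary, later builds consume the recorded profile. This is only a sketch of that decision. The flag names are the ones quoted in the ebuild above; the function name `pgo_flags` and the temporary directory are made up for the example, and no compiler is invoked.

```sh
#!/bin/sh
# pgo_flags DIR: print the icc flags for the current pass of a two-pass
# profile-guided build, mirroring the ebuild logic above.
#   first call  (DIR missing): create DIR and instrument with -prof_gen
#   later calls (DIR exists):  use the recorded profile with -prof_use
pgo_flags() {
    dir=$1
    if [ ! -d "$dir" ]; then
        mkdir -p "$dir" || return 1
        echo "-prof_dir $dir -prof_gen"
    else
        echo "-prof_dir $dir -prof_use"
    fi
}

# Demonstration in a throwaway directory.
tmp=$(mktemp -d) || exit 1
pgo_flags "$tmp/icc-pgo"    # first pass:  ends in -prof_gen
pgo_flags "$tmp/icc-pgo"    # second pass: ends in -prof_use
rm -rf "$tmp"
```

Between the two passes the instrumented binary has to be run on a representative scene, otherwise there is no profile for the second build to use.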

I re-edited this so that it doesn't get wrapped too badly in mail, but it
still needs to be checked for wrapped lines.

//Spider

--
begin  .signature
This is a .signature virus! Please copy me into your .signature!
See Microsoft KB Article Q265230 for more information.
end

From: Rob Brown-Bayliss
Subject: Re: povbench
Date: 4 Jul 2002 17:38:02
Message: <3d24c03a@news.povray.org>

On Wed, 12 Jun 2002 07:18:23 +1200, Nicolas Calimet wrote:

> 	Interesting. Looks like the last gcc is worth installing.
> Well, I would have been glad to see gcc-3.0.4 instead of 3.0.1  :o) Do
> you have any idea about the advantage of -foptimize-sibling-calls ? And
> what is icc by the way ?

Hi all, I just installed gcc 3.1 and rebuilt povray 3.1 (from a rh7.2 src
rpm package).

On a scene I am playing with, the render time is 124 seconds; the previous
version built with gcc 3.0.4 took 144 seconds.  It's still a small scene, but
that's roughly a 14% shorter render time on a PII300.  

Just another benefit of open source software...
