POV-Ray: Newsgroups: povray.unix: Pentium 3 optimized binary



From: Mark Gordon
Subject: Re: Pentium 3 optimized binary
Date: 3 Aug 2002 07:43:45
Message: <pan.2002.08.03.11.47.46.301917.2129@povray.org>

On Fri, 02 Aug 2002 20:58:33 -0400, Spider wrote:

> export CFLAGS="optimziation"

Are you quite sure on your spelling, there? ;-)

-Mark Gordon


From: Roz
Subject: Re: Pentium 3 optimized binary
Date: 3 Aug 2002 13:39:43
Message: <3D4C15A3.7040006@netscape.net>

Apache wrote:
> If people continue making algorithmic improvements that are compiler and
> platform independent, wouldn't it be a good thing to keep track of those
> changes and work on a 3.5.1 or a 3.6 version?
> 
> 

Yea and it seems like the p.programming newsgroup helps with that.
You've probably already noticed it but if not, Micha has posted the
patch he was talking about to that newsgroup.

-Roz


From: Christopher James Huff
Subject: Re: Pentium 3 optimized binary
Date: 3 Aug 2002 14:13:21
Message: <chrishuff-373F74.13041003082002@netplex.aussie.org>

In article <3d4b8aec@news.povray.org>, Micha Riser <mri### [at] gmxnet> 
wrote:

> You probably mean  Sqr((sin(EPoint[X])+sin(EPoint[Y])+sin(EPoint[Z]))/3);

Yeah, I screwed up the parentheses and the ".0" isn't necessary.

> Temporary variables mostly do not affect the performance though. The 
> has to use several registers anyways.

I know, and a good compiler would probably optimize it to the same 
machine code, but there is no reason to use them, not even readability. 
(Do modern compilers even pay any attention to "register"?)
My point was it doesn't conform to any kind of "strict guidelines" other 
than maximum compatibility (it was apparently for compiling with an old 
386 compiler). The "noise" variable I just don't understand...I guess it 
helped some compiler with optimization or something.

> But to show the vector nature of this calculation it should be 
> written as:
> DBL value=0;
> for(int i=0; i<3; i++) value+=sin(EPoint[i]);
> return Sqr(value/3.0);
> 
> Of course this assumes that X,Y,Z are 0-2 index.

You mean for helping the compiler detect something that can be 
vectorized and doing it automatically? It won't pick out the possibility 
in the one-line version?
Would it get this version?
DBL value = 0;
value += sin(EPoint[0]);
value += sin(EPoint[1]);
value += sin(EPoint[2]);
return Sqr(value/3.0);

Or this (assuming a struct with x, y, and z components):
value += sin(EPoint.x);
value += sin(EPoint.y);
value += sin(EPoint.z);

I'm not surprised the for loop wasn't used in the existing 
version...SIMD stuff wasn't even a factor, so of course the code wasn't 
designed for it, and compilers probably weren't good enough at 
optimizing to get rid of the for() loop.
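
For reference, the three forms discussed above can be put side by side. This is only a sketch: it assumes DBL is a typedef for double, that EPoint is a plain array of three DBLs indexed 0-2, and it uses a stand-in Sqr() helper; the function names are hypothetical and are not taken from the POV-Ray source. All three compute the same value, so any difference is in what the compiler makes of them.

```cpp
#include <cassert>
#include <cmath>

typedef double DBL;  // assumption: DBL is double precision

// Stand-in for the Sqr() helper used in the thread (assumed to square its argument).
static inline DBL Sqr(DBL x) { return x * x; }

// One-line form: sum of the three sines, averaged, then squared.
DBL pattern_one_line(const DBL EPoint[3]) {
    return Sqr((std::sin(EPoint[0]) + std::sin(EPoint[1]) + std::sin(EPoint[2])) / 3.0);
}

// Loop form: makes the per-component nature of the calculation explicit.
DBL pattern_loop(const DBL EPoint[3]) {
    DBL value = 0;
    for (int i = 0; i < 3; i++) value += std::sin(EPoint[i]);
    return Sqr(value / 3.0);
}

// Manually unrolled form: same arithmetic as the loop, written out by hand.
DBL pattern_unrolled(const DBL EPoint[3]) {
    DBL value = 0;
    value += std::sin(EPoint[0]);
    value += std::sin(EPoint[1]);
    value += std::sin(EPoint[2]);
    return Sqr(value / 3.0);
}
```

Calling each function on the same point and comparing the results (within a small tolerance) is a quick way to check that a rewrite has not changed the maths before looking at the generated code.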

-- 
Christopher James Huff <chr### [at] maccom>
POV-Ray TAG e-mail: chr### [at] tagpovrayorg
TAG web site: http://tag.povray.org/


From: Micha Riser
Subject: Re: Pentium 3 optimized binary
Date: 3 Aug 2002 14:27:03
Message: <3d4c2076@news.povray.org>

Christopher James Huff wrote:
> 
> You mean for helping the compiler detect something that can be
> vectorized and doing it automatically? It won't pick out the possibility
> in the one-line version?
> Would it get this version?
> DBL value = 0;
> value += sin(EPoint[0]);
> value += sin(EPoint[1]);
> value += sin(EPoint[2]);
> return Sqr(value/3.0);

No, unfortunately the Intel compiler (the only one that I have which does 
vectorisation) does not recognize this. You explicitly have to use a loop. 
I have rewritten vector.h in this way. But I have no Pentium 4 to 
test it :( 
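
To illustrate what "rewriting with a loop" means for vector operations, here is a rough sketch. It is not the actual vector.h: the names vadd_components, vadd_loop and vdot_loop are made up for this example, and it assumes a vector is a plain array of three doubles.

```cpp
#include <cassert>

typedef double DBL;  // assumption: DBL is double precision

// Component-by-component form: each element is written out by hand.
inline void vadd_components(DBL out[3], const DBL a[3], const DBL b[3]) {
    out[0] = a[0] + b[0];
    out[1] = a[1] + b[1];
    out[2] = a[2] + b[2];
}

// Loop form: the same operation expressed as a counted loop over the array,
// which is the shape a loop-based auto-vectorizer looks for.
inline void vadd_loop(DBL out[3], const DBL a[3], const DBL b[3]) {
    for (int i = 0; i < 3; i++) out[i] = a[i] + b[i];
}

// Dot product in loop form: a reduction over the three components.
inline DBL vdot_loop(const DBL a[3], const DBL b[3]) {
    DBL sum = 0;
    for (int i = 0; i < 3; i++) sum += a[i] * b[i];
    return sum;
}
```

Both addition variants give identical results; whether the loop form is actually vectorized depends on the compiler and its options, so it is worth checking the generated assembly rather than assuming.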

> I'm not surprised the for loop wasn't used in the existing
> version...SIMD stuff wasn't even a factor, so of course the code wasn't
> designed for it, and compilers probably weren't good enough at
> optimizing to get rid of the for() loop.

Today's g++ 3.1 does a good job of loop unrolling. But with my modified 
loop-using 'vector.h' it still produces slightly slower code (OK, 
maybe I also made some mistakes in the conversion...)

- Micha 

-- 
http://objects.povworld.org - the POV-Ray Objects Collection


From: Micha Riser
Subject: Re: Pentium 3 optimized binary
Date: 3 Aug 2002 16:03:19
Message: <3d4c3706@news.povray.org>

Micha Riser wrote:
> 
> loop-using 'vector.h' it still produces slightly slower code (OK,

I have to correct this. The increasing number of background applications 
had influenced the testing. g++ 3.1 is equally fast when using a 'looped' 
vector.h.
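
One common way to make such timings less sensitive to background load is to repeat the run several times and keep the fastest result. The following is only a sketch of that idea, not the method used for the numbers above; best_of is a hypothetical helper and it needs a C++11 compiler for std::chrono.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs `work` `repeats` times and returns the fastest
// wall-clock time in seconds. Taking the minimum filters out runs that were
// slowed down by other processes.
template <typename F>
double best_of(int repeats, F work) {
    double best = -1.0;  // negative means "no measurement yet"
    for (int r = 0; r < repeats; ++r) {
        auto start = std::chrono::steady_clock::now();
        work();
        auto stop = std::chrono::steady_clock::now();
        double secs = std::chrono::duration<double>(stop - start).count();
        if (best < 0.0 || secs < best) best = secs;
    }
    return best;
}
```

Usage would be along the lines of `best_of(5, [&]{ render_test_scene(); })`, where render_test_scene is a placeholder for whatever is being measured.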

- Micha

-- 
http://objects.povworld.org - the POV-Ray Objects Collection


From: Warp
Subject: Re: Pentium 3 optimized binary
Date: 3 Aug 2002 16:13:52
Message: <3d4c3980@news.povray.org>

I only remembered an article written by Thorsten about the issue. I don't
remember the specifics.

-- 
#macro N(D)#if(D>99)cylinder{M()#local D=div(D,104);M().5,2pigment{rgb M()}}
N(D)#end#end#macro M()<mod(D,13)-6mod(div(D,13)8)-3,10>#end blob{
N(11117333955)N(4254934330)N(3900569407)N(7382340)N(3358)N(970)}//  - Warp -


From: Christopher James Huff
Subject: Re: Pentium 3 optimized binary
Date: 3 Aug 2002 17:17:10
Message: <chrishuff-CEBC03.16080203082002@netplex.aussie.org>

In article <3d4c2076@news.povray.org>, Micha Riser <mri### [at] gmxnet> 
wrote:

> No, unfortunately the Intel compiler (the only one that I have which does 
> vectorisation) does not recognize this. You explicitly have to use a loop. 
> I have rewritten vector.h in this way. But I have no Pentium 4 to 
> test it :( 

So it will only help operations on arrays then...seems like a stupid 
limitation. There isn't some compiler directive that could tell it what 
can be optimized?

> Today's g++ 3.1 does a good job of loop unrolling. But with my modified 
> loop-using 'vector.h' it still produces slightly slower code (OK, 
> maybe I also made some mistakes in the conversion...)

Well, today's g++ 3.1 didn't exist 10 years ago. ;-)
With a modern compiler, I wouldn't expect it to be much slower, but how 
could you get any improvement? The SIMD instructions can't handle double 
precision math as far as I know...they would help colors, but not 
vectors.

-- 
Christopher James Huff <chr### [at] maccom>
POV-Ray TAG e-mail: chr### [at] tagpovrayorg
TAG web site: http://tag.povray.org/


From: Micha Riser
Subject: Re: Pentium 3 optimized binary
Date: 3 Aug 2002 17:37:57
Message: <3d4c4d35@news.povray.org>

Christopher James Huff wrote:

> So it will only help operations on arrays then...seems like a stupid
> limitation. There isn't some compiler directive that could tell it what
> can be optimized?

SIMD only works on contiguous blocks of memory... so you're likely to be 
using arrays anyway in the places where it is useful.

> 
> With a modern compiler, I wouldn't expect it to be much slower, but how
> could you get any improvement? The SIMD instructions can't handle double
> precision math as far as I know...they would help colors, but not
> vectors.

Yes, SSE can help for colour calculations. With it you can do operations on 
4 floats simultaneously. But SSE2 (which the Pentium 4 and 64-bit AMD CPUs 
support) will work on double precision! That's why I am looking for someone 
with a Pentium 4...
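
As a rough illustration of the difference, here is a sketch using the compiler intrinsics for SSE (four floats per 128-bit register) and SSE2 (two doubles per register). The function names add4_floats and add2_doubles are invented for this example, and the code falls back to plain loops when SSE2 is not enabled at compile time, so it is a demonstration of the idea rather than tuned code.

```cpp
#include <cassert>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics; also provides the SSE ones
#endif

// Adds four single-precision values at once (SSE: 4 floats per register).
void add4_floats(const float* a, const float* b, float* out) {
#if defined(__SSE2__)
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
#else
    for (int i = 0; i < 4; i++) out[i] = a[i] + b[i];  // scalar fallback
#endif
}

// Adds two double-precision values at once (SSE2: 2 doubles per register).
void add2_doubles(const double* a, const double* b, double* out) {
#if defined(__SSE2__)
    _mm_storeu_pd(out, _mm_add_pd(_mm_loadu_pd(a), _mm_loadu_pd(b)));
#else
    for (int i = 0; i < 2; i++) out[i] = a[i] + b[i];  // scalar fallback
#endif
}
```

Note that a register holds only two doubles, so a three-component double-precision vector does not map onto it as neatly as a four-component float colour does.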

- Micha

-- 
http://objects.povworld.org - the POV-Ray Objects Collection

