POV-Ray : Newsgroups : povray.advanced-users : function optimization question (Message 56 to 65 of 65)
From: Jan Walzer
Subject: Re: function optimization question
Date: 31 Mar 2002 16:57:02
Message: <3ca7862e@news.povray.org>
>   Ahh... This list awakens warm memories from the time I coded in asm for
>   Spectrum and later for DOS.

hehe ... yeah, me too ...

>   It really is starting to look like something I would like to code in
>   the future.
>   Making a pattern with asm-code. Cool.

... just realized ...

this would also mean: creating objects with asm-code, as we can
easily turn these patterns into isos ... *g
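
(something like this, untested - MyFunc just stands in for whatever such a
compiled pattern would compute:)

#declare MyFunc = function { sin(x)*cos(y) + z }

// as a pigment pattern...
plane {
  y, 0
  pigment {
    function { MyFunc(x, y, z) }
    color_map { [0 rgb 0] [1 rgb 1] }
  }
}

// ...and the very same function as an isosurface object
isosurface {
  function { MyFunc(x, y, z) }
  threshold 0
  contained_by { box { <-2,-2,-2>, <2,2,2> } }
  pigment { rgb 1 }
}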

... I'm waiting for the first signatures in ASM-style ... ;)



From: Rune
Subject: Re: function optimization question
Date: 31 Mar 2002 17:21:36
Message: <3ca78bf0@news.povray.org>

> I wonder if this is optimized by the function evaluator:
>
> select(
>   x*Ey+y*Ex-R1*Ex,
>   select(
>     z+3*x,
>     x*Ey+y*Ex-R1*Ex,
>     123
>   ),
>   17
> )
>
> Note that the expression "x*Ey+y*Ex-R1*Ex" appears twice.
> How many times is it calculated?

Thorsten says twice. However, there is a way to avoid that.

See the (untested) code below.

#declare Func1 =
function(x,y,z,Ex,Ey,Value){
  select(
    Value,
    select( z+3*x, Value, 123 ),
    17
  )
}

#declare Func2 =
function(x,y,z,Ex,Ey){Func1(x,y,z,Ex,Ey,x*Ey+y*Ex-R1*Ex)}

I don't know whether, in this case, the additional function call costs more
than the duplicated calculation of the expression did, but it's worth trying
out. In cases where the same expression appears many times it can give a big
speed-up anyway.
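
For what it's worth, here's a minimal (and equally untested) sketch of how
Func2 might then be used, e.g. in an isosurface. R1 is assumed to have been
#declared before Func2 (identifiers from outside a function are bound when
the function is parsed), and the Ex/Ey values below are just placeholders:

#declare Ex = 1.0;  // placeholder values for this sketch
#declare Ey = 2.0;

isosurface {
  // the shared sub-expression is evaluated once per call inside Func2
  // and handed to Func1 as Value
  function { Func2(x, y, z, Ex, Ey) }
  threshold 0
  contained_by { box { <-2,-2,-2>, <2,2,2> } }
  pigment { rgb 1 }
}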

Maybe you already knew this, but then, maybe you didn't, so...

Rune
--
3D images and anims, include files, tutorials and more:
Rune's World:  http://rsj.mobilixnet.dk (updated Feb 16)
POV-Ray Users: http://rsj.mobilixnet.dk/povrayusers/
POV-Ray Ring:  http://webring.povray.co.uk



From: Thorsten Froehlich
Subject: Re: function optimization question
Date: 31 Mar 2002 17:35:40
Message: <3ca78f3c@news.povray.org>
In article <3ca78098@news.povray.org> , "Jan Walzer" <jan### [at] lzernet> wrote:

> I once had a chance to speed up Bresenham by a factor of 3 with a similar
> method. The thing is that you can decide at runtime which opcodes get
> executed, depending on some runtime values. As there are (were) different
> opcodes for a near jump (target address within the same code segment) and
> a far jump (jump to anywhere in memory), there was no easy way to work
> only with pointers and just change those...

Not only is this a terrible way to code anything, it also kills performance
on all modern processors.  Well, all but x86 processors of course, whose
designers actually have to interlock the data and instruction caches (or use
a common one) to support such nonsense.  Not that it won't cripple
performance for them as well these days, but on a good old Pentium it will
still be fast...

On any reasonable architecture that does not carry 25 years of eight-bit
legacy backward compatibility and other junk, self-modifying code will
fortunately break! :-)

    Thorsten

____________________________________________________
Thorsten Froehlich, Duisburg, Germany
e-mail: tho### [at] trfde

Visit POV-Ray on the web: http://mac.povray.org



From: Jan Walzer
Subject: Re: function optimization question
Date: 31 Mar 2002 17:50:57
Message: <3ca792d1@news.povray.org>
"Thorsten Froehlich" <tho### [at] trfde> wrote:
> > I once had a chance to speed up Bresenham by a factor of 3 with a similar
> > method. The thing is that you can decide at runtime which opcodes get
> > executed, depending on some runtime values. As there are (were) different
> > opcodes for a near jump (target address within the same code segment) and
> > a far jump (jump to anywhere in memory), there was no easy way to work
> > only with pointers and just change those...
>
> Not only is this a terrible way to code anything, it also kills performance
[...]

> On any reasonable architecture that does not carry 25 years of eight-bit
> legacy backward compatibility and other junk, self-modifying code will
> fortunately break! :-)

Hey ... for my studies, I had the chance to "port" this x86 code to
MIPS asm ... It was perfectly possible, meaning it worked ...
... I couldn't measure any speedup, as it ran on a software MIPS simulator
called "SPIM" ... whether it was useful is another matter, but at least it
was fun (for me, at least) ...

... but hey: we all know you prefer the Mac ;) ...



From: Warp
Subject: Re: function optimization question
Date: 31 Mar 2002 21:52:55
Message: <3ca7cb87@news.povray.org>
I think that the problem with self-modifying code nowadays is that, if
supported by the processor at all (not all processors support it; on those
the code being executed is read-only), it effectively kills most
optimizations performed by the processor.
  If the code being executed is modified, it invalidates the pipelines and
perhaps even the code in the cache, plus all the branch prediction and so
on. This has the same effect as loading the code from RAM for the first
time. If it happens in every iteration of a loop, it's extremely slow.
  If a very tight loop modifies itself, the result can be code that is
several tens of times slower than equivalent code with no self-modification.

  Of course self-modification is ok if you first modify the code to be
executed, and then this code is executed for a long period of time.
  In fact, the OS does exactly this when it loads an exe file into memory:
it patches the long jump addresses to match the place in memory where the
file was loaded (with the help of the relocation table found in the header
of the exe file).

-- 
#macro M(A,N,D,L)plane{-z,-9pigment{mandel L*9translate N color_map{[0rgb x]
[1rgb 9]}scale<D,D*3D>*1e3}rotate y*A*8}#end M(-3<1.206434.28623>70,7)M(
-1<.7438.1795>1,20)M(1<.77595.13699>30,20)M(3<.75923.07145>80,99)// - Warp -



From: Mark Wagner
Subject: Re: function optimization question
Date: 1 Apr 2002 00:14:06
Message: <3ca7ec9e@news.povray.org>
Jan Walzer wrote in message <3ca7838e@news.povray.org>...
>Ahhh ... nice ...
>
>That's what I wanted to read ...
>
>Seems very complete and useful ...
>
>Will this get included in the official docs? ...
>I would much appreciate this ...

Did you notice the disclaimer at the end of that message?

> OF COURSE THIS IS ALL SUBJECT TO COMPLETE CHANGE AND REDESIGN IN CURRENT AND
> FUTURE VERSIONS OF POV-RAY!

It won't make it into the official documentation.

--
Mark



From: Thorsten Froehlich
Subject: Re: function optimization question
Date: 1 Apr 2002 04:26:02
Message: <3ca827aa@news.povray.org>
In article <3ca7cb87@news.povray.org> , Warp <war### [at] tagpovrayorg>  wrote:

>   If the code being executed is modified, it invalidates the pipelines and
> perhaps even the code in the cache

There is a far more severe problem here:  All processor architectures
developed in the past 20 years (mostly RISC or VLIW ones of course) have
independent data and instruction caches as well as independent data and
instruction MMUs.  Consequently the modified *data* is in the data cache and
the processor has no idea it is supposed to be *code*.  In fact it cannot
know what it is.  The only reliable means the programmer has to be sure it
will be executed is to *flush* the data and instruction cache.

Intel, on the other hand, maintained a kind of internal common bus at least up
to the Pentium Pro.  That is, instruction and data cache share a common
interface to the actual bus interface.  On other architectures the bus
interface provides two separate busses - one to the data and one to the
instruction cache.  What this effectively implies is that until Intel somehow
changed the design (I have no idea how current P4s detect self-modifying code,
if they can do so efficiently at all!), the speed and efficiency of the
instruction cache was reduced.

>   Of course self-modification is ok if you first modify the code to be
> executed, and then this code is executed for a long period of time.

And it will flush the caches accordingly.  The same needs to be done for
just-in-time compilers.

    Thorsten

____________________________________________________
Thorsten Froehlich, Duisburg, Germany
e-mail: tho### [at] trfde

Visit POV-Ray on the web: http://mac.povray.org



From: Thorsten Froehlich
Subject: Re: function optimization question
Date: 1 Apr 2002 04:27:54
Message: <3ca8281a@news.povray.org>
In article <3ca792d1@news.povray.org> , "Jan Walzer" <jan### [at] lzernet> wrote:

> Hey ... for my studies, I had the chance to "port" this x86 code to
> MIPS asm ... It was perfectly possible, meaning it worked ...
> ... I couldn't measure any speedup, as it ran on a software MIPS simulator
> called "SPIM" ... whether it was useful is another matter, but at least it
> was fun (for me, at least) ...

SPIM is indeed very neat, except that its programmers made too big a fuss
about byte order and got the memory view all wrong :-(

    Thorsten

____________________________________________________
Thorsten Froehlich, Duisburg, Germany
e-mail: tho### [at] trfde

Visit POV-Ray on the web: http://mac.povray.org



From: ABX
Subject: Re: function optimization question
Date: 2 Apr 2002 01:01:23
Message: <k6iiaucdrm397ifjkt0d0or0145fqh0p0n@4ax.com>
On Mon, 1 Apr 2002 00:22:09 +0200, "Rune" <run### [at] mobilixnetdk> wrote:
> Maybe you already knew this, but then, maybe you didn't, so...

Yes, I knew that, as I said in the rest of the thread.

ABX



From: ABX
Subject: Re: function optimization question
Date: 2 Apr 2002 12:05:41
Message: <pcojau0tsr5hk9nprsrf38s3t5ae1ht9ro@4ax.com>

Earlier I wrote:
> I wonder if this is optimized by the function evaluator:

I want to keep all issues about such optimizations in one thread if possible,
so I am asking my next question here as well:

Let's consider code such as this:

#local F1=function{...};
#local F2=function{F1(x,y,z)};
#local F3=function{F2(x,y,z)};
#local F4=function{F3(x,y,z)};

After parsing, do F1, F2, F3 and F4 all point to the same function, or is
time wasted calling their ancestors? I am mainly asking about the case of
macro and/or array creation, for example:

#local F1=function{x};
#local F2=function{y};

#macro Operation( Function )
  /* do something with Function */
#end

#macro Test(Array)
  #local Size=dimension_size(Array,1);
  #local C=0;
  #while (C<Size)
    Operation( function{ Array[C](x,y,z) } )
    #local C=C+1;
  #end
#end

Test( array[2]{ function{F1(x,y,z)} , function{F2(x,y,z)} } )

ABX

