POV-Ray : Newsgroups : povray.unofficial.patches : PovRay faster : Re: PovRay faster Server Time
2 Sep 2024 00:15:26 EDT (-0400)
  Re: PovRay faster  
From: Warp
Date: 30 Jan 2001 11:32:06
Message: <3a76ec86@news.povray.org>
Daniel Jungmann <DSJ### [at] gmxnet> wrote:
: The plain C code is not very fast.

  There are three main stages of optimization when optimizing a program:

  1. Optimization of algorithms and data containers. Choosing one algorithm
or data container instead of another can have an enormous effect in
performance.
  I have a very good example of this from my personal experience: The first
version of my triangle mesh compressor used a list data container to store
all the vertex vectors of the triangles. When a new vector was added to the
list, the program searched through the whole list to see if the same vector
was already in there.
  This was very slow. It took more than 2 minutes to read a triangle mesh
with about 40000 smooth triangles.
  Then I decided to replace the list with a binary tree. After that the
program was able to read the same mesh in less than 10 seconds.

  2. Optimization of the code.
  It's quite usual to hear that "you don't need to optimize the code, the
compiler will make it for you much better than you". This is a lie.
  Yes, the compiler can make very good optimizations, but it isn't a genius.
It can't optimize many things that a human can optimize.
  For example this:

    unsigned res = 0;
    for(unsigned i = 1; i <= 10000; ++i) res += i;
    return res;

can be optimized to:

    return 50005000;

  However, it's quite rare that any compiler can do this. I have tested this
with two compilers (gcc and Sun's own cc) and they weren't able to optimize
that loop away (not even with -funroll-loops).
  It's quite clear that the optimized version is extremely much faster than
the unoptimized version.
  There are virtually endless amount of cases where a compiler is unable
to optimize something that can be optimized by hand. Another good example,
very similar to the above, is the following:
  This:

unsigned Sum(unsigned minval, unsigned maxval)
{
    unsigned res = 0;
    for(unsigned i = minval; i <= maxval; ++i) res += i;
    return res;
}

can be optimized to this:

unsigned Sum(unsigned minval, unsigned maxval)
{
    return (maxval-minval+1)*(minval+maxval)/2;
}

  No compiler can do that.


  3. Optimization of the machine code, usually with inline assembler.
This is the last resort when everything else fails and speed in a very
tight loop is critical and there's a cpu-dependant optimization that
the compiler is unable to generate.
  This is usually not a good way to go since it not only makes the
program cpu-dependant (and usually compiler-dependant), but also it
makes the code less maintainable.
  Sometimes you just have to do it (for example the Linux source code
has inline assembler in several places because you can't make everything
with pure C).
  You should only use this kind of optimization when the two first
optimizations have been done and they don't help.

-- 
char*i="b[7FK@`3NB6>B:b3O6>:B:b3O6><`3:;8:6f733:>::b?7B>:>^B>C73;S1";
main(_,c,m){for(m=32;c=*i++-49;c&m?puts(""):m)for(_=(
c/4)&7;putchar(m),_--?m:(_=(1<<(c&3))-1,(m^=3)&3););}    /*- Warp -*/


Post a reply to this message

Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.