Newsgroup: povray.binaries.programming
  Re: matrices.c  
From: Jérôme Grimbert
Date: 28 Nov 2001 04:00:35
Message: <3C04A7B5.304110D@atosorigin.com>
Tor Olav Kristensen wrote:
> 

> >
> > Tor Olav Kristensen wrote:
> > >
> > > (from megasrc07.zip)
> > >
> > > I have just had a little peek into that c-file
> > > and found that it might be possible to do some
> > > simplifications of the code within it.
> > >
> > [SNIP]
> > > This introduces one extra local variable; omc,
> > > but I hope there will be a little speed gain anyway.
> >
> > Do you have any idea WHEN Compute_Axis_Rotation_Transform is called?
> > And how many times?
> > My current guess (I have not checked yet,
> > I am only working from memory of the code) is:
> >  - At parse time only.
> >  - once per invoking directive.
> 
> Yes, I checked where it was called from prior to posting,
> - and found that it was only called from Express.c in
> the official version (v3.1g).
> 
> But I found that in MegaPOV it was also called from photons.c;
> i.e. from ShootPhotonsAtObject() in relation with the jitter
> option for area lights.
> 

I did not know about the MegaPOV sources.

> > If you were trying to get some significant speed improvement, I'm afraid
> > you will be disappointed: it's in the wrong location, at the wrong time.
> 
> Hmmm...
> I don't understand how I can be looking in the wrong location
> at the wrong time.

Wrong location and time:
   trying to speed up the parser (the location of the optimisation),
   for a very rarely called function (how many times).



> 
> I was only looking at a math-part of some POV-Ray code that I
> found interesting and told others about something that seemed
> to be a simplification to me.
> 
> Would you people prefer that I shut up about my findings, just
> because POV developers are busy with v3.5 coding ?

I cannot speak for the TAG or the POV-Team,
but if you expect your findings to be incorporated in 3.5,
I think you will be disappointed.
Most people now want a stable and final 3.5, with sources;
then the everybody-patching process can restart
and the super/mega/hyper/pov family will show up again.
Then you could make your own version of povray and show us the speed-up
(which may be real for photons, I do not know).

Moreover, I do not have the code of 3.5, do you?


> 
> My opinion is that knowledge about any potential speed
> improvements and simplifications to POV-Ray might be valuable
> in the future (be it mathematical, logical or algorithmic).
> 

I think there is an entry for you in the FAQ...

> I think that if one posts such suggestions here, there is a
> slight chance that they will be read and hopefully remembered when
> the time to rewrite POV-Ray comes.

If you really want to search for optimisations in 3.1, I have one
to keep you busy: in the pigment evaluation, due to a limited number
of arguments, one heavy computation is done twice (once at top level,
and a second time in a called function).
Find that, and then solve it cleanly.

> 
> (Yes, I see now that I probably should have posted to
> povray.programming instead.)
> 
> > P.S.: If I am wrong, do not hesitate: Open fire... I will apologise later.
> >
> 
> The fact that v3.1g does not use Compute_Axis_Rotation_Transform()
> to do time-consuming stuff, while MegaPOV seems to, suggests
> that it might be wise to code such small "general" routines with
> care.

If you really want to be smarter than the compiler, you might as well
replace the 'transform->matrix' indirection with something more direct.
It will be useless with most modern CPUs, and it will make the code more
obfuscated.
You may also want to assign V1[...] to some variables, because they are
used a lot and some believe that array indexing takes time to resolve.
That would also be useless.

I agree with you nevertheless that care should be taken to write the
code correctly and avoid unnecessary operations, but the most important
thing should be to COMMENT the tricky code at high, medium and low level.
(High level, the pure math:
     compute a rotation matrix around axis V at the origin...
 Medium level:
     the formula of the matrix is ....
 Low level:
     (1 - cosx) is used everywhere, compute it only once.
)


> 
> > P.S.2: A smart compiler may already have kept
> > the (1 - cosx) result in a register, so forcing the use of a named variable
> > might even be counterproductive (because it must be written to a memory
> > location, for nothing).
> 
> Yes, a colleague of mine told me that too this morning,
> 
> BUT:
> Did you look thoroughly at this part of the code I suggested ?
> 
Not really, I confess.

>   cosx = cos(angle);
>   sinx = sin(angle);
>   omc = 1.0 - cosx;
> 
>   transform->matrix[0][0] = V1[X] * V1[X] * omc + cosx;
>   transform->matrix[0][1] = V1[X] * V1[Y] * omc + V1[Z] * sinx;
>   transform->matrix[0][2] = V1[X] * V1[Z] * omc - V1[Y] * sinx;
> 
>   transform->matrix[1][0] = V1[X] * V1[Y] * omc - V1[Z] * sinx;
>   transform->matrix[1][1] = V1[Y] * V1[Y] * omc + cosx;
>   transform->matrix[1][2] = V1[Y] * V1[Z] * omc + V1[X] * sinx;
> 
>   transform->matrix[2][0] = V1[X] * V1[Z] * omc + V1[Y] * sinx;
>   transform->matrix[2][1] = V1[Y] * V1[Z] * omc - V1[X] * sinx;
>   transform->matrix[2][2] = V1[Z] * V1[Z] * omc + cosx;
> 
> If so, I hope that you noticed that the (1.0 - cosx) expression
> here appears in all of the transform lines, while it did not
> appear in 3 of those lines originally.

The original lines were easier for math people, because they were
simply the traditional matrix.
Your lines are more difficult to recognize as a rotation (at least for me).
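
The original lines are not quoted in this message. Assuming the three diagonal entries had the common textbook form a*a + cos*(1 - a*a), the two spellings are algebraically the same, since a*a + c - c*a*a = a*a*(1 - c) + c. A small C sketch (both function names are hypothetical) makes that easy to check numerically:

```c
#include <math.h>

/* Assumed textbook spelling of a diagonal entry: a*a + cos*(1 - a*a). */
double diag_textbook(double a, double cosx)
{
  return a * a + cosx * (1.0 - a * a);
}

/* Factored spelling from the quoted code: a*a*(1 - cos) + cos. */
double diag_factored(double a, double cosx)
{
  double omc = 1.0 - cosx;
  return a * a * omc + cosx;
}
```

For any axis component a and any angle, the two functions should agree up to floating-point rounding.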

> 
> I doubt that all smart C-compilers today can analyse algebraic
> expressions and then rearrange them in order to collect items
> (that are common between several C-expressions) and place them
> outside parenthesis in all the relevant C-expressions.

They might do it with a two-pass optimiser: the first pass counts the uses
 of each sub-expression, the second pass assigns a register to the most
 useful sub-results, according to the registers available in the window.

But they do not have to.
All they usually need is to keep a symbolic map of the register contents,
so that if (1-cosx) is still in a register, they use it directly.
 
It's not really useful on the Intel family, due to the lack of
general-purpose registers in x86 mode, but on most RISC CPUs it makes the
code really fly.
(It's also great on the old M680xx family, which is a CISC CPU.)

If the optimiser is allowed to perform out-of-order execution, it can be
even more surprising. (You may want to try storing V1[X]*omc, V1[Y]*omc and
V1[Z]*omc, because these three sub-expressions are each used three times in
your code.)

Anyway, rearranging expressions is also the very first step of
common optimisation, but usually compilers only try to factor out constant
expressions.



If povray did not have to be portable across various CPUs, you could go even
further, using the exact timing of each CPU operation (possibly with
out-of-order optimisation as well), and hand-code in assembly the PERFECT
sequence of operations.

Alas, even within the Pentium family, the perfect sequence for the P1 is not
the perfect one for the P2/P3, nor for the P4. (And you have not even looked
at AMD...)

