file ---matrices.h---
/***************************************************************************
* Global preprocessor defines
***************************************************************************/
add at line 34: #define Assign_Matrix(a,b) memcpy(a,b,sizeof(MATRIX))
add at line 35: #define MZero(m) Assign_Matrix( m, MATRIX__MZero )
add at line 36: #define MIdentity(m) Assign_Matrix( m, MATRIX__MIdentity )
/***************************************************************************
* Global variables
***************************************************************************/
add at line 48: extern MATRIX MATRIX__MIdentity;
add at line 49: extern MATRIX MATRIX__MZero;
delete line 55, which originally contained: void MZero (MATRIX result);
delete line 56, which originally contained: void MIdentity (MATRIX result);
file ---matrices.c---
/***************************************************************************
Local variables
***************************************************************************/
at line 46 add:
MATRIX MATRIX__MIdentity =
{
{ 1.0, 0.0, 0.0, 0.0 },
{ 0.0, 1.0, 0.0, 0.0 },
{ 0.0, 0.0, 1.0, 0.0 },
{ 0.0, 0.0, 0.0, 1.0 }
};
MATRIX MATRIX__MZero =
{
{ 0.0, 0.0, 0.0, 0.0 },
{ 0.0, 0.0, 0.0, 0.0 },
{ 0.0, 0.0, 0.0, 0.0 },
{ 0.0, 0.0, 0.0, 0.0 }
};
delete lines 55 to 146
/*
void MZero (MATRIX result);
void MIdentity (MATRIX result);
*/
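Put together, the patch amounts to roughly the following. This is only a sketch for trying the idea outside the POV-Ray tree: the DBL and MATRIX typedefs are assumptions about the POV-Ray headers, demo_trace_after_identity is a made-up helper, and the macro arguments are parenthesized here, which the patch itself does not do.

```c
#include <string.h>   /* memcpy */

typedef double DBL;           /* assumption: DBL is a double */
typedef DBL MATRIX[4][4];     /* assumption: MATRIX is a 4x4 array of DBL */

/* Constant templates, as in the matrices.c part of the patch. */
MATRIX MATRIX__MIdentity =
{
  { 1.0, 0.0, 0.0, 0.0 },
  { 0.0, 1.0, 0.0, 0.0 },
  { 0.0, 0.0, 1.0, 0.0 },
  { 0.0, 0.0, 0.0, 1.0 }
};

MATRIX MATRIX__MZero =
{
  { 0.0, 0.0, 0.0, 0.0 },
  { 0.0, 0.0, 0.0, 0.0 },
  { 0.0, 0.0, 0.0, 0.0 },
  { 0.0, 0.0, 0.0, 0.0 }
};

/* Macros from the matrices.h part of the patch (arguments parenthesized). */
#define Assign_Matrix(a,b) memcpy((a), (b), sizeof(MATRIX))
#define MZero(m)           Assign_Matrix((m), MATRIX__MZero)
#define MIdentity(m)       Assign_Matrix((m), MATRIX__MIdentity)

/* Hypothetical helper: zero a matrix, overwrite it with the identity,
   and return the sum of its diagonal. */
DBL demo_trace_after_identity(void)
{
  MATRIX m;

  MZero(m);
  MIdentity(m);

  return m[0][0] + m[1][1] + m[2][2] + m[3][3];
}
```

Because MATRIX is an array type, m decays to a pointer inside the macro call, so memcpy overwrites the caller's matrix in place, just as the removed functions did.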
**************************
dmi### [at] pttyu
http://members.xoom.com/dmilos/
**************************
On Sat, 1 Apr 2000 17:58:46 +0200, Dejan D. M. Milosavljevic wrote:
>(some code that speeds up matrices very slightly on some architectures)
It's a widely held belief in computer circles that one should first look
to make things faster by fixing the algorithms, then optimize those pieces
of code that really, really need it by making AND PROFILING incremental
changes like this one. POV still has lots of places where the algorithms
could be sped up; spending time on tiny speedups like this doesn't really
make sense.
That's just my opinion, of course.
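To act on the PROFILING part, a small timing harness is usually enough to check a change like this before committing to it. A minimal sketch, assuming MATRIX is a 4x4 array of doubles; zero_template, zero_by_copy, zero_by_loop and time_calls are illustrative names, not POV-Ray functions.

```c
#include <string.h>   /* memcpy */
#include <time.h>     /* clock, CLOCKS_PER_SEC */

typedef double DBL;           /* assumption: DBL is a double */
typedef DBL MATRIX[4][4];     /* assumption: MATRIX is a 4x4 array of DBL */

/* All-zero template used by the copying variant. */
static const MATRIX zero_template = { { 0.0 } };

/* Variant A: copy a constant matrix, as the patch does. */
void zero_by_copy(MATRIX m)
{
  memcpy(m, zero_template, sizeof(MATRIX));
}

/* Variant B: store zeros in a nested loop. */
void zero_by_loop(MATRIX m)
{
  int i, j;

  for (i = 0; i < 4; i++)
    for (j = 0; j < 4; j++)
      m[i][j] = 0.0;
}

/* Call fn `reps` times and return the CPU time used, in seconds. */
double time_calls(void (*fn)(MATRIX), long reps)
{
  volatile DBL sink = 0.0;   /* volatile so the reads below are kept */
  MATRIX m;
  clock_t start = clock();
  long r;

  for (r = 0; r < reps; r++)
  {
    fn(m);
    sink += m[r & 3][r & 3]; /* read one element after every call */
  }
  (void)sink;

  return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_calls(zero_by_copy, 10000000L) and time_calls(zero_by_loop, 10000000L) from a test program gives two numbers to compare. The result depends on compiler, optimization flags and cache behaviour, so a tight loop like this is only a first hint; profiling a real render is what counts.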
In article <38e6f137@news.povray.org> , "Dejan D. M. Milosavljevic"
<dmi### [at] pttyu> wrote:
> add at line 34: #define Assign_Matrix(a,b) memcpy(a,b,sizeof(MATRIX))
> add at line 35: #define MZero(m) Assign_Matrix( m, MATRIX__MZero )
> add at line 36: #define MIdentity(m) Assign_Matrix( m, MATRIX__MIdentity )
At least for MZero, copying the data will be significantly slower than the
current solution. It requires the processor to fetch data from one memory
location and copy it to another, while the current code only fills an area
of memory with a fixed value.
Regarding MIdentity, the memory copy might be faster, depending on the
branch penalty of the processor pipeline in the current implementation
compared to the memory access penalty.
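For readers without the source at hand, the kind of code this comparison is about is a nested loop with a test on every element. The following is a guess at its shape, not a copy of the POV-Ray source; the DBL and MATRIX typedefs and the name MIdentity_branchy are assumptions made for the sketch.

```c
typedef double DBL;           /* assumption: DBL is a double */
typedef DBL MATRIX[4][4];     /* assumption: MATRIX is a 4x4 array of DBL */

/* Hypothetical nested-loop identity with one branch per element. */
void MIdentity_branchy(MATRIX result)
{
  int i, j;

  for (i = 0; i < 4; i++)
  {
    for (j = 0; j < 4; j++)
    {
      if (i == j)
      {
        result[i][j] = 1.0;   /* diagonal element */
      }
      else
      {
        result[i][j] = 0.0;   /* off-diagonal element */
      }
    }
  }
}
```

The test inside the inner loop runs 16 times per call, which is where a branch penalty would come from if the compiler does not remove it.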
However, it should be faster still to replace the MIdentity loop with the
MZero code and then set the diagonal "one"s after the loop.
Conclusion: Sometimes what seems less work can turn out to be much more
work.
Thorsten
Or in code this could look like:
void MZero (MATRIX result)
{
  register int i;
  DBL *flat = &result[0][0]; // view the 4x4 array as 16 contiguous values

  for (i = 0 ; i < 16 ; i++)
  {
    flat[i] = 0.0;
  }
}

void MIdentity (MATRIX result)
{
  register int i;
  DBL *flat = &result[0][0]; // same flat view as in MZero

  for (i = 0 ; i < 16 ; i++)
  {
    flat[i] = 0.0;
  }
  for (i = 0 ; i < 4 ; i++)
  {
    result[i][i] = 1.0; // set the diagonal after zeroing
  }
}
Of course, manually unrolling would even account for bad compilers, but the
code above is already unreadable enough. And some compilers will even be
able to unroll the current nested-loop code, at least the one in MZero.
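For comparison, a manually unrolled version might look like the sketch below. It assumes MATRIX is a 4x4 array of DBL and DBL is a double; the name MIdentity_unrolled is made up for this example.

```c
typedef double DBL;           /* assumption: DBL is a double */
typedef DBL MATRIX[4][4];     /* assumption: MATRIX is a 4x4 array of DBL */

/* Hypothetical fully unrolled identity: 16 plain stores, no loop, no branch. */
void MIdentity_unrolled(MATRIX result)
{
  result[0][0] = 1.0; result[0][1] = 0.0; result[0][2] = 0.0; result[0][3] = 0.0;
  result[1][0] = 0.0; result[1][1] = 1.0; result[1][2] = 0.0; result[1][3] = 0.0;
  result[2][0] = 0.0; result[2][1] = 0.0; result[2][2] = 1.0; result[2][3] = 0.0;
  result[3][0] = 0.0; result[3][1] = 0.0; result[3][2] = 0.0; result[3][3] = 1.0;
}
```

Whether this beats the loop version depends on the compiler; an optimizer that unrolls small fixed-count loops may well produce the same stores from either form.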
Thorsten
____________________________________________________
Thorsten Froehlich, Duisburg, Germany
e-mail: tho### [at] trfde
Visit POV-Ray on the web: http://mac.povray.org