POV-Ray: Newsgroups: povray.off-topic: Optimizations...: Re: Optimizations...

POV-Ray : Newsgroups : povray.off-topic : Optimizations... : Re: Optimizations...		Server Time 5 Jul 2025 15:22:37 EDT (-0400)

From: Invisible
Date: 23 Jul 2009 07:30:14
Message: <4a6849c6$1@news.povray.org>

>> Of course, matrix multiplication is a trivially simple task. (That's 
>> why they chose it.) But now consider building a huge, highly complex 
>> program that does a crapload of numerical work, and trying to figure 
>> out where the hell the performance is going, and how to fix it... Good 
>> luck with that.
> 
> Even in huge programs the bottlenecks are often only due to a tiny 
> proportion of code running a vast number of times (for instance, I am 
> pretty sure almost all of the CPU time for my CFD solver is spent 
> inverting matricies).  Compilers are already quite good at optimising 
> such things, but it appears they can still do a lot more.

I suspect few compilers are going to detect that you should transpose 
one matrix while performing matrix multiplication just by inspecting the 
souce code for the matric multiplication function.



Interestingly, in Haskell you could do something like

   matrix_multiply x y = core_multiply x (transpose y)

and then do something like

   {-# RULES "transpose^2" transpose . transpose = id #-}

What this does is everywhere the compiler sees a matrix being transposed 
twice, it automatically removes the transpositions. So if you multiply 
two matricies and one of them is *already* transposed, matrix_multiply 
won't bother transposing it again. It just gets passed to core_multiply 
unchanged.

You could store certain matricies permanently transposed, and then 
transpose them back just before passing them to matrix_multiple. The 
compiler will optimise out the transpose operation automatically. And 
all this without needing to modify the compiler to add a special 
optimisation pass; it's just some plain ordinary Haskell source code.

>> Would you like me to use lists, immutable arrays, mutable arrays, 
>> unboxed arrays or parallel arrays? Should I use 2D arrays or 1D 
>> arrays? Should I allow integers or doubles, or stick to only integers?
> 
> You could do a similar exercise to what the guy did.  Start off with a 
> really "obvious" way of doing it, then try out different things and time 
> each one.  Make a nice list like he did showing how much faster you can 
> make it than the "initial" version.

Indeed, this is my plan. I'm just trying to get the naive list-based 
version to work. (Currently it gives the wrong answers, which obviously 
isn't a lot of use...)

Post a reply to this message