>> Indeed, this is my plan. I'm just trying to get the naive list-based
>> version to work. (Currently it gives the wrong answers, which
>> obviously isn't a lot of use...)
>
> I could have written a working C version by now :-D
I have a working Haskell version by now. (Apparently I mixed up the rows
and columns, hence the bogus answers.) It's really not hard. I've
actually spent more time building a test harness than coding the matrix
multiplication so far.
In case you care, the code is:
import Data.List (transpose)

type Matrix x = [[x]]

-- Each result entry is the dot product of a row of m1 with a column of m2.
multiply :: Num x => Matrix x -> Matrix x -> Matrix x
multiply m1 m2 = [ [ sum $ zipWith (*) row col | col <- transpose m2 ]
                 | row <- m1 ]
That's the whole program. (Minus the part where it creates some
matrices, multiplies them, and prints out the result.) Complex, eh?
Now, in terms of speed... Well, it takes about 62 seconds to multiply
two 512x512 matrices. But hey, this is only the simplest, dumbest
version I could come up with. Let's see what happens (if anything) when
I start applying a few optimisations...
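(One obvious candidate, sketched below but not measured here: compute the
transpose of m2 once per multiplication rather than once per row. multiply'
is just an illustrative name, reusing the Matrix type and import above; it
may or may not matter once GHC optimises, but it makes the sharing explicit.)
-- Hoist the transpose out of the row loop so every row shares it.
multiply' :: Num x => Matrix x -> Matrix x -> Matrix x
multiply' m1 m2 = [ [ sum $ zipWith (*) row col | col <- cols ] | row <- m1 ]
  where
    cols = transpose m2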
> Also I wonder how his results compare to something like MatLab that
> supposedly has very fast matrix handling functions built in...
Well, matrix multiplication is a built-in MatLab primitive. I would
imagine it's comparable to BLAS (or possibly parallel BLAS)...
Invisible wrote:
> It seems we have reached an age where the complexity class of an
> algorithm is almost irrelevant, and the only important property is
> the specific low-level details of exactly how the algorithm has been
> implemented on a specific hardware platform.
>
> This is the antithesis of high-level programming.
Actually the complexity class of the algorithm remains exactly as
relevant, because it just says how the computation time asymptotically
evolves as the data gets bigger. The optimizations described here don't
change that, I believe; just the multiplicative constant in front, which
in practice is of course important as well ;-)
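(For a sense of scale, purely as an illustration: however the loops are
ordered or the data is blocked for cache, an n-by-n multiply still performs
the same Theta(n^3) multiply-adds; opCount is just an illustrative name.)
-- The cache-friendly versions do exactly the same amount of arithmetic;
-- they just spend far less time per operation stalled on memory.
opCount :: Integer
opCount = 512 ^ 3   -- 134,217,728 multiply-adds for a 512x512 multiply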
--
Vincent
On 07/23/09 04:19, Invisible wrote:
> Just for giggles, I could implement this in Haskell and see how many
> million times slower it is. But then, since it isn't running on the same
> hardware, it would be kind of a moot comparison.
Well, they _do_ provide the source...
--
No, Taco Bell is NOT the Mexican Phone Company!
On 07/23/09 05:42, Invisible wrote:
> This is the antithesis of high-level programming.
As is all numerical computation.
--
No, Taco Bell is NOT the Mexican Phone Company!
Vincent Le Chevalier wrote:
> Actually the complexity class of the algorithm remains exactly as
> relevant because it just says how the computation time asymptotically
> evolves as the data gets bigger. The optimization described here do not
> change that I believe. Just the multiplicative constant in front, which
> in practice is of course important as well ;-)
I think that BLAS probably uses a slightly more asymptotically efficient
multiplication algorithm for matrices this size, but it's interesting
that at best it only accounts for a small fraction of the speedup.
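(To put a rough number on that: assuming the asymptotically better algorithm
in question is Strassen's, with exponent logBase 2 7, about 2.807, the
best-case win at this size is only a few times; strassenFactor is just an
illustrative name.)
-- Best-case asymptotic advantage over the naive O(n^3) algorithm at n = 512.
strassenFactor :: Double
strassenFactor = 512 ** 3 / 512 ** logBase 2 7   -- roughly 3.3x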
Invisible wrote:
> Oh, sure, I understand *why* this happens. But it is very, very
> disturbing to realise that just doing the same instructions in a
> different order, or storing the same data in a different order, can have
> such a vast effect on performance. It seems we have reached an age where
> the complexity class of an algorithm is almost irrelevant, and the only
> important property is the specific low-level details of exactly how the
> algorithm has been implemented on a specific hardware platform.
>
> This is the antithesis of high-level programming.
>
There was an example of this I saw recently with multi-core programming.
If you modified the shared memory "directly" from multiple threads in the
program, then all the other cores had to wait for the "current" core to
finish handling the data and unlock the memory before they could operate.
The result was that every single core spent its time wobbling between
waiting and active, and your speed would drop *under* that of a single
core working on the same problem.
The fix was simple: make your function take the value (not a pointer),
handle it internally as a temporary variable, then return it, so the
calling function only sets the shared value at the end. The memory stays
free while the function is executing, and only "locks" during the brief
moment the final value is set. Result - the expected increase in speed as
you add cores.
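(Not the example Patrick describes, but a rough Haskell analogue of the same
pattern, with made-up names workShared / workLocal: one version takes the
shared lock on every step, the other does all its work on a private value and
touches shared state exactly once. Build with -threaded and run with +RTS -N4
to compare.)
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Monad (forM_, replicateM)

-- Contended version: every single step takes the lock on the shared value.
workShared :: MVar Int -> Int -> IO ()
workShared acc n = forM_ [1 .. n] $ \i -> modifyMVar_ acc (pure . (+ i))

-- Local version: do the work privately, touch the shared state only once.
workLocal :: MVar Int -> Int -> IO ()
workLocal acc n = modifyMVar_ acc (pure . (+ sum [1 .. n]))

main :: IO ()
main = do
  acc  <- newMVar 0
  done <- replicateM 4 newEmptyMVar
  -- Swap in workShared here to see the contended behaviour instead.
  forM_ done $ \d -> forkIO (workLocal acc 1000000 >> putMVar d ())
  mapM_ takeMVar done
  print =<< readMVar acc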
Yeah, once they got to multi-core, the idea that "the language doesn't
need to know anything about the hardware" was lost, to some extent.
But then you also get stupid things like GCC optimizing out NULL
tests, which actually caused two Linux kernel versions to suffer a
potential, Windows-style, "root-level access no matter what account you
are on" problem. The compiler needs to be smart enough to "know" you
are compiling for multiple cores, and warn you about such things (or avoid
optimizing in a way that causes them). But high-level languages were
all developed "before" this was an issue, so it's not surprising it's
shown up now.
--
void main () {
if version = "Vista" {
call slow_by_half();
call DRM_everything();
}
call functional_code();
}
else
call crash_windows();
}
Get 3D Models, 3D Content, and 3D Software at DAZ3D!
http://www.daz3d.com/index.php?refid=16130551
Patrick Elliott wrote:
> The compiler needs to be smart enough to "know" you
> are compiling for multi-cores, and warn you about such things (or avoid
> optimizing in a way that causes them).
More precisely, the compiler needs to "know" what your processes are,
instead of leaving it as an opaque library outside the semantics of the
language completely. See the ongoing thread elsewhere. ;-)
--
Darren New, San Diego CA, USA (PST)
"We'd like you to back-port all the changes in 2.0
back to version 1.0."
"We've done that already. We call it 2.0."
Invisible wrote:
> Sure. Because working out what order an arbitrary program wants to
> access its data in is not, in any way, equivalent to the halting
> problem. No sir!
Remember, the Halting Problem is unsolvable in the general case, but may
be solvable in specific cases.
It's the same thing with compiled code. While the compiler can't
guarantee that *every* code file will be compiled to its optimal machine
code, it can guarantee that certain *sets* of code files will be. The job
of the compiler developer is to identify these potential sets and teach
the compiler to recognize them.
--
Chambers
>> The compiler needs to be smart enough to "know" you are compiling for
>> multi-cores, and warn you about such things (or avoid optimizing in a
>> way that causes them).
>
> More precisely, the compiler needs to "know" what your processes are,
> instead of leaving it as an opaque library outside the semantics of the
> language completely. See the ongoing thread elsewhere. ;-)
And I would suggest that genuinely "high-level" languages (i.e.,
languages where you don't manually frob low-level machine resources) have
a much greater chance of adapting to multicore programming...
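(As a rough sketch of what that can look like, here is the list-based
multiply from earlier in the thread parallelised with the parallel package's
strategies; multiplyPar is just my name for it. Nothing in the code mentions
cores, threads or locks: compile with -threaded, run with +RTS -N, and the
runtime does the scheduling.)
import Control.DeepSeq (NFData)
import Control.Parallel.Strategies (parMap, rdeepseq)
import Data.List (transpose)

type Matrix x = [[x]]

-- Each result row becomes a spark that the GHC runtime can evaluate on
-- whichever core is free.
multiplyPar :: (NFData x, Num x) => Matrix x -> Matrix x -> Matrix x
multiplyPar m1 m2 =
    parMap rdeepseq (\row -> [ sum $ zipWith (*) row col | col <- cols ]) m1
  where
    cols = transpose m2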