>> Indeed. And on the GPU, you can say "for all these thirty pixels
>> you're processing, multiply each one by the corresponding texture
>> pixel". For example. One instruction, executed on 30 different pairs
>> of data values. SIMD.
>
> If you've ever done array processing with e.g. MatLab you will be
> familiar with how to restructure algorithms to work in this sort of
> environment.
Indeed, this is part of what I hated about Matlab: if it isn't an array,
you can't do anything with it. (That and the absurd syntax...)
> Something like "OUTPUT = A * B + (1-A) * C" is a single instruction
> that can operate on every value of the array, but essentially lets you
> choose output B or C based on the value of A. This is often very useful
> and fast for converting typical one-value-at-a-time algorithms.
Me being me, I would have expected a conditional statement to be faster
than a redundant computation.
I guess back in the days before FPUs, when the RAM was faster than the
CPU, that might even have been true. But today it seems it doesn't
matter how inefficient an algorithm is, just so long as it has good
cache behaviour and doesn't stall the pipeline. *sigh*
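To make the trade-off concrete, here's a rough MATLAB-style sketch of the
two approaches (the names A, B, C and OUT are mine, and I'm assuming A is
a 0/1 mask array):

    % per-element loop with a branch
    OUT = zeros(size(A));
    for i = 1:numel(A)
        if A(i) == 1
            OUT(i) = B(i);
        else
            OUT(i) = C(i);
        end
    end

    % branch-free array form: both products get computed, but the whole
    % array is handled by a couple of vectorised multiplies and an add
    OUT = A .* B + (1 - A) .* C;

The second form does "wasted" arithmetic on every element, yet it's the
one that keeps the pipeline full.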
> Reminds me of a built-in MatLab function to convert an RGB image to HSV.
> The function was actually looping through every pixel and calling the
> convert function (which had several if's in it). I rewrote the
> conversion function to work on whole arrays at a time and it was orders
> of magnitude faster. That's what you need to do for GPU programming too.
Whenever you have a system like MatLab or SQL which is inherently
designed to do parallel processing, letting the intensively tuned
parallel engine do its stuff rather than explicitly looping yourself is
always, always going to be faster. ;-)
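For what it's worth, the value/saturation part of that conversion is a
nice example of the array-at-a-time style; a small sketch, assuming R, G
and B are same-sized matrices of values in [0,1] (the names are
hypothetical, not the built-in function's):

    V  = max(max(R, G), B);        % value channel for every pixel at once
    mn = min(min(R, G), B);
    S  = (V - mn) ./ max(V, eps);  % saturation, guarding the divide where V == 0

No per-pixel loop, no if's; every pixel goes through the same handful of
whole-array operations.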