POV-Ray: Newsgroups: povray.programming: CUDA - NVIDIA's massively parallel programming architecture: Re: CUDA - NVIDIA's massively parallel programming architecture


From: Chambers
Date: 21 Apr 2007 14:29:24
Message: <462a5804@news.povray.org>

_theCardinal wrote:
> Ben Chambers <ben### [at] pacificwebguycom> wrote:
>> No good, they're only single precision.  Plus, each shader unit would
>> need access to the entire scene file, which would be a pain in the a**
>> to code.
>>
>> ...Chambers
> 
> According to the CUDA programming guide published by NVidia the GPGPU
> architecture is as competent as any 32 bit processor.  More precise
> computations can be simulated using multiple registers for a computation
> instead of a single register if my memory serves - so I seriously doubt
> this is a serious obstacle.

But double precision is actually 64-bit.  Until recently (I don't 
remember exactly which model), NVidia didn't even do full 32-bit FP 
(that is, single precision), but only 24-bit.  POV-Ray, for the last 
dozen years, has done 64-bit FP (double precision), as the extra 
accuracy is necessary for the types of computations it does.
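
To make the gap concrete, here is a small sketch in Python (not POV-Ray 
code; the helper names to_f32 and add_f32 are made up for illustration). 
Python floats are 64-bit doubles, so single precision is emulated by 
rounding through the struct module's 32-bit format.  The scene 
coordinate and offset are arbitrary assumed values.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (a 64-bit double) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def add_f32(a: float, b: float) -> float:
    """Add two values as a single-precision unit would (result rounded to 32 bits)."""
    return to_f32(to_f32(a) + to_f32(b))

# A 32-bit float carries a 24-bit significand, so above 2**24 it can no
# longer represent every integer.
big = 16777216.0                 # 2**24
print(add_f32(big, 1.0))         # 16777216.0 (the +1 is lost)
print(big + 1.0)                 # 16777217.0 (double keeps it)

# Hypothetical scene values: a point 10000 units from the origin, nudged
# by a small offset of 0.0001.  In single precision the nudge vanishes.
print(to_f32(10000.0001) == 10000.0)   # True
```

At 10000 units, adjacent single precision values are about 0.001 apart, 
so any offset much smaller than that is simply rounded away.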

Sure, you can simulate it in the same way you can use two integer units 
to simulate a fixed-point number, but the result is slow.  Perhaps if 
Intel surprises everyone and releases their next graphics chip as a 
double precision FP monster, we'd be able to take advantage of that, 
but the current ATI / NVidia cards aren't up to the task of dealing with 
POV-Ray.
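
For reference, the usual way to simulate extra precision is to carry 
each number as an unevaluated sum of two singles (hi + lo).  The sketch 
below is in Python and assumes Knuth's two-sum as the building block; it 
is not taken from CUDA or from POV-Ray, and the names f32, two_sum and 
ff_add are made up for illustration.  Every intermediate result is 
rounded to 32 bits to mimic a single precision unit.

```python
import struct

def f32(x: float) -> float:
    """Round a Python float (a 64-bit double) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def two_sum(a: float, b: float):
    """Knuth's two-sum: return (s, e) with s = fl(a + b) and s + e == a + b exactly.

    Uses six single-precision additions/subtractions.
    """
    s = f32(a + b)
    bb = f32(s - a)
    e = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, e

def ff_add(x, y):
    """Add two (hi, lo) pairs of singles and return a renormalized (hi, lo) pair."""
    s, e = two_sum(x[0], y[0])        # exact sum of the high parts
    e = f32(e + f32(x[1] + y[1]))     # fold in the low parts
    return two_sum(s, e)              # renormalize so hi holds the leading bits

# 2**24 + 1 does not fit in one single, but it fits in a pair.
hi, lo = ff_add((16777216.0, 0.0), (1.0, 0.0))
print(hi, lo)      # 16777216.0 1.0
print(hi + lo)     # 16777217.0 (summed here in double, for display only)
```

Each pair addition above costs 14 single precision additions or 
subtractions where a native double unit needs one, which is the 
slowdown in question.  The pair also gives only about 48 significand 
bits (2 x 24) against the 53 of a true double, and it keeps the 
exponent range of a single.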

> The major difficulty involved would be preparing the pov-ray source to run
> efficiently on a SIMD architecture - native code may run out of the box
> through the provided compiler, but the results would be poor at best
> without optimization to take advantage of the particular memory hierarchies
> involved.

Once the 3.7 source is out, it should be much easier, as the major task 
is simply fitting it to a parallel paradigm.  With that already done, 
porting to different parallel architectures should be trivial (relative 
to adding the original threading support, that is).

-- 
...Ben Chambers
www.pacificwebguy.com
