POV-Ray: Newsgroups: povray.off-topic: GPU rendering: Re: GPU rendering

From: Chambers
Date: 16 Jan 2010 13:08:31
Message: <4b52009f$1@news.povray.org>

Invisible wrote:
> Chambers wrote:
> 
>> 1) Support for sophisticated branching
> 
> When this happens, the GPU will be exactly the same speed as the CPU.
> The GPU is fast *because* it doesn't support sophisticated branching.

That's too bad, because POV requires sophisticated branching.
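
The trade-off being argued here can be shown with a toy cost model. This is only a sketch under stated assumptions: the function name `lockstep_cost`, the cycle counts, and the group size of 32 are all made up for illustration, and no real hardware was measured. It assumes a group of lanes runs in lockstep, so when the lanes disagree about a branch the group executes both sides in turn.

```python
def lockstep_cost(lane_takes_branch, cost_if=10, cost_else=10):
    """Cycles for one group of lanes that must execute in lockstep.

    Toy model (an assumption, not measured hardware behaviour): if every
    lane agrees, only one side of the branch runs; if the lanes disagree,
    the group runs both sides one after the other with lanes masked off.
    """
    any_if = any(lane_takes_branch)
    any_else = not all(lane_takes_branch)
    return (cost_if if any_if else 0) + (cost_else if any_else else 0)


# Hypothetical group of 32 lanes that all take the same path.
coherent = [True] * 32
# Hypothetical group where every other lane takes the other path.
divergent = [i % 2 == 0 for i in range(32)]

coherent_cost = lockstep_cost(coherent)
divergent_cost = lockstep_cost(divergent)
```

In this model the divergent group costs twice as much as the coherent one, and deeper nesting of disagreeing branches would make it worse.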

>> 2) Full double-precision accuracy
> 
> This already exists apparently. (E.g., my GPU supports double-precision 
> math.)

Yes, but there are still relatively few cards in consumer machines that 
fully support double precision.

>> 3) Large memory sets (other than textures)
> 
> My GPU has access to just under 1GB of RAM. How much do you want?

Last I checked, each shader could access only a very small amount of 
shared memory (less than 1MB, I think, though that may no longer be 
accurate for current top-of-the-line cards), plus whatever was stored 
in textures.

If you could figure out a way to store your data as a texture (i.e., an 
array of data) then you didn't have a problem.  Of course, textures 
aren't designed to hold distinct values (like a plain array does), so 
you have to tell the card to disable optimizations such as blending & 
filtering... you know, all the things that were designed on the 
assumption that textures were actually images.
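
The texture workaround can be sketched in plain Python. This is only an illustration: `pack_as_texture`, `fetch` and `fetch_filtered` are hypothetical helpers, not calls from any graphics API, and the "texture" is just a list of rows. It shows why filtering has to be off: an exact lookup returns the stored value, while a filtered lookup returns a blend of neighbours that was never in the array.

```python
def pack_as_texture(values, width):
    """Lay a flat list out as rows of `width` texels, padding the last row."""
    height = -(-len(values) // width)  # ceiling division
    padded = list(values) + [0.0] * (width * height - len(values))
    return [padded[row * width:(row + 1) * width] for row in range(height)]


def fetch(texture, index):
    """Read element `index` back with an exact (unfiltered) texel lookup."""
    width = len(texture[0])
    return texture[index // width][index % width]


def fetch_filtered(texture, u):
    """Roughly what linear filtering does along the first row: blend two neighbours."""
    row = texture[0]
    left = int(u)
    right = min(left + 1, len(row) - 1)
    t = u - left
    return (1 - t) * row[left] + t * row[right]


data = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0]
tex = pack_as_texture(data, width=4)

exact = [fetch(tex, i) for i in range(len(data))]   # same values as data
blended = fetch_filtered(tex, 0.5)                  # halfway between 3.0 and 1.0
```

Here `exact` reproduces the original list, while `blended` is a value halfway between the first two elements, which is fine for an image and useless for an array of scene data.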

>> 4) Independent shaders running on distinct units.
> 
> What exactly do you mean by that?

POV often needs to find the intersection of a single ray with a single 
object.

GPUs still work by running blocks of shaders that all execute the same 
program on very similar data.  This is how they get their speed: each 
individual shader is relatively slow, but the whole block together is 
fast.

Now, POV could hold onto pending intersection tests until there are 
enough to fill a buffer... but the data wouldn't be distributed the way 
that GPUs want it.

That is, POV would still have a group of independent intersection tests, 
each one with different parameters.
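
The "hold onto pending tests" idea might look something like the sketch below. Everything here is hypothetical: `PendingTests`, its `capacity`, and the ray/sphere test are stand-ins chosen for illustration and are not how POV-Ray is actually structured. The flush is a plain loop where a GPU version would launch one batch, and note that each queued test still carries its own unrelated parameters.

```python
import math


def ray_sphere(origin, direction, center, radius):
    """Nearest positive hit distance along a normalised ray, or None."""
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t > 0:
            return t
    return None


class PendingTests:
    """Hypothetical buffer: queue tests until `capacity` is reached, then run them together."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = []
        self.results = []

    def submit(self, *test):
        self.queue.append(test)
        if len(self.queue) >= self.capacity:
            self.flush()

    def flush(self):
        # A GPU version would hand the whole queue over as one batch here.
        self.results.extend(ray_sphere(*t) for t in self.queue)
        self.queue.clear()


batch = PendingTests(capacity=2)
batch.submit((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0)   # aimed straight at the sphere
after_first = list(batch.results)                    # still empty: buffer not full
batch.submit((0, 0, 0), (0, 0, 1), (3, 0, 5), 1.0)   # sphere is off to the side
```

After the second `submit` the buffer fills and both tests run, giving one hit and one miss.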

GPUs work by saying, "Run this shader, with the first parameter 
interpolated between these two values, and the second parameter 
interpolated between these two other values, and the third parameter 
interpolated between yet another set of values..."

The random data access of POV would render that unworkable.
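
A small sketch of the mismatch, using the model described above. The helper names `interpolated_inputs` and `is_interpolable` are made up for this example, and the "independent" values are arbitrary numbers, not real POV-Ray data. The first function generates per-invocation parameters from two endpoints; the second checks whether a given list of parameters could have been produced that way.

```python
def interpolated_inputs(start, end, count):
    """Per-invocation parameter values spread evenly between two endpoints."""
    if count == 1:
        return [start]
    step = (end - start) / (count - 1)
    return [start + i * step for i in range(count)]


def is_interpolable(values, tol=1e-9):
    """True if `values` could be described by just its two endpoints."""
    expected = interpolated_inputs(values[0], values[-1], len(values))
    return all(abs(a - b) <= tol for a, b in zip(values, expected))


# What the interpolating model hands to five shader invocations.
xs = interpolated_inputs(0.0, 1.0, 5)

# Hypothetical parameters from four unrelated intersection tests.
independent = [0.3, 7.1, -2.0, 4.4]

regular_ok = is_interpolable(xs)
independent_ok = is_interpolable(independent)
```

The evenly spaced inputs pass the check and the unrelated ones do not, so each of those would have to be supplied individually rather than derived from a pair of endpoints.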

Of course, I fully admit to not having read the specs for OpenCL, only 
CUDA, so I can't say how much more useful it is.  However, given that I 
believe these to be hardware limitations rather than software, I'd be 
surprised if OpenCL is really that much more powerful (though I've heard 
it's easier to work with).

...Chambers
