POV-Ray: Newsgroups: povray.programming: Small update on the state of GPGPU: Small update on the state of GPGPU

From: Benjamin Chambers
Date: 23 Jan 2017 11:24:24
Message: <58862e38$1@news.povray.org>

Yes, I know, POV-Ray wouldn't work well on the GPU. But every now and 
then, someone asks about it and, since I've been doing some side 
projects with CUDA, I thought I'd put down here what would actually be 
necessary for it to be useful.

First off, there is a huge latency hit when moving data between the CPU 
and the GPU. This is the largest problem: for many applications the data 
set is too small to amortize that cost, so it's simply faster to use 
the CPU.

In my own experiments, for a simple multiply-add, you need roughly 200 
data items for the GPU to be worthwhile.
Going into more detail, I have a laptop with a dual-core i5 running at 
3GHz, and a mobile GTX 860 video card (640 CUDA cores). With these two 
processors, I found speed parity at 187 data items. Again, this was a 
simple multiply-add using floating point values. If you're doing 
anything more complex per item, the number needed goes down; if you're 
using double precision, it goes up.

The take-away is that, unless you are performing the same operation on 
about 200 items, it will be faster to run it on the CPU than to use the GPU.
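
To make the break-even idea concrete, here is a minimal sketch of a linear cost model. It is my own simplification, not what was measured above: the CPU pays a per-item cost only, while the GPU pays a fixed per-batch overhead (transfer and launch latency) plus a smaller per-item cost. The names `CostModel` and `break_even_items` are hypothetical, and the time units are arbitrary.

```cpp
#include <cmath>

// Hypothetical linear cost model (an assumption, not a measurement):
//   cpu_time(n) = n * cpu_per_item
//   gpu_time(n) = fixed_overhead + n * gpu_per_item
// fixed_overhead is the latency paid once per batch for moving data
// to the GPU and back.
struct CostModel {
    double fixed_overhead;  // per-batch cost, arbitrary time units
    double cpu_per_item;    // CPU cost per data item
    double gpu_per_item;    // GPU cost per data item
};

// Returns the smallest item count at which the GPU is no slower than
// the CPU under this model, or -1 if the GPU never catches up.
long long break_even_items(const CostModel& m) {
    const double saving = m.cpu_per_item - m.gpu_per_item;
    if (saving <= 0.0) {
        return -1;  // GPU is not faster per item, so the overhead is never repaid
    }
    return static_cast<long long>(std::ceil(m.fixed_overhead / saving));
}
```

Under this model, a heavier per-item operation lowers the break-even count only to the extent that it widens the gap between `cpu_per_item` and `gpu_per_item`; the fixed overhead stays the same. Real hardware will not be this tidy, so treat it as a way to reason about the numbers rather than to predict them.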

The upside, though, is that the other previous objections (general purpose 
programmability, support for double precision, etc.) have basically all 
been overcome. With CUDA, OpenCL, and Vulkan available, GPUs are easy 
enough to program that POV's functions could actually be put on 
consumer-level GPUs.

What would be necessary, then, for this to be useful for POV-Ray is for 
POV to cache function calls in sufficient numbers, so that each trip to 
the GPU carries enough work to pay for itself. The caching mechanism 
itself would probably add some overhead, as would switching shader 
cores, but with large image sizes it is conceivable that enough calls 
could be cached to make it worthwhile. It would be a massive amount of 
work in rewriting POV, though.
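
As a rough illustration of what caching calls could look like, here is a minimal sketch of a queue that collects arguments for one function and evaluates them all at once when enough have piled up. Everything in it is hypothetical: `BatchQueue`, `multiply_add_batch`, and the constants are made up for the example, none of it is POV-Ray code, and the "kernel" is a plain CPU loop standing in for a real device launch.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch: queue up arguments for a single function and
// evaluate them together once enough are pending.
class BatchQueue {
public:
    // A batch evaluator reads all inputs and fills one output per input.
    using BatchFn =
        std::function<void(const std::vector<float>&, std::vector<float>&)>;

    BatchQueue(std::size_t threshold, BatchFn evaluate)
        : threshold_(threshold), evaluate_(std::move(evaluate)) {}

    // Queue one argument; returns the index of its slot in the next flush.
    std::size_t submit(float x) {
        inputs_.push_back(x);
        return inputs_.size() - 1;
    }

    // True once enough calls are queued to be worth a batch evaluation.
    bool ready() const { return inputs_.size() >= threshold_; }

    std::size_t pending() const { return inputs_.size(); }

    // Evaluate everything queued in one call and return the results.
    std::vector<float> flush() {
        std::vector<float> outputs(inputs_.size());
        if (!inputs_.empty()) {
            evaluate_(inputs_, outputs);
        }
        inputs_.clear();
        return outputs;
    }

private:
    std::size_t threshold_;
    BatchFn evaluate_;
    std::vector<float> inputs_;
};

// CPU stand-in for a device kernel: out[i] = a * in[i] + b.
// The constants a and b are arbitrary example values.
inline void multiply_add_batch(const std::vector<float>& in,
                               std::vector<float>& out) {
    const float a = 2.0f;
    const float b = 1.0f;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = a * in[i] + b;
    }
}
```

A caller would `submit` values as rays need them, check `ready()`, and call `flush()` once the threshold (say, something near the break-even count) is reached. The hard part this sketch skips is that the renderer has to keep doing useful work while results are pending, which is where the rewriting effort would go.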

If POV-Ray ever gets a re-write from scratch, it might happen. 
Otherwise, the current method seems fine.