Yes, I know, POV-Ray wouldn't work well on the GPU. But every now and
then, someone asks about it and, since I've been doing some side
projects with CUDA, I thought I'd put down here what would actually be
necessary for it to be useful.
First off, there is a huge latency hit when moving data between the CPU
and the GPU. This right here is the largest problem; it's simply faster
to use the CPU for many applications because your data size isn't large
enough. In my own experiments, for a simple multiply-add, you need roughly 200
operations for it to be worthwhile.
Going into more detail, I have a laptop with a dual-core i5 running at
3 GHz, and a mobile GTX 860 video card (640 CUDA cores). With these two
processors, I found a speed parity at 187 data items. Again, this was a
simple multiply-add using floating point values. If you're doing
anything more complex per item, fewer items are needed to break even; if
you're using double precision, more items are needed.
The take-away is that, unless you are performing the same operation on
about 200 items, it will be faster to run it on the CPU than to use the GPU.
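
To make the break-even idea concrete, here is a rough Python sketch of the cost model behind it: the GPU pays a fixed overhead per batch (transfer plus launch) but costs less per item, so parity sits where the two cost lines cross. The function name and every timing number below are placeholders chosen for illustration, not measurements from the hardware above.

```python
import math
from typing import Optional


def break_even_items(fixed_overhead_ns: float,
                     cpu_ns_per_item: float,
                     gpu_ns_per_item: float) -> Optional[int]:
    """Smallest item count at which the GPU is no slower than the CPU.

    Assumed model (hypothetical, linear):
        cpu_time(n) = n * cpu_ns_per_item
        gpu_time(n) = fixed_overhead_ns + n * gpu_ns_per_item
    """
    saving_per_item = cpu_ns_per_item - gpu_ns_per_item
    if saving_per_item <= 0:
        # The GPU never catches up if it is not faster per item.
        return None
    return math.ceil(fixed_overhead_ns / saving_per_item)


# Placeholder numbers only: 10 us of fixed overhead,
# 60 ns per item on the CPU, 5 ns per item on the GPU.
print(break_even_items(10_000, 60, 5))
```

In this model, more work per item widens the per-item saving and pulls the break-even count down, while a slower per-item GPU rate (as with double precision on consumer cards) pushes it up.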
The upside, though, is that the earlier objections (general-purpose
programmability, support for double precision, etc.) have basically all
been overcome. With CUDA, OpenCL, and Vulkan available, GPUs are easy
enough to program that POV's functions could actually be put on
consumer-level GPUs.
What would be necessary, then, for this to be useful for POV-Ray is for
POV to cache function calls in sufficient numbers. The caching mechanism
itself would probably add some overhead, as would switching shader
cores, but with large image sizes it is conceivable that enough calls
could be cached to make it worthwhile. It would be a massive amount of
work in re-writing POV, though.
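
As a rough sketch of what that caching could look like, the Python below queues up arguments for one function and only hands them to the GPU once enough have piled up; smaller leftovers are run on the CPU. Everything here is hypothetical: `CallBatcher`, `cpu_fn`, `gpu_fn`, and the ticket scheme are invented for illustration and are not part of POV-Ray, and `gpu_fn` is just a stand-in for a real kernel launch.

```python
from typing import Callable, Dict, List, Tuple


class CallBatcher:
    """Queues calls to one function and evaluates them in bulk (hypothetical)."""

    def __init__(self,
                 cpu_fn: Callable[[float], float],
                 gpu_fn: Callable[[List[float]], List[float]],
                 threshold: int = 200) -> None:
        self.cpu_fn = cpu_fn        # evaluates one item on the CPU
        self.gpu_fn = gpu_fn        # evaluates a whole list (stand-in for a kernel)
        self.threshold = threshold  # assumed break-even batch size
        self.pending: List[Tuple[int, float]] = []
        self.results: Dict[int, float] = {}
        self.next_ticket = 0

    def submit(self, x: float) -> int:
        """Queue one call; return a ticket for looking up the result later."""
        ticket = self.next_ticket
        self.next_ticket += 1
        self.pending.append((ticket, x))
        if len(self.pending) >= self.threshold:
            self.flush()
        return ticket

    def flush(self) -> None:
        """Evaluate everything queued, picking the backend by batch size."""
        if not self.pending:
            return
        tickets = [t for t, _ in self.pending]
        args = [x for _, x in self.pending]
        if len(args) >= self.threshold:
            values = self.gpu_fn(args)
        else:
            values = [self.cpu_fn(x) for x in args]
        self.results.update(zip(tickets, values))
        self.pending.clear()
```

The catch is visible even in this toy version: the caller gets a ticket back instead of a value, so the code that needs the result has to be restructured to wait for a flush, which is where most of the re-writing effort would go.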
If POV-Ray ever gets a re-write from scratch, it might happen.
Otherwise, the current method seems fine.