Yes, I know, POV-Ray wouldn't work well on the GPU. But every now and
then, someone asks about it and, since I've been doing some side
projects with CUDA, I thought I'd put down here what would actually be
necessary for it to be useful.
First off, there is a huge latency hit when moving data between the CPU
and the GPU. This is the single largest problem: for many applications
it is simply faster to stay on the CPU, because the data set isn't large
enough to amortize the transfer cost.
In my own experiments with a simple multiply-add, you need roughly 200
data items per batch before the GPU becomes worthwhile.
Going into more detail: I have a laptop with a dual-core i5 running at
3GHz and a mobile GTX 860M video card (640 CUDA cores). With these two
processors, I found speed parity at 187 data items for a simple
multiply-add on single-precision floats. Doing anything more complex per
item lowers the break-even count; using double precision raises it.
The take-away is that, unless you are performing the same operation on
about 200 items, it will be faster to run it on the CPU than to use the GPU.
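For reference, here is a minimal sketch of the kind of test I ran. It's
an illustrative re-creation, not my exact harness; the kernel name and
sizes are made up:

// Minimal CUDA multiply-add: out[i] = a[i] * b[i] + c[i].
// Illustrative re-creation of the test described above; n near 200 is
// roughly where the GPU pulled even with my CPU, transfer latency included.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void madd(const float *a, const float *b, const float *c,
                     float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = a[i] * b[i] + c[i];
}

int main()
{
    const int n = 200;                     // near the parity point I measured
    const size_t bytes = n * sizeof(float);
    float *a, *b, *c, *out;
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    cudaMallocManaged(&out, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; c[i] = 3.0f; }

    madd<<<(n + 255) / 256, 256>>>(a, b, c, out, n);
    cudaDeviceSynchronize();               // the launch + sync is the latency hit

    printf("out[0] = %f\n", out[0]);       // expect 5.0
    cudaFree(a); cudaFree(b); cudaFree(c); cudaFree(out);
    return 0;
}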
The upside, though, is that the other long-standing objections (general-
purpose programmability, support for double precision, etc.) have
basically all been overcome. With CUDA, OpenCL, and Vulkan available,
GPUs are now easy enough to program that POV's functions could actually
run on consumer-level GPUs.
What would be necessary, then, for this to be useful for POV-Ray is for
POV to cache function calls in sufficiently large batches. The caching
mechanism itself would add some overhead, as would switching shader
cores, but with large image sizes it is conceivable that enough calls
could be cached to make it worthwhile (a rough sketch of the idea
follows). It would mean a massive amount of work rewriting POV, though.
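To show the shape of what I mean by caching calls, purely hypothetically
(none of these names exist anywhere in the POV source):

// Hypothetical "call cache": pile up function evaluations until there
// are enough to justify a GPU launch, then flush the whole batch at once.
#include <cstddef>
#include <vector>

struct EvalRequest { float x, y, z; };         // a point to evaluate at

class GpuCallCache {
    std::vector<EvalRequest> pending;
    static constexpr std::size_t kThreshold = 256;  // past the ~200-item parity point
public:
    void add(const EvalRequest &r) {
        pending.push_back(r);
        if (pending.size() >= kThreshold)
            flush();
    }
    void flush() {
        if (pending.empty()) return;
        // Here: copy 'pending' to the device, launch one kernel over the
        // whole batch, copy the results back, wake the waiting callers.
        pending.clear();
    }
};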
If POV-Ray ever gets a re-write from scratch, it might happen.
Otherwise, the current method seems fine.
On 1/23/2017 9:24 AM, Benjamin Chambers wrote:
> If POV-Ray ever gets a re-write from scratch, it might happen.
> Otherwise, the current method seems fine.
Alternatively, we could try implementing it for very specific cases.
For instance, running hit tests against objects. If you have an area
light, you could run large arrays of hit tests against individual
objects (a 16x16 grid or larger would probably benefit from this) to see
whether they obscure the light source.
Also, as a first pass on the scene, you could run hit tests against
every single object, and generate a map of the intersections.
Both of those would, of course, be run AFTER bounding-box tests, to skip
objects entirely.
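To make the area-light case concrete, here is a sketch of what one batch
might look like against a single sphere. The ray/sphere test is the
textbook one; the struct layout and names are invented for the example:

// One thread per shadow ray: does this one sphere block the ray from a
// surface point to its sample on the area light? A 16x16 light gives a
// 256-ray batch, comfortably past the ~200-item parity point.
#include <cuda_runtime.h>

struct Ray { float3 o, d; float tmax; };   // origin, unit direction, range

__global__ void occludes(const Ray *rays, int n,
                         float3 center, float radius, int *blocked)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    // Standard quadratic ray/sphere test with d normalized:
    // t^2 + 2*b*t + c = 0, where oc = o - center.
    float3 oc = make_float3(rays[i].o.x - center.x,
                            rays[i].o.y - center.y,
                            rays[i].o.z - center.z);
    float b = oc.x * rays[i].d.x + oc.y * rays[i].d.y + oc.z * rays[i].d.z;
    float c = oc.x * oc.x + oc.y * oc.y + oc.z * oc.z - radius * radius;
    float disc = b * b - c;
    if (disc < 0.0f) { blocked[i] = 0; return; }
    float t = -b - sqrtf(disc);            // nearer intersection
    blocked[i] = (t > 0.0f && t < rays[i].tmax) ? 1 : 0;
}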
On 01/23/2017 12:06 PM, Benjamin Chambers wrote:
> On 1/23/2017 9:24 AM, Benjamin Chambers wrote:
>> If POV-Ray ever gets a re-write from scratch, it might happen.
>> Otherwise, the current method seems fine.
>
> Alternatively, we could try implementing it for very specific cases.
>
> For instance, running hit tests against objects... If you have an area
> light, you could run large arrays of hit tests (16x16 or larger would
> probably benefit from this) against individual objects to see if they
> obscure the light source.
>
> Also, as a first pass on the scene, you could run hit tests against
> every single object, and generate a map of the intersections.
>
> Both of those would, of course, be run AFTER bounding-box tests, to skip
> objects entirely.
>
Thanks for posting your experience.
One of the places I've wondered whether we might sneak a look at GPUs
near term is as one or more new internal functions for use with
isosurfaces - specifically, isosurfaces that need to evaluate a great
many sub-functions for each top-level evaluation, something not
practical today.
Last I looked for my Ubuntu platform (14.04), OpenCL looked messy to
install and run. I bailed before attempting anything!
Bill P.
On 26.01.2017 19:51, William F Pokorny wrote:
> One of the places I've wondered if we might sneak a look at GPUs near
> term would be as one or more, new internal functions for use with
> isosurfaces. Specifically isosurfaces needing to evaluate a great many
> functions for each evaluation - something not practical today.
If the "sub-functions", as I would like to call them, are different in
structure (or even if their common structure can't be easily identified
as such), I suspect that's not a scenario where a GPGPU can help much.
On 01/26/2017 03:08 PM, clipka wrote:
> On 26.01.2017 19:51, William F Pokorny wrote:
>
> If the "sub-functions", as I would like to call them, are different in
> structure (or even if their common structure can't be easily identified
> as such), I suspect that's not a scenario where a GPGPU can help much.
>
I agree.
It was the point-list-as-function-origins idea (the voronoi-ish results
I posted a year or so ago) that I had foremost in mind. In other words,
in the simplest case there would be many copies of the same function,
each with a unique origin. Perhaps too narrow an application to be
broadly practical, but as something with which to experiment with GPUs,
maybe OK...
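Roughly the shape of what I have in mind as a kernel - hypothetical
throughout, with a plain distance standing in for whatever the real
internal function would compute per origin:

// For each evaluation point, take the minimum of the same field
// function over many origins - the same function repeated, differing
// only in its origin.
#include <cuda_runtime.h>

__global__ void minField(const float3 *points, int npts,
                         const float3 *origins, int norg, float *out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= npts) return;
    float best = 1e30f;
    for (int j = 0; j < norg; ++j) {       // same function, unique origin
        float dx = points[i].x - origins[j].x;
        float dy = points[i].y - origins[j].y;
        float dz = points[i].z - origins[j].z;
        best = fminf(best, sqrtf(dx*dx + dy*dy + dz*dz));
    }
    out[i] = best;                         // voronoi-ish: nearest origin wins
}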
Bill P.
On 1/26/2017 1:08 PM, clipka wrote:
> On 26.01.2017 19:51, William F Pokorny wrote:
>
>> One of the places I've wondered if we might sneak a look at GPUs near
>> term would be as one or more, new internal functions for use with
>> isosurfaces. Specifically isosurfaces needing to evaluate a great many
>> functions for each evaluation - something not practical today.
>
> If the "sub-functions", as I would like to call them, are different in
> structure (or even if their common structure can't be easily identified
> as such), I suspect that's not a scenario where a GPGPU can help much.
>
It's certainly possible; there are frameworks out there (cuDNN, CNTK,
and TensorFlow all come to mind, because of the projects I've been
working on) that take scripted input and run the final function on the
GPU.
I suspect those frameworks have pre-defined shaders for various common
functions, and merely route data between them based on your script.
However, it would be entirely possible to compile iso functions into
shader code (CUDA C, or the equivalent in OpenCL) and let the driver
load it onto the GPU for you.
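In fact NVIDIA ships a runtime compiler, NVRTC, for exactly this: hand
it CUDA C source as a string, get PTX back, and load that through the
driver API. A bare-bones sketch, with error checking omitted and the
kernel string standing in for a translated iso function:

// Compile a function to PTX at run time with NVRTC, then load it.
// Build (roughly): nvcc thisfile.cpp -lnvrtc -lcuda
#include <nvrtc.h>
#include <cuda.h>
#include <cstdio>
#include <vector>

const char *src =
    "extern \"C\" __global__ void eval(const float *x, float *y, int n) {\n"
    "    int i = blockIdx.x * blockDim.x + threadIdx.x;\n"
    "    if (i < n) y[i] = x[i] * x[i] - 0.5f;  // the 'iso function'\n"
    "}\n";

int main()
{
    nvrtcProgram prog;
    nvrtcCreateProgram(&prog, src, "iso.cu", 0, NULL, NULL);
    nvrtcCompileProgram(prog, 0, NULL);    // check the log in real code

    size_t ptxSize;
    nvrtcGetPTXSize(prog, &ptxSize);
    std::vector<char> ptx(ptxSize);
    nvrtcGetPTX(prog, ptx.data());
    nvrtcDestroyProgram(&prog);

    CUdevice dev; CUcontext ctx; CUmodule mod; CUfunction fn;
    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);
    cuModuleLoadData(&mod, ptx.data());
    cuModuleGetFunction(&fn, mod, "eval"); // ready for cuLaunchKernel
    printf("compiled and loaded %zu bytes of PTX\n", ptxSize);
    return 0;
}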
But it wouldn't be worth it unless you can solve the problem I raised in
my original post: generating a situation where you have a cache of at
least 100 (preferably 200 or more) function calls per launch.