So, CUDA and its accompanying hardware have made great strides over the past
few years, including double-precision float and recursion support. How about
adding support for CUDA or OpenCL?
OTOH, if and when Intel's Knights Corner comes out, porting to graphics cards
may be a moot point.
Post a reply to this message
On 16.12.2011 21:15, jhu wrote:
> So, CUDA and its accompanying hardware have made great strides over the past
> few years, including double-precision float and recursion support. How about
> adding support for CUDA or OpenCL?
>
> OTOH, if and when Intel's Knights Corner comes out, porting to graphics cards
> may be a moot point.
I guess the timeline for POV-Ray proper will be as follows:
Step 1: Add multiprocessor support. (Done.)
Step 2: Add general multi-node rendering support. (Pending. From what
little I know of Knights Corner, this would also be needed to utilize KC.)
Step 3: Port the rendering engine to CUDA, if still of interest by that
time.
Of course that doesn't bar any third party from porting POV-Ray to CUDA
right now.
clipka <ano### [at] anonymousorg> wrote:
> Step 3: Port the rendering engine to CUDA, if still of interest by that
> time.
>
You know, the way things are heading in the GPU space, that may not be
necessary. AMD just announced their next GPU architecture:
http://www.anandtech.com/show/4455/amds-graphics-core-next-preview-amd-architects-for-compute
From page 6:
"In terms of base features the biggest change will be that GCN will implement
the underlying features necessary to support C++ and other advanced languages.
As a result GCN will be adding support for pointers, virtual functions,
exception support, and even recursion. These underlying features mean that
[...] for the GPU, allowing them to more easily program for the GPU and CPU
within the same application."
Hello.
OpenCL can now calculate in double precision (like POV-Ray does) via an extension:
http://www.bealto.com/gpu-fft2_real-type.html
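For reference, enabling that extension happens in the kernel source itself. A
minimal sketch of an OpenCL C device-code fragment (the `cl_khr_fp64` pragma is
the standard extension name; the kernel itself is a hypothetical example and
needs a host program and an fp64-capable device to actually run):

```c
/* OpenCL C device code (fragment, not standalone) */
#pragma OPENCL EXTENSION cl_khr_fp64 : enable

__kernel void scale(__global double *v, const double f)
{
    size_t i = get_global_id(0);
    v[i] *= f;   /* double-precision multiply on the GPU */
}
```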
Thanks.
I think OpenCL is the way to go.
Only in this case can any AMD user use it as well as any NVIDIA user.
Also, OpenCL does not only work on graphics cards; it also works on CPUs and
many other devices.
An OpenCL renderer would be very portable.
On 23.01.2016 at 15:51, Theogott wrote:
> I think OpenCL is the way to go.
> Only in this case can any AMD user use it as well as any NVIDIA user.
> Also, OpenCL does not only work on graphics cards; it also works on CPUs and
> many other devices.
> An OpenCL renderer would be very portable.
OpenCL would certainly be my choice at present, too.
That said, you know my current stance on GPU support.
It's probably not something that one person can do in their free time.
It would require architectural changes in the rendering algorithm ...
and much time for testing.
Something that doesn't really pay off for a freeware program.
jhu <nomail@nomail> wrote:
> So, CUDA and its accompanying hardware have made great strides over the past
> few years, including double-precision float and recursion support. How about
> adding support for CUDA or OpenCL?
The problem is that CUDA (and I assume OpenCL) is not just a thousand
generic CPUs that you can run any code you want independently of each
other.
CUDA uses the so-called SIMT design (a bit like SIMD, but a bit different).
This means, roughly, that there's one single stream of executable code
that all the CUDA cores are executing in parallel. Not only does this
mean that all the cores have to run the exact same code (i.e. you can't
just run one task in one core and a different task in another), it also
imposes certain limitations and inefficiencies.
One of the biggest inefficiencies is that conditionals may cause severe
speed penalties. That's because all the CUDA cores need to stay in sync
with each other when executing that stream of executable code. If some
cores execute the body of a conditional while others don't, those others
have to wait for the ones that do, until they meet again at a common point.
The longer the conditional body, the worse the penalty. (Essentially it's
as if every core were executing the longest conditional branch, even if
just one core actually does.)
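That serialization can be sketched with a toy cost model (plain Python, not a
real CUDA API; the warp size of 32 and the per-branch instruction costs are
illustrative assumptions):

```python
# Toy model of SIMT branch divergence: when lanes of a warp disagree on a
# conditional, the warp executes BOTH sides one after the other, with the
# non-matching lanes sitting idle. So the cost is the sum of the branch
# costs, not the maximum.

def warp_branch_cost(lane_predicates, then_cost, else_cost):
    """Cost (in instruction slots) for one warp executing an if/else."""
    any_then = any(lane_predicates)          # does any lane take the 'then' side?
    any_else = not all(lane_predicates)      # does any lane take the 'else' side?
    cost = 0
    if any_then:
        cost += then_cost    # lanes with a False predicate idle through this
    if any_else:
        cost += else_cost    # lanes with a True predicate idle through this
    return cost

# All 32 lanes agree -> only one side is executed:
print(warp_branch_cost([True] * 32, then_cost=100, else_cost=10))   # 100
# A single divergent lane -> the whole warp pays for both sides:
print(warp_branch_cost([True] * 31 + [False], 100, 10))             # 110
```

The second call shows Warp's point: one core taking the other branch is enough
to make every core in the group pay the combined cost.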
(Although with regard to that "all cores must run the same code", it might
not be that simple. If I'm not mistaken, graphics cards have, in fact,
several "pipelines" which are able to run independent code in parallel,
using their own portion of the CUDA cores. Something like 8 such
"pipelines", each with 40 CUDA cores, meaning that you can run 8 different
tasks, each task being able to use 40 cores. Or something along those lines.
The main purpose of this is, AFAIK, to be able to render polygons with
different shaders in parallel, but CUDA repurposes it to run any code
you want.)
There are of course other limitations, such as transferring data between
the main RAM and the GPU's RAM, and such.
Programs that use CUDA need to be specifically designed with these
limitations in mind.
--
- Warp