I have just learned about OpenCL and I think it could be a great
improvement for POV-Ray, since it uses CPUs, GPUs and "any other
processors" to run programs. To clarify, just in case: OpenCL lets you
use CPU core(s) + GPU core(s) + any other processor core(s); for most
of us that just means CPU cores + GPU cores.
PS: I think there should be a 'Suggestions' newsgroup for this sort of thing.
Saul Luizaga schrieb:
> I have just learned about OpenCL and I think it could be a great
> improvement for POV-Ray, since it uses CPUs, GPUs and "any other
> processors" to run programs. To clarify, just in case: OpenCL lets you
> use CPU core(s) + GPU core(s) + any other processor core(s); for most
> of us that just means CPU cores + GPU cores.
(*groans*)
Sorry, but you violated the "suggestions to use the GPU to speed up
POV-Ray may (and invariably will) be posted at intervals of 3 months"
rule... we just had this discussion about CUDA.
OpenCL can't change the basic hardware architecture of a GPU; all it
does is provide a layer on top (= added overhead!) to facilitate porting
existing programs to GPUs. Those programs must still be suited to the
GPU architecture for the added processing power to outweigh the extra
inter-xPU communication overhead, let alone give any speed benefit.
As soon as POV-Ray is fully capable of running a distributed render on a
network, it *might* be worth investigating whether the render back-end
could also be fully ported to GPUs, to just add some more "machines" to
the team. Until then, I don't think it makes much sense to think about
it (unless we'd want to develop a completely new render engine using a
totally different approach). And even then, the question is whether the GPU
would be fast enough at the tasks at hand to pay off the effort of
getting the code to compile for the GPU.
OpenCL, however, will definitely *not* be an option, as it is a subset
of C (the 3.7 code is written in C++), and has severe limitations that
POV-Ray's architecture cannot comply with - most notably that recursion
is not allowed. POV-Ray absolutely positively needs recursion to
handle secondary rays, and also uses recursive calls for nested
definitions of textures, pigments, or CSG objects.
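To illustrate (a hypothetical sketch with made-up names, not actual
POV-Ray code): tracing one ray can spawn further rays at each hit, which
maps naturally onto a function calling itself - exactly the pattern
OpenCL C forbids:

```cpp
#include <cassert>

// Hypothetical sketch of recursive ray tracing (illustrative names only).
// Each hit may spawn a secondary (e.g. reflected) ray, so trace() calls
// itself until a maximum recursion depth is reached.
struct Ray { int depth; };

int traceCalls = 0;  // instrumentation: counts how many nested calls occur

double trace(Ray ray, int maxDepth) {
    ++traceCalls;
    if (ray.depth >= maxDepth)
        return 0.0;                        // recursion limit reached
    double local = 0.25;                   // stand-in for local shading
    Ray reflected{ray.depth + 1};          // secondary ray at the hit point
    return local + 0.5 * trace(reflected, maxDepth);  // the forbidden pattern
}
```

With a maximum depth of 5, a single primary ray already triggers six
nested trace() calls - and without recursion (or an explicit stack) there
is no straightforward way to express this in OpenCL C.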
> PS: I think it should be a 'Suggestions' newsgroup for this sort of things.
povray.pov4.discussion.general might be an appropriate spot.
clipka <ano### [at] anonymousorg> wrote:
> (*groans*)
I think the standard answer should just be "Go right ahead! Come back when
you're done."
- Ricky
triple_r schrieb:
>> (*groans*)
>
> I think the standard answer should just be "Go right ahead! Come back when
> you're done."
No, that would constitute being mean to people just because they're not
that deep into the matter... (although it is definitely tempting ;-))
triple_r wrote:
> I think the standard answer should just be "Go right ahead! Come back when
> you're done."
>
> - Ricky
Standard? A standard retarded answer, maybe... I'm proposing this to the
POV-Team, not saying it must be implemented, nor am I implying that I'd
implement it.
People who don't have any pro-active opinions to contribute to the
discussion should abstain from giving them.
clipka wrote:
> Saul Luizaga schrieb:
> (*groans*)
Way to go to start a discussion...
> Sorry, but you violated the "suggestions to use the GPU to speed up
> POV-Ray may (and invariably will) be posted at intervals of 3 months"
> rule... we just had this discussion about CUDA.
>
> OpenCL can't change the basic hardware architecture of a GPU; all it
> does is provide a layer on top (= added overhead!) to facilitate porting
> existing programs to GPUs. Those programs must still be suited to the
> GPU architecture for the added processing power to outweigh the extra
> inter-xPU communication overhead, let alone give any speed benefit.
I think you are wrong: "OpenCL (Open Computing Language) greatly
improves speed and responsiveness for a wide spectrum of applications in
numerous market categories from gaming and entertainment to scientific
and medical software."
From here:
http://www.khronos.org/news/press/releases/the_khronos_group_releases_opencl_1.0_specification/
Have you experienced first-hand that this overhead makes it unviable for
POV-Ray?
> As soon as POV-Ray is fully capable of running a distributed render on a
> network, it *might* be worth investigating whether the render back-end
> could also be fully ported to GPUs, to just add some more "machines" to
> the team. Until then, I don't think it makes much sense to think about
> it (unless we'd want to develop a completely new render engine using a
> totally different approach). And even then, question is whether the GPU
> would be fast enough at the tasks at hand to pay off the effort of
> getting the code to compile for the GPU.
"Tony Tamasi, senior vice president of technical marketing at NVIDIA
[...] powerful way to harness the enormous processing capabilities of our
CUDA-based GPUs on multiple platforms." From the same link.
Some GPGPUs provide 64-bit floating-point computing, which is, I think,
the major concern about raytracing. It seems some CUDA-capable GPUs do,
and for sure some of ATI's too: ATI Stream Technology
(http://www.amd.com/US/PRODUCTS/TECHNOLOGIES/STREAM-TECHNOLOGY/Pages/stream-technology.aspx).
>
> OpenCL, however, will definitely *not* be an option, as it is a subset
> of C (the 3.7 code is written in C++), and has severe limitations that
> POV-Ray's architecture cannot comply with - most notably that recursion
> is not allowed. POV-Ray absolutely positively needs recursion to
> handle secondary rays, and also uses recursive calls for nested
> definitions of textures, pigments, or CSG objects.
Granted, this new C standard (C99) is not fully supported by any C++
implementation; Intel C++ supports it for the most part, but not fully.
But I think a C++ port is probably in the making, since C++ is by far
more popular than C99 IMHO; the spec was released about 8 months ago, so
maybe there is a C++-capable OpenCL spec by now, or more. Many
compute-intensive applications would want this for themselves.
As you may know, I participate in WCG (www.worldcommunitygrid.org), a
worldwide scientific computing grid for protein-based disease research
such as AIDS, cancer, Human Proteome Folding, etc. There is a GPU-based
implementation of the Human Proteome Folding project and it makes great
progress, so I know the technology works well.
OK, maybe it is not as suitable for raytracing as it is for
protein-folding research, and maybe the explanation of why not is in the
CUDA discussion, but maybe it is worth it because it offers 64-bit
floating-point computing, which IIRC is the one and only big obstacle to
GPU-aided raytracing.
Or what am I missing? I don't want details, only the highlights, if you
care to answer.
>
>> PS: I think it should be a 'Suggestions' newsgroup for this sort of
>> things.
>
> povray.pov4.discussion.general might be an appropriate spot.
Thanks, I think I'll try povray.programming.
Saul Luizaga wrote:
> Or what am I missing? I don't want details, only the highlights, if
> you care to answer.
Double precision floating point was only the first objection which held
things up. For several years we've been saying that it's "the reason",
but that's just to simplify things as the other reasons don't matter if
this first one remains unresolved.
Of course, modern GPUs now allow double precision, so we can get to the
other objections now. Specifically:
1) Recursion. As clipka (Christian?) wrote, it is absolutely essential
for POV.
2) Data parallelization versus code parallelization (this is related to
the first, but is not strictly the same).
The ray tracing algorithm follows drastically different code branches on
a single set of data, based on recursion (reflections & refractions), as
well as the other various computations needed (texture calculation,
light source occlusion, etc) which almost all need access to the entire
scene.
Modern CPUs are great for this, as each core is essentially independent,
yet all may access a common memory pool (the RAM).
Modern GPUs are not only poor for this, but physically incapable of
running this way. They are built to run simple procedures on massive
sets of parallel data (such as rasterization).
Have you ever written a shader for a GPU? They're extremely limited,
and you must run *the same shader* on each parallel core. Whereas each
core of your CPU can be running an entirely different procedure.
In other words, GPUs are precisely the wrong solution for POV, while
CPUs are perfect for it.
...Chambers
Saul Luizaga schrieb:
> clipka wrote:
>> (*groans*)
>
> Way to go to start a discussion...
Sure, but you're really not the first one, and haven't been so recently.
> I think you are wrong: "OpenCL (Open Computing Language) greatly
> improves speed and responsiveness for a wide spectrum of applications in
> numerous market categories from gaming and entertainment to scientific
> and medical software."
>
> From here:
>
http://www.khronos.org/news/press/releases/the_khronos_group_releases_opencl_1.0_specification/
That's a nice statement. Where does it originate from?
Ah, a press release from the group that designed OpenCL. What could
their major goal with such a paper be? They're not possibly trying
primarily to get attention for the thing? Right, sure they wouldn't want
to hype it.
Also note that...
- "a wide spectrum of applications" is a very vague statement, and may
exclude some.
- The categories mentioned all have one thing in common: massive number
crunching with little decision-making.
POV-Ray does number crunching too, in a sense, but there's a lot of
decision-making involved.
> Have you experienced first-hand that this overhead makes it unviable
> for POV-Ray?
How could I? Do you have an OpenCL implementation available for me so
that I could test it?
But I have read about some limitations of GPU processing in general, and
with regard to raytracing in particular, and I believe I have enough
understanding of computer architecture to say that data exchange between
CPU and GPU requires a tad more overhead than inter-process
communication between separate threads running on the same CPU.
> "Tony Tamasi, senior vice president of technical marketing at NVIDIA
> [...] powerful way to harness the enormous processing capabilities of
> our CUDA-based GPUs on multiple platforms." From the same link.
Another marketing blurb. Of /course/ the vice president of a big player
in the GPU market is advertising it as the greatest invention since
sliced bread: it will sell more of their chips.
> Some GPGPUs provide 64-bit floating-point computing, which is, I think,
> the major concern about raytracing.
It used to be one of the major ones, and particularly easy to explain,
though it's a limitation that is gradually disappearing. I named some
others in my previous post.
> Granted, this new C standard (C99) is not fully supported by any C++
> implementation; Intel C++ supports it for the most part, but not fully.
> But I think a C++ port is probably in the making, since C++ is by far
> more popular than C99 IMHO; the spec was released about 8 months ago,
> so maybe there is a C++-capable OpenCL spec by now, or more. Many
> compute-intensive applications would want this for themselves.
I doubt that C++ support is to come anytime soon, given that OpenCL is
even more limited than C99: No function pointers for instance. How could
you possibly implement polymorphic objects if you don't even have
function pointers at your disposal?
If a standard imposes limitations which are more rigorous than what
you'll find on most brain-dead embedded microcontrollers, then there's a
hardware reason for it.
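For illustration, here's a minimal sketch (hypothetical names, nothing
to do with POV-Ray's actual classes) of why function pointers matter:
run-time polymorphism - picking a pigment's behaviour when the scene is
parsed - ultimately boils down to a call through a pointer, which OpenCL
C simply cannot express:

```cpp
#include <cassert>

// Minimal sketch: selecting behaviour at run time via a function pointer.
// C++ virtual dispatch compiles down to essentially this mechanism, which
// is unavailable in OpenCL C.
struct Pigment {
    double (*evaluate)(double u);   // behaviour chosen at run time
};

double checker(double u)  { return u < 0.5 ? 0.0 : 1.0; }
double gradient(double u) { return u; }

double shade(const Pigment& p, double u) {
    return p.evaluate(u);           // indirect call through the pointer
}
```

Each scene object carries a different evaluate target, so the renderer
cannot know at compile time which function a given call will reach.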
> OK, maybe it is not as suitable for raytracing as it is for
> protein-folding research, and maybe the explanation of why not is in
> the CUDA discussion, but maybe it is worth it because it offers 64-bit
> floating-point computing, which IIRC is the one and only big obstacle
> to GPU-aided raytracing.
It used to be the Big One that was mentioned first whenever the
discussion popped up again, and possibly the only thing the POV-Ray
developers really cared about, historically: without support for
double-precision floating point, there was no point in having any closer
look at GPUs. Fortunately for scientific simulations (like that protein
folding thing), the precision issue is improving now (probably /because/
the GPU developers want to go for that scientific-sim market share).
However, other limitations still apply, which are no issue for such use
cases, but a problem for POV-Ray.
> Or what am I missing? I don't want details, only the highlights, if
> you care to answer.
No support for recursion is one I named already.
Another one is that GPUs are highly optimized for massively parallel
computations where exactly the same program with exactly the same
control flow is run on a vast number of data sets (which is why they
/can/ be so fast on this type of problem in the first place), but they
can /only/ run programs of this type; so if program flow must be
expected to change from one data set to the next, each data set must be
run on its own, along with (for instance) 31 "empty" data sets: you lose
97% of your processing power. That does not leave much.
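The arithmetic behind that figure can be sketched as a toy model
(assuming a 32-lane warp with lanes split evenly across branches; real
hardware is more nuanced than this):

```cpp
#include <cassert>

// Toy model of SIMD branch divergence: a warp of laneCount lanes executes
// in lock-step, so each distinct branch taken is replayed for the whole
// warp while the lanes on other branches idle.
double utilization(int laneCount, int distinctBranches) {
    int activePerPass = laneCount / distinctBranches;  // useful lanes per replay
    double useful = static_cast<double>(activePerPass) * distinctBranches;
    double total  = static_cast<double>(laneCount) * distinctBranches;
    return useful / total;
}
```

With all 32 lanes agreeing on the branch, utilization(32, 1) is 1.0; with
all 32 lanes diverging, utilization(32, 32) drops to 1/32, i.e. roughly
97% of the processing power is wasted.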
Massive parallelization /could/ be used for the primary rays in a scene.
However, those are not the problem anyway: You only have a few million
of those, and sophisticated bounding and caching typically keep the
workload per ray low. It's usually the secondary rays (testing for
shadows, following reflected and refracted rays, and some such) that eat
most of the time.
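A back-of-the-envelope count makes the point. Assume a hypothetical
worst case (every surface both reflects and refracts, one light source,
so each hit spawns one shadow ray plus two continuation rays):

```cpp
#include <cassert>

// Hypothetical worst-case ray count per pixel: every hit spawns one
// shadow ray plus a reflected and a refracted continuation, up to a
// recursion depth limit.
long raysPerPixel(int depth) {
    if (depth == 0)
        return 1;                    // the ray itself, no continuations
    // this ray + its shadow ray + reflected subtree + refracted subtree
    return 2 + 2 * raysPerPixel(depth - 1);
}
```

At a recursion depth of 5, one primary ray already accounts for 94 rays
in total - the primary ray is a rounding error next to its secondaries.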
> thanks, I think I'll try povray.programming.
I had actually and honestly hoped to discourage you with my initial
groaning. I guess the POV-Ray dev team is better informed about GPU
computing than you expect.
Chambers schrieb:
> As clipka (Christian?) wrote,
Close to it: Christoph.
Chambers wrote:
> Saul Luizaga wrote:
>> Or what am I missing? I don't want details, only the highlights, if
>> you care to answer.
>
> Double precision floating point was only the first objection which held
> things up. For several years we've been saying that it's "the reason",
> but that's just to simplify things as the other reasons don't matter if
> this first one remains unresolved.
>
> Of course, modern GPUs now allow double precision, so we can get to the
> other objections now. Specifically:
>
> 1) Recursion. As clipka (Christian?) wrote, it is absolutely essential
> for POV.
I suppose this is unsolvable without a C++-capable OpenCL.
> 2) Data parallelization versus code parallelization (this is related to
> the first, but is not strictly the same).
They say it provides "an API for coordinating data and task-based
parallel computation..." - doesn't this help? If it can do both, maybe
it would be of use for POV-Ray.
> The ray tracing algorithm follows drastically different code branches on
> a single set of data, based on recursion (reflections & refractions), as
> well as the other various computations needed (texture calculation,
> light source occlusion, etc) which almost all need access to the entire
> scene.
>
> Modern CPUs are great for this, as each core is essentially independent,
> yet all may access a common memory pool (the RAM).
>
> Modern GPUs are not only poor for this, but physically incapable of
> running this way. They are built to run simple procedures on massive
> sets of parallel data (such as rasterization).
>
> Have you ever written a shader for a GPU? They're extremely limited,
> and you must run *the same shader* on each parallel core. Whereas each
> core of your CPU can be running an entirely different procedure.
>
> In other words, GPUs are precisely the wrong solution for POV, while
> CPUs are perfect for it.
>
> ....Chambers
I see... but maybe GPGPUs could be used not as co-processors but as
auxiliary co-processors that are called on demand, whenever a
GPU-compliant procedure needs to be processed. Are you absolutely sure
there isn't a case where a GPU can help? Maybe in the middle of
rendering/parsing?