Saul Luizaga wrote:
>> 1) Recursion. As clipka (Christian?) wrote, it is absolutely
>> essential for POV.
>
> I suppose this is unsolvable without a C++-ported OpenCL.
???
Recursion is not a C++ feature as such; it is also part of standard C99.
(OpenCL's kernel language, although based on C99, does not support it.)
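A trivial example (made up for this post) - legal in both C99 and C++,
but not expressible as an OpenCL kernel, since the kernel language
forbids recursive calls:

  #include <stdio.h>

  /* Toy stand-in for POV-Ray's recursive trace: each bounce calls
     itself with an attenuated contribution. Valid C99 and valid C++;
     not valid OpenCL C, which forbids recursion. */
  double trace(double strength, int depth)
  {
      if (depth == 0 || strength < 0.01)
          return strength;                   /* recursion bottoms out */
      return strength + 0.5 * trace(0.5 * strength, depth - 1);
  }

  int main(void)
  {
      printf("%f\n", trace(1.0, 5));         /* five reflection bounces */
      return 0;
  }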
>> 2) Data parallelization versus code parallelization (this is related
>> to the first, but is not strictly the same).
>
> they say "an API for coordinating data and task-based parallel
> computation...", this doesn't help? If it could do both maybe would be
> of use for POV-Ray.
Did you actually /read/ the spec - or just the enthusiastic introduction?
Sure, it does support task-based parallel computation - why? Probably
because it also targets classic multi-core CPUs, which are ideally
suited to task-based parallel computing.
GPUs perform very poorly with task-based parallel computations, due to
their hardware architecture. A software abstraction layer won't change
that fundamental limitation.
> I see... maybe if GPGPUs are not used as co-processors but as an
> auxiliary co-processor that is called on demand, if a GPU-compliant
> procedure needs to be processed. Are you absolutely sure there isn't a
> case where a GPU can help? Maybe in the middle of a rendering/parsing?
No.
POV-Ray's internal workflow does not support asynchronous computations
(other than having multiple threads independently render parts of the
image), so only blocking "calls" to the GPU would be of any use, putting
the CPU task into a waiting state in the meantime. Therefore, only
portions of the code that can be computed /significantly/ faster by the
GPU, or that have /significant/ size, would warrant "outsourcing" of
computations; otherwise parameter-passing and task-switching overhead
would bog down performance instead of improving it.
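In code form, the break-even condition looks like this (a toy model;
all names and numbers are invented, not measurements):

  // A blocking offload pays the fixed overhead (parameter passing,
  // task switch, transfer) no matter how small the work package is.
  bool offload_pays_off(double cpu_ns_per_item, double gpu_ns_per_item,
                        double fixed_overhead_ns, long items)
  {
      double cpu_time = cpu_ns_per_item * (double)items;
      double gpu_time = fixed_overhead_ns + gpu_ns_per_item * (double)items;
      return gpu_time < cpu_time;  // only holds once 'items' is large enough
  }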
But the only sections of POV-Ray code that do ask for parallelization
are RGB color and 3D vector computations, or similarly-sized problems;
these can be parallelized quite well on modern CPUs as well using SSE2
(i.e. the GPU will not be much faster), and are heavily intermixed with
conditional branching (i.e. the size of outsourceable work packages is
very small).
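And this is roughly the size of work package on offer - an SSE2 sketch
(the layout and function are made up for illustration; POV-Ray does not
literally contain this code):

  #include <emmintrin.h>   // SSE2 intrinsics

  // out = a * b + c, on four packed floats (e.g. RGB plus one pad
  // element). A handful of instructions -- hardly worth a GPU round trip.
  void rgb_madd(float out[4], const float a[4],
                const float b[4], const float c[4])
  {
      __m128 va = _mm_loadu_ps(a);
      __m128 vb = _mm_loadu_ps(b);
      __m128 vc = _mm_loadu_ps(c);
      _mm_storeu_ps(out, _mm_add_ps(_mm_mul_ps(va, vb), vc));
  }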
Saul Luizaga wrote:
>> Another possibility is to run the main renderer on the CPU, adding
>> rays to queues, and sending any "sufficiently large" queues to the GPU
>> for processing. I don't know if bandwidth limitations between the two
>> would make this viable...
>
> Exactly, that is why I asked: "Are you absolutely sure there isn't a
> case where a GPU can help? Maybe in the middle of a rendering/parsing?".
Note that although the approach /may/ (!) work, it is a /fundamentally/
different approach from what POV-Ray is doing.
Changing POV-Ray to use that approach would imply virtually a complete
rewrite of the render engine.
> As you can see, maybe bandwidth isn't much of an issue, since the
> transfer between the PCIe video card and main memory can be made at
> 5 GT/s. Is this still insufficient for POV-Ray peak performance?
So you're looking at peak data transfer rate limits, and from them you
infer that transfer between CPU and GPU memory space is not an issue?
Did you consider latency issues, or the overhead created by the OpenCL
framework itself? How about the latency for a "function call"?
If your work packages are large enough, then these are no issues. But in
a raytracer, be prepared for rather small work packages.
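The queueing idea itself is easy enough to sketch (all names
hypothetical; DispatchToGPU stands in for whatever blocking OpenCL/CUDA
round trip would do the real work):

  #include <vector>
  #include <cstddef>

  struct Ray { float org[3]; float dir[3]; };

  void DispatchToGPU(const Ray* rays, std::size_t count)
  {
      // Hypothetical: upload 'rays', run the kernel, block, read the
      // results back. Left empty here -- this is only a sketch.
      (void)rays; (void)count;
  }

  class RayQueue {
  public:
      void push(const Ray& r) {
          pending.push_back(r);
          if (pending.size() >= kFlushThreshold)  // only ship large queues
              flush();
      }
      void flush() {
          if (!pending.empty()) {
              DispatchToGPU(pending.data(), pending.size());
              pending.clear();
          }
      }
  private:
      static const std::size_t kFlushThreshold = 4096; // tuning knob, pure guess
      std::vector<Ray> pending;
  };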
Saul Luizaga wrote:
> So, you assume that it is just a huge amount of hype and, even if it
> works for other apps, won't work for POV-Ray. I think there is more to
> analyze than just simple generalizations.
I didn't see much analyzing on your side - only enthusiasm about what
they write in the introduction. It appears to me that you don't know
much about POV-Ray's architecture or GPU architecture, and are actually
not too familiar with the OpenCL framework either.
I'm not generalizing. I did have a look at the paper, and analyzed it
(and POV-Ray's program structure) in enough detail to present hard facts
why POV-Ray and OpenCL won't mix, at least not for the time being.
> Anyway, it is a POV/TAG-Team decision; we can merely speculate.
No, we can say for sure that they will /not/ go for OpenCL anytime soon.
>> I had actually and honestly hoped to discourage you with my initial
>> groaning. I guess the POV-Ray dev team is better informed about GPU
>> computing than you expect.
>
> And you know what I expect the POV/TAG-Team to know... you assume too
> much...
From your postings, it is quite obvious that you assumed OpenCL to be
something brand new, with you being the first in the entire POV-Ray
community to discover it.
FYI: OpenCL was already mentioned in the POV-Ray NGs half a year ago.
> groaning is an emotional response and as such, irrational: I haven't
> been reading this NG for a long time and the first line of text I read
> was your groaning; besides the rudeness, it explains nothing, and
And that's exactly why I issued this loud groan: You /obviously/ didn't
even bother to check whether anyone ever brought up the topic of GPU
usage any time recently, or you would have noticed that it had just been
on the agenda again, in the incarnation of CUDA.
Even after you were informed that the topic had been discussed just
recently, you also obviously haven't bothered to check the thread on
that topic, to get yourself updated on the problems of using GPUs with
POV-Ray, and compare how that applies to OpenCL.
You just came stumbling in, hollering "hey, I got this great perfect
idea for POV-Ray...", as if the people developing POV-Ray would never
have heard of it otherwise. Even upon receiving a not-so-positive
reaction indicating that you /might/ be wrong, you stuck to that
attitude, hollering right away again in p.programming in the same style.
There's rudeness in that, too.
Saul Luizaga wrote:
> Well instead of groaning you can make a small .txt file in your PC:
> "Alrady discussed, conclusions were:
> 1)....
> 2)...
> 3)..."
> or something like that, to avoid frustration and redundancy.
What, and have it clutter my hard disk, and three months later when it's
really needed I don't remember where I put it?
Furthermore, as is just being demonstrated, people don't take the
explanations and leave it be, but start discussions about it. Or they
don't understand the explanations and ask. Or...
> I don't think my ideas are revolutionary, nor new, nor ingenious, I'm
> just suggesting something that MAY or MAY NOT have been discussed
> before.
Next time you're in that situation, you might as well /check/ if it has
been discussed before.
> Also I assume everyone here knows more than me, including the
> POV/TAG-Team, so this is more of a hint than a suggestion. Sometimes
> smart people forget about simple things.
Reread your original post. It doesn't sound like what you describe now.
> I see, I know POV-Ray source code is HUGE and any minor changes
> represent big efforts.
Not really. Major changes represent major efforts. As of now, GPU
support would be a more-than-major effort.
/Some/ changes that look like minor ones do require big efforts due to
how the internal structure happens to be. Others appear major but are
actually a piece of cake to implement.
> Maybe there is a use for it, not as another main processor but as
> a secondary one. I posted about it in another post.
Maybe it's worth reconsidering in a year or so.
clipka wrote:
> If your work packages are large enough, then these are no issues. But in
> a raytracer, be prepared for rather small work packages.
And that's another problem. While a GPU has *hundreds* of cores that
all work in parallel, in raytracing you're unlikely to encounter more
than a few dozen rays which require the same test. It's just not
efficient to solve this problem with GPUs.
Note that, while GPGPU has come a long way in recent years, the only way
to make it more useful for POV is to make it more like a CPU and less
like a GPU. While GPUs in general may go this route in the future (Tim
Sweeney thinks so), really GPUs will focus first and foremost on what
games need... and games don't need raytracing yet.
...Chambers
Saul Luizaga wrote:
> So, you assume that it is just a huge amount of hype and, even if it
> works for other apps, won't work for POV-Ray.
NB: There are some apps which theoretically *should* benefit massively
from GPGPU but which are actually hurt by it.
Case in point: TMPGEnc, using CUDA. Video encoding has long been touted
as one of the best areas for improvement via GPGPU.
On a GeForce 260, it's actually *slower* than a Core 2 Quad 6600 (2.4 GHz).
Imagine a dual-processor six-core Xeon system (12 cores, 24 threads) and
you'd need a few dozen GPUs (running literally *thousands* of threads)
to match the performance.
...Chambers
clipka wrote:
> Note that although the approach /may/ (!) work, it is a /fundamentally/
> different approach from what POV-Ray is doing.
>
> Changing POV-Ray to use that approach would imply virtually a complete
> rewrite of the render engine.
Agreed. It's no small task, by any stretch of the imagination...
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
>> If your work packages are large enough, then these are no issues. But
>> in a raytracer, be prepared for rather small work packages.
>
> And that's another problem. While a GPU has *hundreds* of cores that
> all work in parallel, in raytracing you're unlikely to encounter more
> than a few dozen rays which require the same test. It's just not
> efficient to solve this problem with GPUs.
I'm not sure I agree with these statements.
Tracing a scene involves shooting thousands, or even tens of thousands,
of rays at the top-level geometry. While there may be some objects which
are quite small, most objects will be hit by hundreds and hundreds of rays.
Now, whether it's efficient to run the entire rendering engine on the
GPU is debatable, but it would seem to me that it should be possible to
achieve *some* speedup by running at least part of the tracing process
on the GPU. (The only question is bandwidth.)
> Note that, while GPGPU has come a long way in recent years, the only way
> to make it more useful for POV is to make it more like a CPU and less
> like a GPU. While GPUs in general may go this route in the future (Tim
> Sweeney thinks so), really GPUs will focus first and foremost on what
> games need... and games don't need raytracing yet.
More to the point, if a GPU was more like a CPU, it wouldn't be faster
than a CPU.
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
Orchid XP v8 wrote:
>>> If your work packages are large enough, then these are no issues. But
>>> in a raytracer, be prepared for rather small work packages.
>>
>> And that's another problem. While a GPU has *hundreds* of cores that
>> all work in parallel, in raytracing you're unlikely to encounter more
>> than a few dozen rays which require the same test. It's just not
>> efficient to solve this problem with GPUs.
>
> I'm not sure I agree with these statements.
>
> Tracing a scene involves shooting thousands, or even tens of thousands
> of rays at the top-level geometry. While there may be some objects which
> are quite small, most objects will be hit by hundreds and hundreds of rays.
Let's say a scene rendered for display on a 1080p TV has approximately
300 objects in it, each one covering a similar portion of the display
(for simplicity's sake).
That's a total of 1,920x1,080=2,073,600 pixels, or 6,912 for each
object. Admittedly, this is more than my statement of "a few dozen" :)
However, each level of recursion will split the groups of rays - let's
say into 4 groups (I honestly think they would be grouped into more,
smaller groups, but this makes the math easier). After only 4 levels of
recursion (a value entirely common in scenes) the rays are bundled in
groups of 27 (assuming an even distribution, of course).
With a smaller image, more objects, and less tight grouping, you'd get
such small groups after only one or two levels of recursion.
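Spelling that arithmetic out (same assumptions as above: 1080p, 300
equally-sized objects, each bounce splitting a group four ways):

  #include <cstdio>

  int main()
  {
      long rays = 1920L * 1080L / 300;   // 6912 rays per object at depth 0
      for (int level = 0; level <= 4; ++level) {
          std::printf("depth %d: %ld rays per group\n", level, rays);
          rays /= 4;                      // each recursion level splits 4 ways
      }
      return 0;                           // depth 4 gives 27 -- "a few dozen"
  }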
> More to the point, if a GPU was more like a CPU, it wouldn't be faster
> than a CPU.
True ;)
...Chambers
> After only 4 levels of
> recursion (a value entirely common in scenes) the rays are bundled in
> groups of 27 (assuming an even distribution, of course).
>
> With a smaller image, more objects, and less tight grouping, you'd get
> such small groups after only one or two levels of recursion.
That seems small enough that it might be worth transferring back to the CPU.
But if you think about common scenes where you have walls and floors and
so forth, it might be worth using the GPU to test ray intersections
against these, and against bounding volumes (if you're using them). Huge
numbers of rays need to be run through these tests, so the GPU can fire
those off quite quickly. It might also be worth running the "so what the
hell is the colour of this surface?" calculation on the GPU - what with
texturing, normal maps, multiple light rays of different angles and
colours, etc.
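The kind of test I have in mind is easy to sketch - a batched ray/plane
intersection, written as plain C++ because the shape of the computation
is the point, not any particular GPU API (all names made up):

  struct Ray { float org[3]; float dir[3]; };

  // Plane is dot(n, p) = d. Writes the hit distance into t[i], or -1
  // for a miss. One uniform, branch-light test over a whole batch.
  void intersect_plane_batch(const Ray* rays, float* t, int count,
                             const float n[3], float d)
  {
      for (int i = 0; i < count; ++i) {
          float denom = rays[i].dir[0]*n[0] + rays[i].dir[1]*n[1]
                      + rays[i].dir[2]*n[2];
          float num   = d - (rays[i].org[0]*n[0] + rays[i].org[1]*n[1]
                           + rays[i].org[2]*n[2]);
          t[i] = (denom != 0.0f && num / denom > 0.0f) ? num / denom : -1.0f;
      }
  }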
Also, let's clear this up: My understanding is that the GPU does not
require *all* cores to run an identical kernel. IIRC, the cores are
grouped into (fairly large) bundles, each bundle runs a single kernel,
but different bundles can run completely different kernels. So you don't
need a ray queue with enough rays for the entire GPU, just for a whole
bundle of cores.
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*