Orchid XP v8 wrote:
> But if you think about common scenes where you have walls and floors and
> so forth, it might be worth using the GPU to test ray intersections
> against these, and against bounding volumes (if you're using them). Huge
> numbers of rays need to be run through these tests, so the GPU can fire
> those off quite quickly.
But those tests are also quite simple, so would benefit the least from
the GPU. If isosurfaces could be translated efficiently into shaders,
then those would show the most benefit (and Julia fractals, of course).
> It might also be worth running the "so what the
> hell is the colour of this surface?" calculation on the GPU - what with
> texturing, normal maps, multiple light rays of different angles and
> colours, etc.
Possibly. In fact, using the GPU for GI might be an option. For
instance, we could run a single extremely high resolution pass with no
lighting or texturing, just intersections, and cache all the
intersection locations. Then, feed these intersections to the GPU, and
have it calculate lighting for those points. Feed the lighting data
back to the CPU for radiosity, and voila! Fast GI :) (Of course,
implementing it would be a b*tch, but that's beside the point).
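Just to make that pipeline concrete, here is a minimal Python sketch of the two-pass idea, with the "GPU" pass simulated by a plain function. Every name and number below is illustrative; none of this is POV-Ray code.

```python
import math

# Pass 1 (CPU): intersect rays against a ground plane y = 0 and cache
# only the hit points -- no lighting, no texturing.
def trace_intersections(rays):
    hits = []
    for origin, direction in rays:
        oy, dy = origin[1], direction[1]
        if dy < 0:                        # ray heading down toward the plane
            t = -oy / dy
            hits.append(tuple(o + t * d for o, d in zip(origin, direction)))
    return hits

# Pass 2 ("GPU", simulated): one identical Lambert kernel applied to every
# cached hit point in a single batch -- exactly the kind of uniform,
# data-parallel work a GPU is built for.
def lighting_batch(hits, light_pos):
    shades = []
    normal = (0.0, 1.0, 0.0)              # plane normal
    for p in hits:
        to_light = tuple(l - c for l, c in zip(light_pos, p))
        norm = math.sqrt(sum(c * c for c in to_light))
        to_light = tuple(c / norm for c in to_light)
        shades.append(max(0.0, sum(n * l for n, l in zip(normal, to_light))))
    return shades

rays = [((0.0, 2.0, 0.0), (x * 0.1, -1.0, 0.0)) for x in range(-5, 6)]
hits = trace_intersections(rays)
shades = lighting_batch(hits, light_pos=(0.0, 5.0, 0.0))
```

The point is only the shape of the data flow: pass 1 caches bare intersection points, pass 2 runs one trivially parallel kernel over the whole batch, and the lit points would then go back to the CPU for the radiosity step.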
> Also, let's clear this up: My understanding is that the GPU does not
> require *all* cores to run an identical kernel. IIRC, the cores are
> grouped into (fairly large) bundles, each bundle runs a single kernel,
> but different bundles can run completely different kernels. So you don't
> need a ray queue with enough rays for the entire GPU, just for a whole
> bundle of cores.
True, I think the shaders are in blocks of 4, and you have to have
groups of 32 blocks running the same shader or something like that
(which would be 128 shaders per program). I don't remember the exact
numbers, though, and in fact it's probably GPU dependent.
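Taking those half-remembered figures at face value just to show the consequence (the real values are GPU-dependent, as said): blocks of 4 shaders dispatched in groups of 32 blocks would mean a ray queue has to hold at least 128 rays before a kernel launch uses the hardware fully.

```python
# Hypothetical dispatch granularity from the figures above; real values
# vary by GPU model and no accuracy is claimed here.
def min_batch_size(shaders_per_block=4, blocks_per_kernel=32):
    return shaders_per_block * blocks_per_kernel

print(min_batch_size())  # 128 shaders per program with these figures
```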
...Chambers
clipka wrote:
> Saul Luizaga wrote:
>>> Another possibility is to run the main renderer on the CPU, adding
>>> rays to queues, and sending any "sufficiently large" queues to the
>>> GPU for processing. I don't know if bandwidth limitations between the
>>> two would make this viable...
>>
>> Exactly, that is why I asked: "Are absolutely sure there isn't a case
>> where a GPU can help? maybe in the middle of a rendering/parsing?".
>
> Note that although the approach /may/ (!) work, it is a /fundamentally/
> different approach from what POV-Ray is doing.
>
> Changing POV-Ray to use that approach would imply virtually a complete
> rewrite of the render engine.
>
> As you can see, maybe bandwidth isn't much of an issue, since the
> transfer between the PCIe video card and main memory can be made
> at 5 GT/s. Is this still insufficient for POV-Ray's peak performance?
>
> So you're looking at peak data transfer rate limits and from them can
> infer that transfer between CPU and GPU memory space is not an issue?
>
> Did you consider latency issues, or the overhead created by the OpenCL
> framework itself? How about the latency for a "function call"?
>
> If your work packages are large enough, then these are no issues. But in
> a raytracer, be prepared for rather small work packages.
You just don't like to be corrected, due to your intellectual vanity and
arrogance; leave those aside for good, they don't do anybody any good.
I searched p.o-t on the subject and found nothing. The question
was: "are you ABSOLUTELY SURE it won't work...?" First you say NO, then
it MAY (!) work. An obvious contradiction. Then you made a number of
assumptions (again) about me, all of them WRONG, which is plainly
rude. Posting on p.p was the right thing to do, since it was a
suggestion; I didn't start a discussion there, but here.
On my first post in this thread: get some perspective, dude, it was for
everyone, from those with little tech knowledge to the most advanced; of
course the most advanced readers will find the explanation/clarification
I made about OpenCL obvious.
To the subject: I don't know much about GPUs, POV-Ray internals,
programming, or OpenCL, but I do know a little about computer
electronic architecture. Some of the questions I asked are not meant for
this NG's readers, not even the TAG-Team, but for the POV-Team, so they
can make calculations about OpenCL.
I think they may have a structured and classified diagram of ALL POV-Ray
programming structures, knowing approximately how much time each
structure takes on the CPU, so from the data I provided they could see:
1) Which structures are GPU-capable.
2) The amount of time the selected GPU-capable structures would take for
a given process.
3) And this should come first: see whether GPU aid is worth considering
at all; if it isn't possible through OpenCL, make a custom framework. If
it is, count how many GPU-capable structures there are for a second
evaluation, and see if that number is worth the effort.
4) Etc.
So I'm analyzing to the best of my abilities, as is everyone here,
including you, but you're taking this too emotionally and resorting to
personal attacks trying to discredit me. I'm just trying to see all the
possibilities for GPU acceleration, have a discussion, and maybe lighten
the analysis a little for the POV-Team. But maybe it's not my place, nor
that of other non-TAG-Team people, so perhaps I should just have
addressed this to the POV-Team for their consideration only and avoided
"bothering" people here so much. But as you can see, there are still
some things to say about it...
Saul Luizaga wrote:
> You just don't like to be corrected
This sounds more like you.
Believe it or not, this subject is an official Dead Horse. Please stop
beating it, you'll only get a bloody mess.
> I made a search on p.o-t on the subject
First, p.o-t is the wrong forum.
Second, posts here expire.
> was: "are you ABSOLUTELY SURE it won't work...?"
No, you didn't even ask a question. You made a statement: "I have just
learned about it and I think this could be a great improvement for POV-ray"
This is something which, by your own words, you have just heard about.
You then quoted a whole bunch of marketing blurbs about it.
Trust me. We have not just heard about it, but have been looking at the
idea for more than *10 years* now. I haven't specifically looked at
OpenCL, but I have CUDA, and the underlying hardware hasn't changed
enough yet to overcome the inherent limitations in the system.
> To the subject: I don't know much about GPUs, POV-Ray internals,
> programming, nor OpenCL,
Then why is it so hard for you to accept that this isn't a good idea at
this time?
If you really want to help make POV-Ray better, the source code is
available. Please, dig in! Start reading the p.p group, and look at
some of the outstanding bugs! That would surely be appreciated more
than your parroting some random marketing blurbs.
...Chambers
Chambers wrote:
> Saul Luizaga wrote:
>> You just don't like to be corrected
>
> This sounds more like you.
right...
>
> Believe it or not, this subject is an official Dead Horse. Please stop
> beating it, you'll only get a bloody mess.
>
>> I made a search on p.o-t on the subject
>
> First, p.o-t is the wrong forum.
> Second, posts here expire.
>
>> was: "are you ABSOLUTELY SURE it won't work...?"
>
> No, you didn't even ask a question.
news://news.povray.org:119/4a850da7@news.povray.org, check the last
paragraph; it proves you wrong. Do you like being corrected?
>
> This is something which, by your own words, you have just heard about.
> You then quoted a whole bunch of marketing blurbs about it.
>
> Trust me. We have not just heard about it, but have been looking at the
> idea for more than *10 years* now. I haven't specifically looked at
> OpenCL, but I have CUDA, and the underlying hardware hasn't changed
> enough yet to overcome the inherent limitations in the system.
>
>> To the subject: I don't know much about GPUs, POV-Ray internals,
>> programming, nor OpenCL,
>
> Then why is it so hard for you to accept that this isn't a good idea at
> this time?
>
> If you really want to help make POV-Ray better, the source code is
> available. Please, dig in! Start reading the p.p group, and look at
> some of the outstanding bugs! That would surely be appreciated more
> than your parroting some random marketing blurbs.
>
> ....Chambers
So open minded... thanks, I'm out of here for good.
Saul Luizaga wrote:
>> No, you didn't even ask a question.
>
> news://news.povray.org:119/4a850da7@news.povray.org, check the last
> paragraph; it proves you wrong. Do you like being corrected?
Yes, I do as a matter of fact. I don't like being wrong.
...Chambers
Saul Luizaga wrote:
> I made a search on p.o-t on the subject and found nothing.
This may be because p.o-t is actually not the best place for such a
posting, as it is intended for topics /not/ related to POV-Ray.
> The question
> was: "are you ABSOLUTELY SURE it won't work...?" First you say NO. then
> MAY (!) work. Obvious contradiction.
You're quoting me out of context, and out of chronological order.
You proposed OpenCL for POV-Ray. If you go back to my first reply,
you'll find that my answer was manifold back then already - but not
contradictory (nor were my follow-ups):
(I) *NO*, OpenCL will *not* be an option for POV-Ray, because (a) it is
written in a different language (namely C++), and (b) the internal
architecture makes heavy use of constructs not available in OpenCL (most
particularly recursions).
Both of these still stand. If you have any substantial information that
makes both of these arguments moot (other than wishful thinking about
where the OpenCL project /may/ be heading some time in the future
according to your imagination), then go ahead and correct me.
There is another point in which I might be mistaken: My argument is
based on assumptions that (a) POV-Ray will not go back to C again, and
(b) the fundamental architecture will not change anytime soon. However,
given that POV-Ray has /just been/ ported from C to C++, and the feat of
adding SMP support has /already/ taken something like 3 years, I don't
expect any strong urge to introduce even more fundamental changes in the
underlying architecture.
This urge /might/ be greater if there was a clear promise of benefit.
However, even /that/ is still doubtful.
(II) Yes, it *MIGHT* be possible to run a full-fledged independent
POV-Ray thread on a GPU; however, this would require some conditions to
be fulfilled first:
(1) POV-Ray would /first/ have to be brought to the point where it can
run distributed renders on multiple machines.
The reason here is that in order to run a separate thread on the GPU,
POV-Ray would need to be enabled to run render threads in fully separate
(or at least non-synchronized) address spaces. As this is also /the/
main prerequisite for distributed rendering on multiple machines (in
this multiprocessing model, communication bandwidth and latency are
likely to pose not so much of a problem even in the network, let alone
between CPU and GPU), and network rendering is already a goal POV-Ray is
committed to, I do /not/ expect GPU support to be tackled any time
earlier: Network rendering comes with a clear promise of a fair gain in
render speed and usability, while the benefit of GPU rendering is still
carrying a big question mark.
(2) Someone would have to come up with a GPU programming framework much
better suited to POV-Ray, which eliminates all the issues in (I), i.e.
supports C++ and recursions.
(III) With the intention of elaborating on (I), I noted that yes, it
*MIGHT* be possible to include GPU support via OpenCL use in POV-Ray
even without changing the fundamental structure of the application,
*BUT* this would /very/ likely lead to worse performance than without
the GPU support, due to (a) the inability of POV-Ray to supply the GPU
with a large enough and parallelizable work package, and (b) the
inability of POV-Ray to continue while the GPU is processing that work
package.
To sum up, I am still perfectly sure that OpenCL is *not* currently an
option to speed up POV-Ray.
Plus, I am perfectly sure that OpenCL will be further revisited now and
then, and reconsidered as an option, should it be improved to the point
where it appears to make some sense.
> To the subject: I don't know much about GPUs, POV-Ray internals,
> programming, nor OpenCL, but I do know a little about Computer
> electronic architecture. Some of the questions I made are not meant for
> this NG readers not even the TAG-Team but to the POV-Team, so they can
> made calculations about OpenCL.
There's not much that can be calculated from your figures, except the
peak memory transfer rate of some - presumably - representative graphics
hardware. Program execution speed, however, also depends on the
particular compiler (and optimizer!) - which are not even available yet.
> I think they may have a structured and classified diagram of ALL POV-Ray
> programming structures, knowing approximately how much each structure
> would take in the CPU, so from the data I provided they could see:
That would help in no way, because these structures all rely on an
architecture that, as already pointed out, is fundamentally incompatible
with OpenCL-based GPU computing.
> 3) And this should be first, see if it is worth it, to consider the GPU
> aid, if not possible through OpenCL, make a custom framework. If it is,
> calculate how many GPU capable structures there are to have a second
> evaluation, and see if that number is worth the effort.
I don't think that the POV-Ray dev team should even be bothered with
this, because from all I see, current GPU architectures would appear to
either require a /monstrous/ amount of framework development to make it
work for /POV-Ray/ (again, most particularly due to the lack of support
for recursion, which apparently is a hardware limitation, not a
framework thing), or provide /very small/ gain or even a disadvantage
(due to the small size of work packages POV-Ray could supply to the GPU,
and the inability of POV-Ray to do anything else while waiting for the
results), or require a /monstrous/ amount of POV-Ray architecture changes.
I don't think the POV-Ray dev team /want/ to be bothered (once again)
with this anyway.
And even /if/ they wanted to be bothered, /I/ as a POV-Ray user wouldn't
want them to, because I think there are plenty of features which are
more important to invest time into, where a gain is clearly visible,
while the gain of GPU support is still /very/ questionable, for all the
reasons I (and others) have presented by now.
Saul Luizaga wrote:
>>> was: "are you ABSOLUTELY SURE it won't work...?"
>>
>> No, you didn't even ask a question.
>
> news://news.povray.org:119/4a850da7@news.povray.org, check the last
> paragraph; it proves you wrong. Do you like being corrected?
Are you actually aware that Chambers and clipka are two totally
different persons? That there are actually /two/ persons trying to
convince you that OpenCL will get POV-Ray nowhere (with others having
joined in occasionally)?
Maybe you should re-read all the replies you got, and have a look at who
actually wrote them, and what you replied to whom.
And Chambers is right: There wasn't a question to start with. You only
went for a more questioning style some postings into the discussion.
>> But if you think about common scenes where you have walls and floors
>> and so forth, it might be worth using the GPU to test ray
>> intersections against these, and against bounding volumes (if you're
>> using them). Huge numbers of rays need to be run through these tests,
>> so the GPU can fire those off quite quickly.
>
> But those tests are also quite simple, so would benefit the least from
> the GPU.
How do you figure that?
The benefit of the GPU isn't the complexity of the kernels it executes
(usually quite the converse), it's more the fact that it can do the same
operations many, many times over simultaneously.
> If isosurfaces could be translated efficiently into shaders,
> then those would show the most benefit (and Julia fractals, of course).
No arguments here.
>> It might also be worth running the "so what the hell is the colour of
>> this surface?" calculation on the GPU - what with texturing, normal
>> maps, multiple light rays of different angles and colours, etc.
>
> Possibly. In fact, using the GPU for GI might be an option.
Yes, there's a couple of ways you might implement this, with varying
degrees of accuracy. For example, you could feed the GPU a polygon mesh
approximating the scene. (There are already known algorithms for doing
GI on a polygon mesh.) Or you could just use the GPU to accelerate the
insane number of ray intersection tests required during GI.
>> Also, let's clear this up: My understanding is that the GPU does not
>> require *all* cores to run an identical kernel. IIRC, the cores are
>> grouped into (fairly large) bundles, each bundle runs a single kernel,
>> but different bundles can run completely different kernels. So you
>> don't need a ray queue with enough rays for the entire GPU, just for a
>> whole bundle of cores.
>
> True, I think the shaders are in blocks of 4, and you have to have
> groups of 32 blocks running the same shader or something like that
> (which would be 128 shaders per program). I don't remember the exact
> numbers, though, and in fact it's probably GPU dependent.
Yeah, I'm pretty sure the batch sizes vary by GPU model. But the main
point is, you don't have to assign *all* cores to the same kernel; just
sufficiently large bunches of them.
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
Orchid XP v8 wrote:
>> But those tests are also quite simple, so would benefit the least from
>> the GPU.
>
> How do you figure that?
Because you need to look at the ratio of computation time to
transmission time. The simpler something is, the lower the ratio, and
the less worthwhile offloading is (after all, it messes with the cache,
the bus, the memory system, etc). Ideally, whatever you offload will be
computationally intensive, so the transmission time is justifiable.
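That ratio argument can be put into a toy break-even model. All the constants below are made-up placeholders (per-ray costs, a PCIe-class transfer rate, a per-dispatch latency), chosen only to show the shape of the trade-off, not to describe any real hardware or POV-Ray itself:

```python
# Toy model: offloading pays off only when the batched GPU compute time
# plus transfer overhead undercuts doing the work on the CPU.
def offload_pays_off(n_rays,
                     cpu_ns_per_ray=200.0,   # assumed CPU cost per test
                     gpu_ns_per_ray=5.0,     # assumed GPU cost per test
                     bytes_per_ray=32,
                     bandwidth_gbps=5.0,     # 1 GB/s == 1 byte/ns
                     latency_ns=10_000.0):   # fixed per-dispatch overhead
    cpu_time = n_rays * cpu_ns_per_ray
    transfer = latency_ns + n_rays * bytes_per_ray / bandwidth_gbps
    gpu_time = transfer + n_rays * gpu_ns_per_ray
    return gpu_time < cpu_time

print(offload_pays_off(10))      # False
print(offload_pays_off(10_000))  # True
```

With a fixed dispatch latency dominating small batches, only large queues of rays ever recoup the transfer cost, which is exactly the "sufficiently large bunches" problem.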
> Yeah, I'm pretty sure the batch sizes vary by GPU model. But the main
> point is, you don't have to assign *all* cores to the same kernel; just
> sufficiently large bunches of them.
The key, of course, being "sufficiently large bunches." There's a
strong likelihood that POV wouldn't be able to consistently create
batches that are sufficiently large.
...Chambers
Chambers wrote:
> The key, of course, being "sufficiently large bunches." There's a
> strong likelihood that POV wouldn't be able to consistently create
> batches that are sufficiently large.
This is the key question, yes.