POV-Ray : Newsgroups : povray.off-topic : Suggestion: OpenCL
  Suggestion: OpenCL (Message 23 to 32 of 72)  
From: clipka
Subject: Re: Suggestion: OpenCL
Date: 14 Aug 2009 20:59:48
Message: <4a860884@news.povray.org>
Saul Luizaga wrote:
> So, you assume that it is just a huge amount of hype, and even if it works 
> for other apps., it won't for POV-Ray. I think there is more to analyze 
> than just simple generalizations.

I didn't see much analyzing on your side - only enthusiasm about what 
they write in the introduction. It appears to me that you don't know 
much about POV-Ray's architecture or GPU architecture, and are actually 
not too familiar with the OpenCL framework either.

I'm not generalizing. I did have a look at the paper, and analyzed it 
(and POV-Ray's program structure) in enough detail to present hard facts 
why POV-Ray and OpenCL won't mix, at least not for the time being.


> Anyway, it is a POV/TAG-Team decision; we can merely speculate.

No, we can say for sure that they will /not/ go for OpenCL anytime soon.


>> I had actually and honestly hoped to discourage you with my initial 
>> groaning. I guess the POV-Ray dev team is better informed about GPU 
>> computing than you expect.
> 
> And you know what I expect the POV/TAG-Team to know... you assume too 
> much...

From your postings, it is quite obvious that you assumed OpenCL to be 
something brand new, with you being the first in the whole POV-Ray 
community to discover it.

FYI: OpenCL has been mentioned in the POV-Ray NGs half a year ago already.


> groaning is an emotional response and as such, irrational: I haven't 
> been reading this NG for a long time and the first line of text I read 
> was your groaning; besides the rudeness, it explains nothing, and

And that's exactly why I issued this loud groan: You /obviously/ didn't 
even bother to check whether anyone ever brought up the topic of GPU 
usage any time recently, or you would have noticed that it had just been 
on the agenda again, in the incarnation of CUDA.

Even after you were informed that the topic had been discussed just 
recently, you also obviously haven't bothered to check the thread on 
that topic, to get yourself updated on the problems of using GPUs with 
POV-Ray, and compare how that applies to OpenCL.

You just came stumbling in, hollering "hey, I got this great perfect 
idea for POV-Ray...", as if the people developing POV-Ray would never 
have heard of it otherwise. Even upon receiving a not-so-positive 
reaction indicating that you /might/ be wrong, you stuck to that 
attitude, hollering right away again in p.programming in the same style. 
There's rudeness in that, too.


Post a reply to this message

From: clipka
Subject: Re: Suggestion: OpenCL
Date: 14 Aug 2009 21:13:35
Message: <4a860bbf$1@news.povray.org>
Saul Luizaga schrieb:
> Well instead of groaning you can make a small .txt file in your PC: 
> "Already discussed, conclusions were:
> 1)....
> 2)...
> 3)..."
> or something like that, to avoid frustration and redundancy.

What, and have it clutter my hard disk, and three months later when it's 
really needed I don't remember where I put it?

Furthermore, as is just being demonstrated, people don't take the 
explanations and leave it be, but start discussions about it. Or they 
don't understand the explanations and ask. Or...


> I don't think my ideas are revolutionary, nor new, nor ingenious; I'm 
> just suggesting something that MAY or MAY NOT have been discussed 
> before.

Next time you're in that situation, you might as well /check/ if it has 
been discussed before.

> Also I assume everyone here knows more than me, including the 
> POV/TAG-Team, so this is more of a hint than a suggestion. Sometimes 
> smart people forget about simple things.

Reread your original post. It doesn't sound like what you describe now.


> I see, I know POV-Ray source code is HUGE and any minor changes 
> represent big efforts.

Not really. Major changes represent major efforts. As of now, GPU 
support would be a more-than-major effort.

/Some/ changes that look like minor ones do require big efforts due to 
how the internal structure happens to be. Others appear like major ones 
but are actually a piece of cake to implement.


> Maybe there is a use for it, not as another main processor but as 
> secondary one. I posted about it in another post.

Maybe it's worth reconsidering in a year or so.



From: Chambers
Subject: Re: Suggestion: OpenCL
Date: 14 Aug 2009 23:57:45
Message: <4a863239$1@news.povray.org>
clipka wrote:
> If your work packages are large enough, then these are no issues. But in 
> a raytracer, be prepared for rather small work packages.

And that's another problem.  While a GPU has *hundreds* of cores that 
all work in parallel, in raytracing you're unlikely to encounter more 
than a few dozen rays which require the same test.  It's just not 
efficient to solve this problem with GPUs.

Note that, while GPGPU has come a long way in recent years, the only way 
to make it more useful for POV is to make it more like a CPU and less 
like a GPU.  While GPUs in general may go this route in the future (Tim 
Sweeney thinks so), really GPUs will focus first and foremost on what 
games need... and games don't need raytracing yet.

...Chambers



From: Chambers
Subject: Re: Suggestion: OpenCL
Date: 15 Aug 2009 00:01:59
Message: <4a863337$1@news.povray.org>
Saul Luizaga wrote:
> So, you assume that is just a huge amount of hype, and even if it works 
> for other apps., won't for POV-Ray.

NB: There are some apps that theoretically *should* benefit massively 
from GPGPU, but are actually hurt by it.

Case in point: TMPGEnc, using CUDA.  Video encoding has long been touted 
as one of the best areas for improvement via GPGPU.

On a GeForce 260, it's actually *slower* than a Core 2 Quad 6600 (2.4 GHz).

Imagine a dual-processor six-core Xeon system (12 cores, 24 threads) and 
you'd need a few dozen GPUs (running literally *thousands* of threads) 
to match the performance.

...Chambers



From: Orchid XP v8
Subject: Re: Suggestion: OpenCL
Date: 15 Aug 2009 02:07:42
Message: <4a8650ae@news.povray.org>
clipka wrote:

> Note that although the approach /may/ (!) work, it is a /fundamentally/ 
> different approach from what POV-Ray is doing.
> 
> Changing POV-Ray to use that approach would imply virtually a complete 
> rewrite of the render engine.

Agreed. It's no small task, by any stretch of the imagination...

-- 
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*



From: Orchid XP v8
Subject: Re: Suggestion: OpenCL
Date: 15 Aug 2009 02:10:58
Message: <4a865172@news.povray.org>
>> If your work packages are large enough, then these are no issues. But 
>> in a raytracer, be prepared for rather small work packages.
> 
> And that's another problem.  While a GPU has *hundreds* of cores that 
> all work in parallel, in raytracing you're unlikely to encounter more 
> than a few dozen rays which require the same test.  It's just not 
> efficient to solve this problem with GPUs.

I'm not sure I agree with these statements.

Tracing a scene involves shooting thousands, or even tens of thousands 
of rays at the top-level geometry. While there may be some objects which 
are quite small, most objects will be hit by hundreds and hundreds of rays.

Now, whether it's efficient to run the entire rendering engine on the 
GPU is debatable, but it seems that it should be possible to 
achieve *some* speedup by running at least part of the tracing process 
on the GPU. (The only question is bandwidth.)
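To make the "queue rays, ship sufficiently large batches to the GPU" idea concrete, here is a small Python sketch of the CPU-side batching logic. Every name, the threshold, and the two callbacks are made up for illustration; the actual GPU dispatch is abstracted behind the `intersect_batch` callback:

```python
from collections import defaultdict

BATCH_SIZE = 256  # hypothetical threshold for a "sufficiently large" queue

def trace_with_queues(rays, object_of, intersect_batch):
    """Group rays by the object they must be tested against, and only
    dispatch a group once it is large enough to be worth the transfer
    (to the GPU, in the proposal). Small leftovers are flushed at the end,
    presumably on the CPU."""
    queues = defaultdict(list)
    results = []
    for ray in rays:
        obj = object_of(ray)            # which object this ray must test
        queues[obj].append(ray)
        if len(queues[obj]) >= BATCH_SIZE:
            # batch is big enough: ship it off as one work package
            results.extend(intersect_batch(obj, queues.pop(obj)))
    for obj, pending in queues.items():  # flush the small remainders
        results.extend(intersect_batch(obj, pending))
    return results
```

Whether this wins anything depends on exactly the bandwidth and latency questions raised above: each `intersect_batch` call stands for one CPU-to-GPU round trip.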

> Note that, while GPGPU has come a long way in recent years, the only way 
> to make it more useful for POV is to make it more like a CPU and less 
> like a GPU.  While GPUs in general may go this route in the future (Tim 
> Sweeney thinks so), really GPUs will focus first and foremost on what 
> games need... and games don't need raytracing yet.

More to the point, if a GPU was more like a CPU, it wouldn't be faster 
than a CPU.

-- 
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*



From: Chambers
Subject: Re: Suggestion: OpenCL
Date: 15 Aug 2009 04:03:15
Message: <4a866bc3$1@news.povray.org>
Orchid XP v8 wrote:
>>> If your work packages are large enough, then these are no issues. But 
>>> in a raytracer, be prepared for rather small work packages.
>>
>> And that's another problem.  While a GPU has *hundreds* of cores that 
>> all work in parallel, in raytracing you're unlikely to encounter more 
>> than a few dozen rays which require the same test.  It's just not 
>> efficient to solve this problem with GPUs.
> 
> I'm not sure I agree with these statements.
> 
> Tracing a scene involves shooting thousands, or even tens of thousands 
> of rays at the top-level geometry. While there may be some objects which 
> are quite small, most objects will be hit by hundreds and hundreds of rays.

Let's say a scene rendered for display on a 1080p TV has approximately 
300 objects in it, each one covering a similar portion of the display 
(for simplicity's sake).

That's a total of 1,920x1,080 = 2,073,600 pixels, or 6,912 for each 
object.  Admittedly, this is more than my statement of "a few dozen" :) 
However, each level of recursion will split the groups of rays - let's 
say into 4 groups (I honestly think they would be grouped into more, 
smaller groups, but this will make the math easier).  After only 4 
levels of recursion (a value entirely common in scenes) the rays are 
bundled in groups of 27 (assuming an even distribution, of course).

With a smaller image, more objects, and less tight grouping, you'd get 
such small groups after only one or two levels of recursion.
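For what it's worth, the arithmetic above can be checked in a few lines of Python, under the same assumptions (1080p frame, 300 objects with equal screen coverage, each recursion level splitting a group into 4):

```python
pixels = 1920 * 1080                  # 1080p frame
objects = 300                         # assumed equal coverage per object
rays_per_object = pixels // objects   # primary rays hitting each object

group = rays_per_object
for _ in range(4):                    # four levels of recursion
    group //= 4                       # each level splits a group into 4

print(rays_per_object, group)  # prints: 6912 27
```

So under these assumptions the groups shrink from 6,912 rays to about 27 after four bounces, which is the heart of the divergence argument.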

> More to the point, if a GPU was more like a CPU, it wouldn't be faster 
> than a CPU.

True ;)

...Chambers



From: Orchid XP v8
Subject: Re: Suggestion: OpenCL
Date: 15 Aug 2009 04:16:09
Message: <4a866ec9$1@news.povray.org>
> After only 4 levels of 
> recursion (a value entirely common in scenes) the rays are bundled in 
> groups of 27 (assuming an even distribution, of course).
> 
> With a smaller image, more objects, and less tight grouping, you'd get 
> such small groups after only one or two levels of recursion.

That seems small enough that it might be worth transferring back to the CPU.

But if you think about common scenes where you have walls and floors and 
so forth, it might be worth using the GPU to test ray intersections 
against these, and against bounding volumes (if you're using them). Huge 
numbers of rays need to be run through these tests, so the GPU can fire 
those off quite quickly. It might also be worth running the "so what the 
hell is the colour of this surface?" calculation on the GPU - what with 
texturing, normal maps, multiple light rays of different angles and 
colours, etc.

Also, let's clear this up: My understanding is that the GPU does not 
require *all* cores to run an identical kernel. IIRC, the cores are 
grouped into (fairly large) bundles, each bundle runs a single kernel, 
but different bundles can run completely different kernels. So you don't 
need a ray queue with enough rays for the entire GPU, just for a whole 
bundle of cores.
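If that understanding is right, the scheduling could look roughly like this toy Python sketch. The bundle size and the per-kernel queues are purely illustrative; real bundle (warp/wavefront) sizes are device dependent:

```python
BUNDLE = 32  # hypothetical number of cores per bundle

def schedule(queues):
    """Toy scheduler: each kernel's ray queue fills as many whole bundles
    as it can; different bundles may run different kernels side by side.
    A queue only needs enough rays to fill one bundle, not the whole GPU."""
    assignments = []
    for kernel, rays in queues.items():
        full, leftover = divmod(len(rays), BUNDLE)
        for i in range(full):
            assignments.append((kernel, rays[i * BUNDLE:(i + 1) * BUNDLE]))
        if leftover:
            # an underfilled bundle wastes cores but still makes progress
            assignments.append((kernel, rays[full * BUNDLE:]))
    return assignments
```

The point of the sketch: the unit you must keep busy is one bundle, so the "groups of ~27 rays" from the earlier arithmetic are only just at the edge of filling one.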

-- 
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*



From: Chambers
Subject: Re: Suggestion: OpenCL
Date: 15 Aug 2009 12:49:01
Message: <4a86e6fd$1@news.povray.org>
Orchid XP v8 wrote:
> But if you think about common scenes where you have walls and floors and 
> so forth, it might be worth using the GPU to test ray intersections 
> against these, and against bounding volumes (if you're using them). Huge 
> numbers of rays need to be run through these tests, so the GPU can fire 
> those off quite quickly.

But those tests are also quite simple, so would benefit the least from 
the GPU.  If isosurfaces could be translated efficiently into shaders, 
then those would show the most benefit (and Julia fractals, of course).

> It might also be worth running the "so what the 
> hell is the colour of this surface?" calculation on the GPU - what with 
> texturing, normal maps, multiple light rays of different angles and 
> colours, etc.

Possibly.  In fact, using the GPU for GI might be an option.  For 
instance, we could run a single extremely high resolution pass with no 
lighting or texturing, just intersections, and cache all the 
intersection locations.  Then, feed these intersections to the GPU, and 
have it calculate lighting for those points.  Feed the lighting data 
back to the CPU for radiosity, and voila!  Fast GI :)  (Of course, 
implementing it would be a b*tch, but that's beside the point).
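As a rough Python sketch of that three-pass split (every function name here is hypothetical, and the "GPU" step is just a stand-in callback):

```python
def fast_gi(rays, intersect, gpu_light, cpu_radiosity):
    """Sketch of the proposed split:
    1) an intersection-only pass caches all hit points (on the CPU),
    2) the cached points are lit in one big batch (on the GPU, in the idea),
    3) the lit points feed the radiosity step back on the CPU."""
    hits = [intersect(ray) for ray in rays]  # pass 1: cache intersections
    lit = gpu_light(hits)                    # pass 2: batch lighting
    return cpu_radiosity(lit)               # pass 3: radiosity on the CPU
```

The appeal is that pass 2 hands the GPU one huge, uniform work package instead of many tiny divergent ones; the cost is storing and transferring the whole intersection cache.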

> Also, let's clear this up: My understanding is that the GPU does not 
> require *all* cores to run an identical kernel. IIRC, the cores are 
> grouped into (fairly large) bundles, each bundle runs a single kernel, 
> but different bundles can run completely different kernels. So you don't 
> need a ray queue with enough rays for the entire GPU, just for a whole 
> bundle of cores.

True, I think the shaders are in blocks of 4, and you have to have 
groups of 32 blocks running the same shader or something like that 
(which would be 128 shaders per program).  I don't remember the exact 
numbers, though, and in fact it's probably GPU dependent.

...Chambers



From: Saul Luizaga
Subject: clipka I'll answer you here...
Date: 15 Aug 2009 13:07:11
Message: <4a86eb3f$1@news.povray.org>
clipka wrote:
> Saul Luizaga wrote:
>>> Another possibility is to run the main renderer on the CPU, adding 
>>> rays to queues, and sending any "sufficiently large" queues to the 
>>> GPU for processing. I don't know if bandwidth limitations between the 
>>> two would make this viable...
>>
>> Exactly, that is why I asked: "Are absolutely sure there isn't a case 
>> where a GPU can help? maybe in the middle of a rendering/parsing?".
> 
> Note that although the approach /may/ (!) work, it is a /fundamentally/ 
> different approach from what POV-Ray is doing.
> 
> Changing POV-Ray to use that approach would imply virtually a complete 
> rewrite of the render engine.
> 
>> As you can see, maybe bandwidth isn't much of an issue, since the 
>> transfer between the PCIe video card and main memory can be made 
>> at 5 GT/s. Is this still insufficient for POV-Ray peak performance?
> 
> So you're looking at peak data transfer rate limits and from them can 
> infer that transfer between CPU and GPU memory space is not an issue?
> 
> Did you consider latency issues, or the overhead created by the OpenCL 
> framework itself? How about the latency for a "function call"?
> 
> If your work packages are large enough, then these are no issues. But in 
> a raytracer, be prepared for rather small work packages.

You just don't like to be corrected, due to your intellectual vanity and 
arrogance; leave those aside for good, they don't do anybody any good.

I made a search on p.o-t on the subject and found nothing. The question 
was: "are you ABSOLUTELY SURE it won't work...?" First you say NO, then 
that it MAY (!) work. An obvious contradiction. Then you made a number of 
assumptions (again) about me, all of them WRONG, and that is rude for 
sure. Posting on p.p was simply the right thing to do; since it was a 
suggestion, I didn't start a discussion there, but here.

On my first post in this thread: get some perspective, dude. It was for 
everyone, from those with little tech knowledge to the most advanced; of 
course, the most advanced readers can skip the explanation/clarification 
I made about OpenCL.

To the subject: I don't know much about GPUs, POV-Ray internals, 
programming, or OpenCL, but I do know a little about computer 
electronic architecture. Some of the questions I asked are not meant for 
this NG's readers, nor even the TAG-Team, but for the POV-Team, so they 
can make calculations about OpenCL.

I think they may have a structured and classified diagram of ALL POV-Ray 
programming structures, knowing approximately how long each structure 
would take on the CPU, so from the data I provided they could see:

1) Which structures are GPU-capable.
2) The amount of time that the selected GPU-capable structures could 
take to do a given process.
3) And this should come first: whether it is worth it to consider GPU 
aid at all; if it's not possible through OpenCL, make a custom framework. 
If it is, calculate how many GPU-capable structures there are, to have a 
second evaluation, and see if that number is worth the effort.
4) Etc.

So I'm analyzing to the best of my abilities, as is everyone here, 
including you, but you're taking this too emotionally and going for 
personal attacks on me, trying to discredit me. I'm just trying to see 
all the possibilities for GPU acceleration, have a discussion, and maybe 
lighten the analysis a little for the POV-Team; but maybe it is not my 
place, nor that of other non-POV/TAG-Team people, so maybe I should just 
have addressed this to the POV-Team for their consideration only, and 
avoided "bothering" people here so much. But as you can see, there are 
some things to say about it still...




Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.