Saul Luizaga wrote:
> You just don't like to be corrected
This sounds more like you.
Believe it or not, this subject is an official Dead Horse. Please stop
beating it, you'll only get a bloody mess.
> I made a search on p.o-t on the subject
First, p.o-t is the wrong forum.
Second, posts here expire.
> was: "are you ABSOLUTELY SURE it won't work...?"
No, you didn't even ask a question. You made a statement: "I have just
learned about it and I think this could be a great improvement for POV-ray"
This is something which, by your own words, you have just heard about.
You then quoted a whole bunch of marketing blurbs about it.
Trust me. We have not just heard about it, but have been looking at the
idea for more than *10 years* now. I haven't looked specifically at
OpenCL, but I have looked at CUDA, and the underlying hardware hasn't
changed enough yet to overcome the inherent limitations in the system.
> To the subject: I don't know much about GPUs, POV-Ray internals,
> programming, nor OpenCL,
Then why is it so hard for you to accept that this isn't a good idea at
this time?
If you really want to help make POV-Ray better, the source code is
available. Please, dig in! Start reading the p.p group, and look at
some of the outstanding bugs! That would surely be appreciated more
than your parroting some random marketing blurbs.
...Chambers
Chambers wrote:
> Saul Luizaga wrote:
>> You just don't like to be corrected
>
> This sounds more like you.
right...
>
> Believe it or not, this subject is an official Dead Horse. Please stop
> beating it, you'll only get a bloody mess.
>
>> I made a search on p.o-t on the subject
>
> First, p.o-t is the wrong forum.
> Second, posts here expire.
>
>> was: "are you ABSOLUTELY SURE it won't work...?"
>
> No, you didn't even ask a question.
news://news.povray.org:119/4a850da7@news.povray.org - check the last
paragraph; it proves you wrong. Did you like being corrected?
>
> This is something which, by your own words, you have just heard about.
> You then quoted a whole bunch of marketing blurbs about it.
>
> Trust me. We have not just heard about it, but have been looking at the
> idea for more than *10 years* now. I haven't looked specifically at
> OpenCL, but I have looked at CUDA, and the underlying hardware hasn't
> changed enough yet to overcome the inherent limitations in the system.
>
>> To the subject: I don't know much about GPUs, POV-Ray internals,
>> programming, nor OpenCL,
>
> Then why is it so hard for you to accept that this isn't a good idea at
> this time?
>
> If you really want to help make POV-Ray better, the source code is
> available. Please, dig in! Start reading the p.p group, and look at
> some of the outstanding bugs! That would surely be appreciated more
> than your parroting some random marketing blurbs.
>
> ...Chambers
So open-minded... thanks, I'm out of here for good.
Saul Luizaga wrote:
>> No, you didn't even ask a question.
>
> news://news.povray.org:119/4a850da7@news.povray.org - check the last
> paragraph; it proves you wrong. Did you like being corrected?
Yes, I do as a matter of fact. I don't like being wrong.
...Chambers
Saul Luizaga wrote:
> I made a search on p.o-t on the subject and found nothing.
This may be because p.o-t is actually not the best place for such a
posting, as it is intended for topics /not/ related to POV-Ray.
> The question
> was: "are you ABSOLUTELY SURE it won't work...?" First you say NO. then
> MAY (!) work. Obvious contradiction.
You're quoting me out of context, and out of chronological order.
You proposed OpenCL for POV-Ray. If you go back to my first reply,
you'll find that my answer already had several parts back then - but it
was not contradictory (nor were my follow-ups):
(I) *NO*, OpenCL will *not* be an option for POV-Ray, because (a)
POV-Ray is written in a different language (namely C++), and (b) its
internal architecture makes heavy use of constructs not available in
OpenCL (most particularly recursion).
Both of these points still stand. If you have any substantial
information that makes either of these arguments moot (other than
wishful thinking about where the OpenCL project /may/ be heading some
time in the future, according to your imagination), then go ahead and
correct me.
There is another point in which I might be mistaken: My argument is
based on assumptions that (a) POV-Ray will not go back to C again, and
(b) the fundamental architecture will not change anytime soon. However,
given that POV-Ray has /just been/ ported from C to C++, and the feat of
adding SMP support has /already/ taken something like 3 years, I don't
expect any strong urge to introduce even more fundamental changes in the
underlying architecture.
This urge /might/ be greater if there was a clear promise of benefit.
However, even /that/ is still doubtful.
(II) Yes, it *MIGHT* be possible to run a full-fledged independent
POV-Ray thread on a GPU; however, this would require some conditions to
be fulfilled first:
(1) POV-Ray would /first/ have to be brought to the point where it can
run distributed renders on multiple machines.
The reason is that in order to run a separate thread on the GPU,
POV-Ray would need to be able to run render threads in fully separate
(or at least non-synchronized) address spaces. This is also /the/ main
prerequisite for distributed rendering on multiple machines (in this
multiprocessing model, communication bandwidth and latency are unlikely
to pose much of a problem even across a network, let alone between CPU
and GPU). Since network rendering is already a goal POV-Ray is
committed to, I do /not/ expect GPU support to be tackled any time
earlier: network rendering comes with a clear promise of a fair gain in
render speed and usability, while the benefit of GPU rendering still
carries a big question mark.
(2) Someone would have to come up with a GPU programming framework much
better suited to POV-Ray, which eliminates all the issues in (I), i.e.
supports C++ and recursion.
(III) With the intention of elaborating on (I), I noted that yes, it
*MIGHT* be possible to include GPU support via OpenCL use in POV-Ray
even without changing the fundamental structure of the application,
*BUT* this would /very/ likely lead to worse performance than without
the GPU support, due to (a) the inability of POV-Ray to supply the GPU
with a large enough and parallelizable work package, and (b) the
inability of POV-Ray to continue while the GPU is processing that work
package.
To sum up, I am still perfectly sure that OpenCL is *not* currently an
option for speeding up POV-Ray.
I am equally sure that OpenCL will be revisited now and then, and
reconsidered as an option, should it improve to the point where it
appears to make some sense.
> To the subject: I don't know much about GPUs, POV-Ray internals,
> programming, nor OpenCL, but I do know a little about Computer
> electronic architecture. Some of the questions I asked are not meant
> for this NG's readers, nor even the TAG-Team, but for the POV-Team, so
> they can make calculations about OpenCL.
There's not much that can be calculated from your figures, except the
peak memory transfer rate of some - presumably - representative graphics
hardware. Program execution speed, however, also depends on the
particular compiler (and optimizer!) - which are not even available yet.
> I think they may have a structured and classified diagram of ALL
> POV-Ray programming structures, knowing approximately how much CPU
> time each structure would take, so from the data I provided they
> could see:
That would help in no way, because these structures all rely on an
architecture that, as already pointed out, is fundamentally incompatible
with OpenCL-based GPU computing.
> 3) And this should be first, see if it is worth it, to consider the GPU
> aid, if not possible through OpenCL, make a custom framework. If it is,
> calculate how many GPU capable structures there are to have a second
> evaluation, and see if that number is worth the effort.
I don't think the POV-Ray dev team should even be bothered with this,
because from all I can see, current GPU architectures would either
require a /monstrous/ amount of framework development to make them work
for /POV-Ray/ (again, most particularly due to the lack of support for
recursion, which apparently is a hardware limitation, not a framework
issue), or provide /very little/ gain or even a disadvantage (due to
the small size of the work packages POV-Ray could supply to the GPU,
and POV-Ray's inability to do anything else while waiting for the
results), or require a /monstrous/ amount of POV-Ray architecture
changes.
I don't think the POV-Ray dev team /want/ to be bothered (once again)
with this anyway.
And even /if/ they wanted to be bothered, /I/ as a POV-Ray user wouldn't
want them to, because I think there are plenty of features which are
more important to invest time into, where a gain is clearly visible,
while the gain of GPU support is still /very/ questionable, for all the
reasons I (and others) have presented by now.
Saul Luizaga wrote:
>>> was: "are you ABSOLUTELY SURE it won't work...?"
>>
>> No, you didn't even ask a question.
>
> news://news.povray.org:119/4a850da7@news.povray.org - check the last
> paragraph; it proves you wrong. Did you like being corrected?
Are you actually aware that Chambers and clipka are two entirely
different people? That there are actually /two/ people trying to
convince you that OpenCL will get POV-Ray nowhere (with others joining
in occasionally)?
Maybe you should re-read all the replies you got, and have a look at who
actually wrote them, and what you replied to whom.
And Chambers is right: There wasn't a question to start with. You only
went for a more questioning style some postings into the discussion.
>> But if you think about common scenes where you have walls and floors
>> and so forth, it might be worth using the GPU to test ray
>> intersections against these, and against bounding volumes (if you're
> using them). Huge numbers of rays need to be run through these tests,
>> so the GPU can fire those off quite quickly.
>
> But those tests are also quite simple, so would benefit the least from
> the GPU.
How do you figure that?
The benefit of the GPU isn't the complexity of the kernels it executes
(usually quite the converse), it's more the fact that it can do the same
operations many, many times over simultaneously.
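To make that point concrete, here is a toy CPU-side sketch (plain Python, not POV-Ray code; every name in it is made up for illustration): the identical, branch-free intersection arithmetic is applied to every ray in a batch, which is exactly the shape of work a GPU can fan out across its cores.

```python
import math

def intersect_sphere_batch(origins, dirs, center, radius):
    """Apply the same intersection arithmetic to every ray in a batch.

    origins, dirs: lists of (x, y, z) tuples; dirs assumed normalized.
    Returns a list of hit distances, math.inf where a ray misses.
    On a GPU, each loop iteration would be one lane of the same kernel;
    note there is no per-ray branching in the arithmetic itself.
    """
    hits = []
    for o, d in zip(origins, dirs):
        oc = tuple(a - b for a, b in zip(o, center))
        b = sum(a * c for a, c in zip(oc, d))
        c = sum(a * a for a in oc) - radius * radius
        disc = b * b - c
        t = -b - math.sqrt(max(disc, 0.0))
        hits.append(t if disc >= 0.0 and t > 0.0 else math.inf)
    return hits

# Four rays fired along -z at a unit sphere at the origin; two miss.
origins = [(0.0, 0.0, 5.0), (0.0, 0.0, 5.0), (3.0, 0.0, 5.0), (0.0, 3.0, 5.0)]
dirs = [(0.0, 0.0, -1.0)] * 4
t = intersect_sphere_batch(origins, dirs, (0.0, 0.0, 0.0), 1.0)
# [4.0, 4.0, inf, inf]
```

The kernel is deliberately simple; the hypothetical gain comes from how many rays run through it at once, not from its complexity.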
> If isosurfaces could be translated efficiently into shaders,
> then those would show the most benefit (and Julia fractals, of course).
No arguments here.
>> It might also be worth running the "so what the hell is the colour of
>> this surface?" calculation on the GPU - what with texturing, normal
>> maps, multiple light rays of different angles and colours, etc.
>
> Possibly. In fact, using the GPU for GI might be an option.
Yes, there are a couple of ways you might implement this, with varying
degrees of accuracy. For example, you could feed the GPU a polygon mesh
approximating the scene. (There are already known algorithms for doing
GI on a polygon mesh.) Or you could just use the GPU to accelerate the
insane number of ray intersection tests required during GI.
>> Also, let's clear this up: My understanding is that the GPU does not
>> require *all* cores to run an identical kernel. IIRC, the cores are
>> grouped into (fairly large) bundles, each bundle runs a single kernel,
>> but different bundles can run completely different kernels. So you
>> don't need a ray queue with enough rays for the entire GPU, just for a
>> whole bundle of cores.
>
> True, I think the shaders are in blocks of 4, and you have to have
> groups of 32 blocks running the same shader or something like that
> (which would be 128 shaders per program). I don't remember the exact
> numbers, though, and in fact it's probably GPU dependent.
Yeah, I'm pretty sure the batch sizes vary by GPU model. But the main
point is, you don't have to assign *all* cores to the same kernel; just
sufficiently large bunches of them.
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
Orchid XP v8 wrote:
>> But those tests are also quite simple, so would benefit the least from
>> the GPU.
>
> How do you figure that?
Because you need to look at the ratio of computation time to
transmission time. The simpler something is, the lower the ratio, and
the less worthwhile offloading is (after all, it messes with the cache,
the bus, the memory system, etc). Ideally, whatever you offload will be
computationally intensive, so the transmission time is justifiable.
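That ratio argument can be put in back-of-envelope form. Every figure below is invented purely for illustration (none is a measurement of any real GPU or bus); the point is only the shape of the comparison.

```python
# Back-of-envelope version of the compute-time / transfer-time ratio.
ray_bytes = 32                 # assumed size of one ray record
bus_bytes_per_s = 8e9          # assumed host<->GPU transfer rate
gpu_flops_per_s = 1e12         # assumed GPU arithmetic throughput

transfer_s = ray_bytes / bus_bytes_per_s   # time to ship one ray over

cheap_test_flops = 20          # e.g. a plane or bounding-box test
costly_test_flops = 20_000     # e.g. an isosurface root-finding budget

cheap_ratio = (cheap_test_flops / gpu_flops_per_s) / transfer_s
costly_ratio = (costly_test_flops / gpu_flops_per_s) / transfer_s
# cheap_ratio comes out well below 1: transfer dominates, offloading loses.
# costly_ratio comes out above 1: compute dominates, offloading can win.
```

With these made-up numbers the cheap test spends far longer on the bus than in the ALUs, which is the reason simple tests benefit least from offloading.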
> Yeah, I'm pretty sure the batch sizes vary by GPU model. But the main
> point is, you don't have to assign *all* cores to the same kernel; just
> sufficiently large bunches of them.
The key, of course, being "sufficiently large bunches." There's a
strong likelihood that POV wouldn't be able to consistently create
batches that are sufficiently large.
...Chambers
Chambers wrote:
> The key, of course, being "sufficiently large bunches." There's a
> strong likelihood that POV wouldn't be able to consistently create
> batches that are sufficiently large.
This is the key question, yes.
> 1) Recursion. As clipka (Christian?) wrote, it is absolutely essential
> for POV.
>
> 2) Data parallelization versus code parallelization (this is related to
> the first, but is not strictly the same).
>
> The ray tracing algorithm follows drastically different code branches on a
> single set of data, based on recursion (reflections & refractions), as
> well as the other various computations needed (texture calculation, light
> source occlusion, etc) which almost all need access to the entire scene.
Often you need to drastically change your algorithm to get it running well
on a GPU. It's quite trivial to rewrite the raytracing algorithm to work on
masses of rays in parallel with no branching or ray spawning, and then do
the branching/spawning for the rays all in one go. After this you can
efficiently use the GPU and CPU in parallel.
> The solution in both cases is to put rays into "queues", such that all the
> rays in a given queue take the same code path [for a while]. When you need
> to spawn a secondary ray, you add it to a queue rather than recursively
> tracing it. When some rays hit an object and others don't, you add them to
> different queues.
Why add it to a different queue? All rays that still need to be traced
(whether they are reflection, refraction, primary or shadow rays) simply
need to be intersected with the scene geometry. If you like you could keep
a flag in the queue for each ray to say what type it is, but the GPU part
that does the ray-scene intersection probably doesn't care what sort of ray
it is.
In that way the GPU will just process as many rays as it can in batches
until the scene is complete. The CPU would handle preparing the queue at
each step, i.e. removing rays that have terminated and inserting new rays for
shadows and reflections etc.
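A minimal sketch of that CPU-side loop (plain Python; `intersect_batch` is a stand-in for the GPU call, and all names here are hypothetical, not anything from the POV-Ray source):

```python
from collections import deque

def render_loop(primary_rays, intersect_batch, shade, max_batch=4096):
    """CPU-side driver for the queue scheme described above.

    intersect_batch: stand-in for the GPU step; maps a batch of rays to
        one hit record (or None for a miss/termination) per ray.
    shade: maps (ray, hit) to (color contribution, list of secondary rays).
    """
    queue = deque(primary_rays)
    image = {}
    while queue:
        # hand the "GPU" as large a batch as is currently available
        n = min(max_batch, len(queue))
        batch = [queue.popleft() for _ in range(n)]
        for ray, hit in zip(batch, intersect_batch(batch)):
            if hit is None:
                continue                        # terminated ray: drop it
            color, secondary = shade(ray, hit)
            image[ray['pixel']] = image.get(ray['pixel'], 0.0) + color
            queue.extend(secondary)             # reflections, shadows, ...
    return image

# Toy stand-ins: primary rays all hit; their secondaries all miss.
def toy_intersect(batch):
    return ['hit' if r['depth'] == 0 else None for r in batch]

def toy_shade(ray, hit):
    return 0.5, [{'pixel': ray['pixel'], 'depth': ray['depth'] + 1}]

img = render_loop([{'pixel': (0, 0), 'depth': 0},
                   {'pixel': (1, 0), 'depth': 0}],
                  toy_intersect, toy_shade)
# img == {(0, 0): 0.5, (1, 0): 0.5}
```

Note the loop is indifferent to what kind of ray each entry is, matching the point above: the intersection step only needs geometry, while the CPU does all the per-ray bookkeeping between batches.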
Of course to be able to do efficient scene intersection on the GPU it would
probably be best to only allow triangle-based scenes, and figure out some
way to store a Kd-tree of triangles efficiently in a texture. Sounds like
the sort of thing someone has already done a PhD on :-)