Saul Luizaga wrote:
>> 1) Recursion. As clipka (Christian?) wrote, it is absolutely
>> essential for POV.
>
> I suppose this is unsolvable without a C++-ported OpenCL.
???
Recursion is not a C++ feature as such; it is also part of standard C99.
(OpenCL's kernel language, although based on C99, does not support it.)
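A trivial example (made up for this post) - legal in both C99 and C++,
but not expressible as an OpenCL kernel, since the kernel language
forbids recursive calls:

  #include <stdio.h>

  /* Toy stand-in for POV-Ray's recursive trace: each bounce calls
     itself with an attenuated contribution. Valid C99 and valid C++;
     not valid OpenCL C, which forbids recursion. */
  double trace(double strength, int depth)
  {
      if (depth == 0 || strength < 0.01)
          return strength;                   /* recursion bottoms out */
      return strength + 0.5 * trace(0.5 * strength, depth - 1);
  }

  int main(void)
  {
      printf("%f\n", trace(1.0, 5));         /* five reflection bounces */
      return 0;
  }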
>> 2) Data parallelization versus code parallelization (this is related
>> to the first, but is not strictly the same).
>
> they say "an API for coordinating data and task-based parallel
> computation...", this doesn't help? If it could do both maybe would be
> of use for POV-Ray.
Did you actually /read/ the spec - or just the enthusiastic introduction?
Sure, it does support task-based parallel computation - why? Probably
because it also targets classic multi-core CPUs, which are ideally
suited to task-based parallel computing.
GPUs perform very poorly with task-based parallel computations, due to
their hardware architecture. A software abstraction layer won't change
that fundamental limitation.
> I see... maybe if GPGPUs are not used as co-processors but as an
> auxiliary co-processor that is called on demand, if a GPU-compliant
> procedure needs to be processed. Are you absolutely sure there isn't a
> case where a GPU can help? Maybe in the middle of a rendering/parsing?
No.
POV-Ray's internal workflow does not support asynchronous computations
(other than having multiple threads independently render parts of the
image), so only blocking "calls" to the GPU would be of any use, putting
the CPU task into a waiting state in the meantime. Therefore, only
portions of the code that can be computed /significantly/ faster by the
GPU, or that have /significant/ size, would warrant "outsourcing" of
computations; otherwise parameter-passing and task-switching overhead
would bog down performance instead of improving it.
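In code form, the break-even condition looks like this (a toy model;
all names and numbers are invented, not measurements):

  // A blocking offload pays the fixed overhead (parameter passing,
  // task switch, transfer) no matter how small the work package is.
  bool offload_pays_off(double cpu_ns_per_item, double gpu_ns_per_item,
                        double fixed_overhead_ns, long items)
  {
      double cpu_time = cpu_ns_per_item * (double)items;
      double gpu_time = fixed_overhead_ns + gpu_ns_per_item * (double)items;
      return gpu_time < cpu_time;  // only holds once 'items' is large enough
  }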
But the only sections of POV-Ray code that do ask for parallelization
are RGB color and 3D vector computations, or similarly-sized problems;
these can be parallelized quite well on modern CPUs as well using SSE2
(i.e. the GPU will not be much faster), and are heavily intermixed with
conditional branching (i.e. the size of outsourceable work packages is
very small).
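And this is roughly the size of work package on offer - an SSE2 sketch
(the layout and function are made up for illustration; POV-Ray does not
literally contain this code):

  #include <emmintrin.h>   // SSE2 intrinsics

  // out = a * b + c, on four packed floats (e.g. RGB plus one pad
  // element). A handful of instructions -- hardly worth a GPU round trip.
  void rgb_madd(float out[4], const float a[4],
                const float b[4], const float c[4])
  {
      __m128 va = _mm_loadu_ps(a);
      __m128 vb = _mm_loadu_ps(b);
      __m128 vc = _mm_loadu_ps(c);
      _mm_storeu_ps(out, _mm_add_ps(_mm_mul_ps(va, vb), vc));
  }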
Saul Luizaga wrote:
>> Another possibility is to run the main renderer on the CPU, adding
>> rays to queues, and sending any "sufficiently large" queues to the GPU
>> for processing. I don't know if bandwidth limitations between the two
>> would make this viable...
>
> Exactly, that is why I asked: "Are you absolutely sure there isn't a
> case where a GPU can help? Maybe in the middle of a rendering/parsing?".
Note that although the approach /may/ (!) work, it is a /fundamentally/
different approach from what POV-Ray is doing.
Changing POV-Ray to use that approach would imply virtually a complete
rewrite of the render engine.
> As you can see, maybe bandwidth isn't much of an issue, since the
> transfer between the PCIe video card and main memory can be made at
> 5 GT/s. Is this still insufficient for POV-Ray peak performance?
So you're looking at peak data transfer rate limits, and from them you
infer that transfer between CPU and GPU memory space is not an issue?
Did you consider latency issues, or the overhead created by the OpenCL
framework itself? How about the latency for a "function call"?
If your work packages are large enough, then these are no issues. But in
a raytracer, be prepared for rather small work packages.
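The queueing idea itself is easy enough to sketch (all names
hypothetical; DispatchToGPU stands in for whatever blocking OpenCL/CUDA
round trip would do the real work):

  #include <vector>
  #include <cstddef>

  struct Ray { float org[3]; float dir[3]; };

  void DispatchToGPU(const Ray* rays, std::size_t count)
  {
      // Hypothetical: upload 'rays', run the kernel, block, read the
      // results back. Left empty here -- this is only a sketch.
      (void)rays; (void)count;
  }

  class RayQueue {
  public:
      void push(const Ray& r) {
          pending.push_back(r);
          if (pending.size() >= kFlushThreshold)  // only ship large queues
              flush();
      }
      void flush() {
          if (!pending.empty()) {
              DispatchToGPU(pending.data(), pending.size());
              pending.clear();
          }
      }
  private:
      static const std::size_t kFlushThreshold = 4096; // tuning knob, pure guess
      std::vector<Ray> pending;
  };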
Saul Luizaga wrote:
> So, you assume that it is just a huge amount of hype and, even if it
> works for other apps, won't work for POV-Ray. I think there is more to
> analyze than just simple generalizations.
I didn't see much analyzing on your side - only enthusiasm about what
they write in the introduction. It appears to me that you don't know
much about POV-Ray's architecture or GPU architecture, and are actually
not too familiar with the OpenCL framework either.
I'm not generalizing. I did have a look at the paper, and analyzed it
(and POV-Ray's program structure) in enough detail to present hard facts
why POV-Ray and OpenCL won't mix, at least not for the time being.
> Anyway, it is a POV/TAG-Team decision; we can merely speculate.
No, we can say for sure that they will /not/ go for OpenCL anytime soon.
>> I had actually and honestly hoped to discourage you with my initial
>> groaning. I guess the POV-Ray dev team is better informed about GPU
>> computing than you expect.
>
> And you know what I expect the POV/TAG-Team to know... you assume too
> much...
From your postings, it is quite obvious that you assumed OpenCL to be
something brand new, with you being the first in the entire POV-Ray
community to discover it.
FYI: OpenCL was already mentioned in the POV-Ray NGs half a year ago.
> groaning is an emotional response and as such, irrational: I haven't
> been reading this NG for a long time and the first line of text I read
> was your groaning; besides the rudeness, it explains nothing, and
And that's exactly why I issued this loud groan: You /obviously/ didn't
even bother to check whether anyone ever brought up the topic of GPU
usage any time recently, or you would have noticed that it had just been
on the agenda again, in the incarnation of CUDA.
Even after you were informed that the topic had been discussed just
recently, you also obviously haven't bothered to check the thread on
that topic, to get yourself updated on the problems of using GPUs with
POV-Ray, and compare how that applies to OpenCL.
You just came stumbling in, hollering "hey, I got this great perfect
idea for POV-Ray...", as if the people developing POV-Ray would never
have heard of it otherwise. Even upon receiving a not-so-positive
reaction indicating that you /might/ be wrong, you stuck to that
attitude, hollering right away again in p.programming in the same style.
There's rudeness in that, too.
Saul Luizaga wrote:
> Well instead of groaning you can make a small .txt file in your PC:
> "Alrady discussed, conclusions were:
> 1)....
> 2)...
> 3)..."
> or something like that, to avoid frustration and redundancy.
What, and have it clutter my hard disk, and three months later when it's
really needed I don't remember where I put it?
Furthermore, as is just being demonstrated, people don't take the
explanations and leave it be, but start discussions about it. Or they
don't understand the explanations and ask. Or...
> I don't think my ideas are revolutionary, nor new, nor ingenious, I'm
> just suggesting something that MAY or MAY NOT have been discussed
> before.
Next time you're in that situation, you might as well /check/ if it has
been discussed before.
> Also I assume everyone here knows more than me, including the
> POV/TAG-Team, so this is more of a hint than a suggestion. Sometimes
> smart people forget about simple things.
Reread your original post. It doesn't sound like what you describe now.
> I see, I know POV-Ray source code is HUGE and any minor changes
> represent big efforts.
Not really. Major changes represent major efforts. As of now, GPU
support would be a more-than-major effort.
/Some/ changes that look like minor ones do require big efforts due to
how the internal structure happens to be. Others appear major but are
actually a piece of cake to implement.
> Maybe there is a use for it, not as another main processor but as
> a secondary one. I posted about it in another post.
Maybe it's worth reconsidering in a year or so.
clipka wrote:
> If your work packages are large enough, then these are no issues. But in
> a raytracer, be prepared for rather small work packages.
And that's another problem. While a GPU has *hundreds* of cores that
all work in parallel, in raytracing you're unlikely to encounter more
than a few dozen rays which require the same test. It's just not
efficient to solve this problem with GPUs.
Note that, while GPGPU has come a long way in recent years, the only way
to make it more useful for POV is to make it more like a CPU and less
like a GPU. While GPUs in general may go this route in the future (Tim
Sweeney thinks so), really GPUs will focus first and foremost on what
games need... and games don't need raytracing yet.
...Chambers
Saul Luizaga wrote:
> So, you assume that it is just a huge amount of hype and, even if it
> works for other apps, won't work for POV-Ray.
NB: There are some apps which theoretically *should* benefit massively
from GPGPU but which are actually hurt by it.
Case in point: TMPGEnc, using CUDA. Video encoding has long been touted
as one of the best areas for improvement via GPGPU.
On a GeForce 260, it's actually *slower* than a Core 2 Quad 6600 (2.4 GHz).
Imagine a dual-processor six-core Xeon system (12 cores, 24 threads) and
you'd need a few dozen GPUs (running literally *thousands* of threads)
to match the performance.
...Chambers
clipka wrote:
> Note that although the approach /may/ (!) work, it is a /fundamentally/
> different approach from what POV-Ray is doing.
>
> Changing POV-Ray to use that approach would imply virtually a complete
> rewrite of the render engine.
Agreed. It's no small task, by any stretch of the imagination...
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
>> If your work packages are large enough, then these are no issues. But
>> in a raytracer, be prepared for rather small work packages.
>
> And that's another problem. While a GPU has *hundreds* of cores that
> all work in parallel, in raytracing you're unlikely to encounter more
> than a few dozen rays which require the same test. It's just not
> efficient to solve this problem with GPUs.
I'm not sure I agree with these statements.
Tracing a scene involves shooting thousands, or even tens of thousands,
of rays at the top-level geometry. While there may be some objects which
are quite small, most objects will be hit by hundreds and hundreds of rays.
Now, whether it's efficient to run the entire rendering engine on the
GPU is debatable, but it would seem to me that it should be possible to
achieve *some* speedup by running at least part of the tracing process
on the GPU. (The only question is bandwidth.)
> Note that, while GPGPU has come a long way in recent years, the only way
> to make it more useful for POV is to make it more like a CPU and less
> like a GPU. While GPUs in general may go this route in the future (Tim
> Sweeney thinks so), really GPUs will focus first and foremost on what
> games need... and games don't need raytracing yet.
More to the point, if a GPU was more like a CPU, it wouldn't be faster
than a CPU.
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
Orchid XP v8 wrote:
>>> If your work packages are large enough, then these are no issues. But
>>> in a raytracer, be prepared for rather small work packages.
>>
>> And that's another problem. While a GPU has *hundreds* of cores that
>> all work in parallel, in raytracing you're unlikely to encounter more
>> than a few dozen rays which require the same test. It's just not
>> efficient to solve this problem with GPUs.
>
> I'm not sure I agree with these statements.
>
> Tracing a scene involves shooting thousands, or even tens of thousands
> of rays at the top-level geometry. While there may be some objects which
> are quite small, most objects will be hit by hundreds and hundreds of rays.
Let's say a scene rendered for display on a 1080p TV has approximately
300 objects in it, each one covering a similar portion of the display
(for simplicity's sake).
That's a total of 1,920x1,080=2,073,600 pixels, or 6,912 for each
object. Admittedly, this is more than my statement of "a few dozen" :)
However, each level of recursion will split the groups of rays - let's
say into 4 groups (I honestly think they would be grouped into more,
smaller groups, but this makes the math easier). After only 4 levels of
recursion (a value entirely common in scenes) the rays are bundled in
groups of 27 (assuming an even distribution, of course).
With a smaller image, more objects, and less tight grouping, you'd get
such small groups after only one or two levels of recursion.
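Spelling that arithmetic out (same assumptions as above: 1080p, 300
equally-sized objects, each bounce splitting a group four ways):

  #include <cstdio>

  int main()
  {
      long rays = 1920L * 1080L / 300;   // 6912 rays per object at depth 0
      for (int level = 0; level <= 4; ++level) {
          std::printf("depth %d: %ld rays per group\n", level, rays);
          rays /= 4;                      // each recursion level splits 4 ways
      }
      return 0;                           // depth 4 gives 27 -- "a few dozen"
  }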
> More to the point, if a GPU was more like a CPU, it wouldn't be faster
> than a CPU.
True ;)
...Chambers
> After only 4 levels of
> recursion (a value entirely common in scenes) the rays are bundled in
> groups of 27 (assuming an even distribution, of course).
>
> With a smaller image, more objects, and less tight grouping, you'd get
> such small groups after only one or two levels of recursion.
That seems small enough that it might be worth transferring back to the CPU.
But if you think about common scenes where you have walls and floors and
so forth, it might be worth using the GPU to test ray intersections
against these, and against bounding volumes (if you're using them). Huge
numbers of rays need to be run through these tests, so the GPU can fire
those off quite quickly. It might also be worth running the "so what the
hell is the colour of this surface?" calculation on the GPU - what with
texturing, normal maps, multiple light rays of different angles and
colours, etc.
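The kind of test I have in mind is easy to sketch - a batched ray/plane
intersection, written as plain C++ because the shape of the computation
is the point, not any particular GPU API (all names made up):

  struct Ray { float org[3]; float dir[3]; };

  // Plane is dot(n, p) = d. Writes the hit distance into t[i], or -1
  // for a miss. One uniform, branch-light test over a whole batch.
  void intersect_plane_batch(const Ray* rays, float* t, int count,
                             const float n[3], float d)
  {
      for (int i = 0; i < count; ++i) {
          float denom = rays[i].dir[0]*n[0] + rays[i].dir[1]*n[1]
                      + rays[i].dir[2]*n[2];
          float num   = d - (rays[i].org[0]*n[0] + rays[i].org[1]*n[1]
                           + rays[i].org[2]*n[2]);
          t[i] = (denom != 0.0f && num / denom > 0.0f) ? num / denom : -1.0f;
      }
  }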
Also, let's clear this up: My understanding is that the GPU does not
require *all* cores to run an identical kernel. IIRC, the cores are
grouped into (fairly large) bundles, each bundle runs a single kernel,
but different bundles can run completely different kernels. So you don't
need a ray queue with enough rays for the entire GPU, just for a whole
bundle of cores.
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*