On 21/06/2016 11:58 AM, scott wrote:
>>> If it doesn't work on your GPU try reducing the number of SAMPLES (at
>>> the top of BufferA). The end result will be the same, it will just look
>>> a bit worse whilst rotating the camera.
>>
>> You'll be unsurprised to hear that this breaks Opera.
>
> What GPU do you have? Have you tried this in Chrome or IE?
Apparently Opera is "widely known" for having rubbish WebGL support.
>> Looking at some of the stuff people have done, you wonder why this
>> amazing tech isn't in games...
>>
>> ...and then you realise it doesn't scale to non-trivial geometry. I'm
>> still figuring out how the GPU actually works, but it *appears* that it
>> works by executing all possible code paths, and just turning off the
>> cores that don't take that branch.
>
> Yes, that's how I understand it too. Writing this:
>
> if(some_condition)
> DoA();
> else
> DoB();
>
> Runs the same speed as this:
>
> DoA();
> DoB();
>
> For "small" scenes though, this is still orders of magnitude faster than
> it would ever run on a CPU.
The CPU may be superscalar, but having 4-vector float arithmetic in
hardware, in parallel, on a bazillion cores has *got* to be faster. ;-)
>> That's fine for 4 trivial primitives;
>> I'm going to say it doesn't scale to hundreds of billions of objects.
>>
>> Pity. It would be so cool...
>
> I suspect if you started to modify and optimise the hardware to cope
> better with more dynamic branching and recursion etc, you would end up
> back with a CPU :-)
I don't know. I think the main thing about the GPU is that it's SIMD. My
CPU has 4 cores; my GPU has nearer 400. I gather that each individual
core is actually slightly *slower* than a CPU core - it's just that
there's a hell of a lot of them. Also that memory access patterns are
very predictable (until you do complex texture lookups), which enables
the memory scheduling to have massive bandwidth with all the latency
hidden away, so you have no pipeline stalls or cache misses to worry about.
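That predictability can be pictured with a toy coalescing model in Python (illustrative numbers -- a 32-lane warp and 128-byte memory transactions are my assumption, not a statement about any particular GPU):

```python
# Toy model of memory coalescing: each lane in a warp issues one 4-byte
# load; the memory system services loads in 128-byte transactions.
# Sequential addresses collapse into one transaction; scattered
# addresses (e.g. a complex texture lookup) need one each.

TRANSACTION_BYTES = 128

def transactions(addresses):
    """Count distinct 128-byte segments touched by a warp's loads."""
    return len({addr // TRANSACTION_BYTES for addr in addresses})

warp = range(32)
coalesced = [lane * 4 for lane in warp]      # lane i reads word i
strided   = [lane * 1024 for lane in warp]   # lane i reads word 256*i

print(transactions(coalesced))  # 1 transaction for the whole warp
print(transactions(strided))    # 32 transactions
```

With the coalesced pattern the scheduler can stream one wide transaction per warp and overlap it with other warps' arithmetic, which is the latency hiding described above; the strided pattern wastes most of each 128-byte fetch.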
Then again, I don't design GPUs for a living, so...
I suspect there's probably a way to render complex scenes in multiple
passes such that you can do batch rendering. I'm not sure if it'll ever
scale to realtime.
(Doesn't Blender or something have an optional unbiased rendering engine
for the GPU?)