On 16/11/2012 09:47, Bruno Cabasson wrote:
> Let me put it differently. Since today's GPUs' native format is single
> precision, which is much faster than their emulated double precision (AFAIK),
> and if POV-Ray were to use GPU or hardware-accelerated computing (one day ...,
> see a recent post on the subject), I was curious to know where double precision
> is really necessary.
>
>
GPUs are great at running the *same* code over multiple data.
Which suits all the mesh-only-based renderers well too: take all the
mesh points/normals/whatever data and apply an operation to them.
(But even using a GPU puts constraints on the data organisation: if you
want to find which triangles are in a clipping zone, you do not get
back a short list of triangles. You get the full list of triangles, with
a bit (or more) per triangle recording the result of your processing.)
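A minimal sketch of that pattern (toy data, not any real renderer's API): the GPU-style pass writes one flag per input triangle, and a separate compaction step turns the full flag array back into the short list you actually wanted.

```python
# Hypothetical illustration of the flag-then-compact pattern described
# above. Triangles are reduced to a single representative x coordinate
# to keep the sketch short.

def clip_flags(triangles, in_zone):
    # GPU-friendly pass: the same test runs on every triangle, and the
    # output has exactly one flag per input, hit or not.
    return [1 if in_zone(t) else 0 for t in triangles]

def compact(triangles, flags):
    # Host-side "stream compaction": recover the short list of hits
    # from the full flag array.
    return [t for t, f in zip(triangles, flags) if f]

tris = [(-2.0,), (0.5,), (3.0,), (0.9,)]
flags = clip_flags(tris, lambda t: 0.0 <= t[0] <= 1.0)
hits = compact(tris, flags)   # [(0.5,), (0.9,)]
```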
In terms of generic code (latency and such):
1. Preparing the code to run on the GPU: costly.
2. Running the code on the GPU: cheap.
3. Using the result: irrelevant.
Changing the code of step 1 is expensive. But with today's screen sizes &
expectations, using a mesh with 10x or 30x more points, thanks to step 2,
is not a problem.
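A back-of-envelope sketch of that amortization argument, with made-up unit costs (the numbers are assumptions, not measurements): step 1 is paid once, step 2 is cheap per point, so a 30x bigger mesh barely moves the total.

```python
# Hypothetical cost model: one-off setup (step 1) plus a small
# per-point cost (step 2), in arbitrary "cycles".
SETUP = 1_000_000      # assumed one-off cost of preparing the GPU code
PER_POINT = 2          # assumed cost of processing one mesh point

def total_cost(n_points):
    return SETUP + PER_POINT * n_points

small = total_cost(10_000)     # 1,020,000
big = total_cost(300_000)      # 1,600,000: 30x the points, under 2x the cost
```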
A GPU is like a factory tool that makes screws and/or nails.
Making 20 final objects per second is fine, as long as all the objects
are the same. Changing the production from a nail of length 2cm, diameter
1mm, with a circular flat head of 3mm, to a different nail or a screw is
something that could take a few minutes, hours or a day.
If you have a single floating-point operation to perform, direct
handling by the CPU will be faster than the same on a GPU, because the
CPU simply:
1. takes the floating-point data from memory to registers
2. performs the maths
3. stores the result back in memory/cache
vs
1. get the GPU resources
2. make the GPU load the relevant code for the operation
3. make the GPU access/load the data
4. get signaled that the data have been processed
The implicit next step is:
*. use the result
On the CPU path, the result is already in a register/cache.
On the CPU+GPU path, loading the result back will take a few more cycles too.
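The two paths above can be sketched as a rough cost model (every number here is hypothetical, chosen only to show the shape of the argument): for one floating-point operation, the CPU path is a handful of steps, while the GPU path pays fixed dispatch overhead before any maths happens.

```python
# Hypothetical per-step costs in arbitrary "cycles"; the point is the
# orders of magnitude, not the exact values.
cpu_path = {
    "load operands to registers": 4,
    "perform the maths":          1,
    "store result to cache":      4,
}
gpu_path = {
    "acquire GPU resources":      10_000,
    "load kernel code":            5_000,
    "load data onto the GPU":      2_000,
    "wait for completion signal":  2_000,
    "read the result back":        1_000,  # the implicit "use the result" step
}

cpu_cost = sum(cpu_path.values())   # 9
gpu_cost = sum(gpu_path.values())   # 20,000
```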
(On mesh-only renderers, the target is to avoid using the CPU on the data
at all! Input the mesh data, the lights & so on, compute different
partial images with the GPU (ambient, 1st reflection..., shadows, ...),
then get back either the GPU or the CPU to combine these partial
contributions into a single image.)
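The final combine step might look like this sketch: several per-pixel partial buffers merged into one image. The buffer names and the additive/attenuation combine rule are illustrative assumptions, not any renderer's actual model.

```python
# Hypothetical compositing of partial contributions: ambient and
# reflection are added, shadow is a per-pixel attenuation in [0, 1],
# and the result is clamped to the displayable range.

def combine(ambient, reflection, shadow):
    return [min(1.0, (a + r) * s)
            for a, r, s in zip(ambient, reflection, shadow)]

# Four-pixel toy image:
ambient    = [0.2, 0.2, 0.2, 0.2]
reflection = [0.0, 0.5, 1.0, 0.1]
shadow     = [1.0, 1.0, 0.5, 0.0]
final = combine(ambient, reflection, shadow)   # ~ [0.2, 0.7, 0.6, 0.0]
```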