On 16/11/2012 09:47, Bruno Cabasson wrote:
> Let me put it differently. Since today's GPUs' native format is single
> precision, which is much faster than their emulated double precision (AFAIK),
> and if POV-Ray were to use GPU or hardware-accelerated computing (one day ...,
> see a recent post on the subject), I was curious to know where double precision
> is really necessary.
>
>
GPUs are great at running the *same* code over multiple data.
Which suits all the mesh-only-based renderers well too: take all the
mesh points/normals/whatever data and apply an operation to them.
(But even using a GPU puts constraints on the data organisation: if you
want to find which triangles are in a clipping zone, you do not get
back a short list of triangles. You get the full list of triangles, with
a bit (or more) per triangle recording the result of your processing.)
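A minimal sketch of that pattern (toy data, not any real renderer's API): the GPU-style pass writes one flag per input triangle, and a separate compaction step turns the full flag array back into the short list you actually wanted.

```python
# Hypothetical illustration of the flag-then-compact pattern described
# above. Triangles are reduced to a single representative x coordinate
# to keep the sketch short.

def clip_flags(triangles, in_zone):
    # GPU-friendly pass: the same test runs on every triangle, and the
    # output has exactly one flag per input, hit or not.
    return [1 if in_zone(t) else 0 for t in triangles]

def compact(triangles, flags):
    # Host-side "stream compaction": recover the short list of hits
    # from the full flag array.
    return [t for t, f in zip(triangles, flags) if f]

tris = [(-2.0,), (0.5,), (3.0,), (0.9,)]
flags = clip_flags(tris, lambda t: 0.0 <= t[0] <= 1.0)
hits = compact(tris, flags)   # [(0.5,), (0.9,)]
```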
In terms of generic code (latency and such):
1. Preparing the code to run on the GPU: costly.
2. Running the code on the GPU: cheap.
3. Using the result: irrelevant.
Changing the code of step 1 is expensive. But with today's screen sizes &
expectations, using a mesh with 10x or 30x more points, thanks to step 2,
is not a problem.
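A back-of-envelope sketch of that amortization argument, with made-up unit costs (the numbers are assumptions, not measurements): step 1 is paid once, step 2 is cheap per point, so a 30x bigger mesh barely moves the total.

```python
# Hypothetical cost model: one-off setup (step 1) plus a small
# per-point cost (step 2), in arbitrary "cycles".
SETUP = 1_000_000      # assumed one-off cost of preparing the GPU code
PER_POINT = 2          # assumed cost of processing one mesh point

def total_cost(n_points):
    return SETUP + PER_POINT * n_points

small = total_cost(10_000)     # 1,020,000
big = total_cost(300_000)      # 1,600,000: 30x the points, under 2x the cost
```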
A GPU is like a factory tool that makes screws and/or nails.
Making 20 final objects per second is fine, as long as all the objects
are the same. Changing the production from a nail of length 2cm, diameter
1mm, with a circular flat head of 3mm, to a different nail or a screw is
something that could take a few minutes, hours or a day.
If you have a single floating-point operation to perform, direct
handling by the CPU will be faster than the same on a GPU, because the
CPU simply:
1. takes the floating-point data from memory to registers
2. performs the maths
3. stores the result back in memory/cache
vs
1. get the GPU resources
2. make the GPU load the relevant code for the operation
3. make the GPU access/load the data
4. get signaled that the data have been processed
The implicit next step is:
*. use the result
On the CPU path, the result is already in a register/cache.
On the CPU+GPU path, loading the result back will take a few more cycles too.
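The two paths above can be sketched as a rough cost model (every number here is hypothetical, chosen only to show the shape of the argument): for one floating-point operation, the CPU path is a handful of steps, while the GPU path pays fixed dispatch overhead before any maths happens.

```python
# Hypothetical per-step costs in arbitrary "cycles"; the point is the
# orders of magnitude, not the exact values.
cpu_path = {
    "load operands to registers": 4,
    "perform the maths":          1,
    "store result to cache":      4,
}
gpu_path = {
    "acquire GPU resources":      10_000,
    "load kernel code":            5_000,
    "load data onto the GPU":      2_000,
    "wait for completion signal":  2_000,
    "read the result back":        1_000,  # the implicit "use the result" step
}

cpu_cost = sum(cpu_path.values())   # 9
gpu_cost = sum(gpu_path.values())   # 20,000
```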
(On mesh-only renderers, the target is to avoid using the CPU on the data
at all! Input the mesh data, the lights & so on, compute different
partial images with the GPU (ambient, 1st reflection..., shadows, ...),
then get back either the GPU or the CPU to combine these partial
contributions into a single image.)
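The final combine step might look like this sketch: several per-pixel partial buffers merged into one image. The buffer names and the additive/attenuation combine rule are illustrative assumptions, not any renderer's actual model.

```python
# Hypothetical compositing of partial contributions: ambient and
# reflection are added, shadow is a per-pixel attenuation in [0, 1],
# and the result is clamped to the displayable range.

def combine(ambient, reflection, shadow):
    return [min(1.0, (a + r) * s)
            for a, r, s in zip(ambient, reflection, shadow)]

# Four-pixel toy image:
ambient    = [0.2, 0.2, 0.2, 0.2]
reflection = [0.0, 0.5, 1.0, 0.1]
shadow     = [1.0, 1.0, 0.5, 0.0]
final = combine(ambient, reflection, shadow)   # ~ [0.2, 0.7, 0.6, 0.0]
```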