Chambers <ben### [at] pacificwebguy com> wrote:
> _theCardinal wrote:
> > Ben Chambers <ben### [at] pacificwebguy com> wrote:
> >> No good, they're only single precision. Plus, each shader unit would
> >> need access to the entire scene file, which would be a pain in the a**
> >> to code.
> >>
> >> ...Chambers
> >
> > According to the CUDA programming guide published by NVidia the GPGPU
> > architecture is as competent as any 32 bit processor. More precise
> > computations can be simulated using multiple registers for a computation
> > instead of a single register if my memory serves - so I seriously doubt
> > this is a serious obstacle.
>
> But double precision is actually 64bit. Until recently (I don't
> remember exactly which model), NVidia didn't even do full 32bit FP (that
> is, single precision), but only 24. POV-Ray, for the last dozen years,
> has done 64bit FP (double precision), as the extra accuracy is necessary
> for the types of computations it does.
>
> Sure, you can simulate it in the same way you can use two integer units
> to simulate a fixed point number, but the result is slow. Perhaps if
> Intel surprises everyone, and releases their next graphics chip as a
> double precision FP monster, we'd be able to take advantage of that,
> but the current ATI / NVidia cards aren't up to the task of dealing with
> POV-Ray.
>
> > The major difficulty involved would be preparing the pov-ray source to run
> > efficiently on a SIMD architecture - native code may run out of the box
> > through the provided compiler, but the results would be poor at best
> > without optimization to take advantage of the particular memory hierarchies
> > involved.
>
> Once the 3.7 source is out, it should be much easier, as the major task
> is simply fitting it to a parallel paradigm. Having already done that,
> porting to different parallel architectures should be trivial (relative
> to the original threading support, that is).
>
> --
> ...Ben Chambers
> www.pacificwebguy.com
A few things:

"But double precision is actually 64bit."
To be technical, the number of bits used for a double is implementation
dependent. The requirement is simply that float <= double in precision; it
is up to the compiler to decide how to interpret that. Using double in lieu
of float simply indicates the desire for additional precision, not a
requirement for a particular width (in C and C++). Hence it is impossible in
general to say POV-Ray is using 64 bits. See: The C++ Programming Language
(TCPL) 74-75.
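
For anyone who wants to check what their own compiler does, here is a minimal sketch using std::numeric_limits. The helper names bits_of and double_is_binary64 are my own inventions for illustration; they are not from POV-Ray or TCPL.

```cpp
#include <climits>
#include <limits>

// The relationship the language promises: double carries at least as many
// significand digits as float. Anything beyond that is up to the platform.
static_assert(std::numeric_limits<float>::digits <=
                  std::numeric_limits<double>::digits,
              "double must be at least as precise as float");

// Storage width in bits of type T on this platform (hypothetical helper).
template <typename T>
constexpr int bits_of() {
    return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// True when double is the 64-bit IEEE 754 format with a 53-bit significand
// (hypothetical helper). Nothing in the language forces this to be true.
constexpr bool double_is_binary64() {
    return std::numeric_limits<double>::is_iec559 &&
           bits_of<double>() == 64 &&
           std::numeric_limits<double>::digits == 53;
}
```

I would expect double_is_binary64() to come out true on most desktop compilers, but that is a property of the platform rather than of the source code.
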

Compilers may have more than a few techniques to simulate 64 bit computation
on a 32 bit architecture, but I am not experienced enough in compiler design
to state them with any confidence. It's worth noting that the time lost
in doing 2 ops instead of 1 is easily regained in shifting from 1-2
processors to an array of processors, so this is not a concern provided the
utilization of the array is sufficiently high.
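
One technique I do know of is to carry a value as the sum of two single precision numbers. The sketch below uses Knuth's TwoSum and is only an illustration: FloatPair, two_sum, add and to_double are names I made up, and it assumes float arithmetic is rounded to single precision after every operation (no fast-math style flags, no extended intermediate precision). I am not claiming this is what any CUDA compiler actually emits.

```cpp
#include <cassert>
#include <cmath>

// A value stored as the unevaluated sum hi + lo of two floats.
struct FloatPair {
    float hi;  // leading part
    float lo;  // the part that did not fit into hi
};

// Knuth's TwoSum: hi is the rounded float sum of a and b, lo is the exact
// rounding error, so hi + lo equals a + b. Six float operations instead of one.
inline FloatPair two_sum(float a, float b) {
    assert(std::isfinite(a) && std::isfinite(b));
    float s = a + b;
    float bb = s - a;                      // the part of b that made it into s
    float err = (a - (s - bb)) + (b - bb); // what was rounded away
    return FloatPair{s, err};
}

// Add a plain float to a pair, then renormalise so hi holds the leading part.
inline FloatPair add(FloatPair x, float y) {
    FloatPair s = two_sum(x.hi, y);
    s.lo += x.lo;
    return two_sum(s.hi, s.lo);
}

// Widen to double, only to inspect the result on the host.
inline double to_double(FloatPair x) {
    return static_cast<double>(x.hi) + static_cast<double>(x.lo);
}
```

Adding 1e-8f to 1.0f in plain float arithmetic leaves 1.0f unchanged, whereas the pair keeps the small term in lo. The price is several single precision operations per addition, which is the kind of overhead I mean by "2 ops instead of 1".
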

CUDA is a new beast, designed by NVidia for 8800 or later generation cards.
This means that the vast majority of cards in use today do not support it,
and probably won't for the next 2-3 years. According to the specification I
read, the registers are full 32 bit, not 24 as in earlier cards. For more
detailed information, search for CUDA and browse the documentation provided
along with the SDK.

I would like to clarify that a dual-processor design is not likely to share
much in common with a SIMD (single instruction multiple data) architecture.
In particular, reaching optimal utilization when sets of processors are
required to execute the same instruction stream is not terribly easy. This
requirement doesn't exist at all for a dual-core architecture (and is one
of the reasons it moved into consumer systems). General ray-tracing does
show great potential though, since it's intuitively rare to have an image
where the number of rays to evaluate varies rapidly over the image.
(Refracting through a 'bag of marbles' would be a good counterexample
though - a small shift in direction would wildly vary the number of rays
to compute. I doubt there is any tractable deterministic method to check
whether this is the case, though.)
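
To put a rough number on the utilization point, here is a toy model. It assumes that a group of neighbouring pixels runs in lockstep and that the whole group stays busy until its slowest pixel has finished. The function name lockstep_utilization and the per-pixel ray counts are made up for illustration; real hardware scheduling is more involved than this.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fraction of lane-steps that do useful work when `width` lanes run in
// lockstep. work[i] is a hypothetical number of rays traced for pixel i.
// Each group of `width` consecutive pixels is charged for its slowest lane.
inline double lockstep_utilization(const std::vector<int>& work,
                                   std::size_t width) {
    assert(width > 0);
    long long useful = 0;  // ray evaluations actually needed
    long long issued = 0;  // lane-steps spent, including idle lanes
    for (std::size_t start = 0; start < work.size(); start += width) {
        int slowest = 0;
        for (std::size_t i = start; i < start + width && i < work.size(); ++i) {
            useful += work[i];
            if (work[i] > slowest) slowest = work[i];
        }
        issued += static_cast<long long>(slowest) *
                  static_cast<long long>(width);
    }
    if (issued == 0) return 1.0;  // nothing to do counts as fully utilized
    return static_cast<double>(useful) / static_cast<double>(issued);
}
```

With a uniform workload such as {4, 4, 4, 4} and a width of 4, every lane is busy the whole time. With a 'bag of marbles' style workload such as {1, 40, 1, 1}, most lanes sit idle waiting for the one expensive pixel.
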

Personally I am less concerned with the usefulness of the implementation
than with its experimental value.
Thanks,
Justin