virtualmeet wrote:
> Hi,
> I can confirm that I'm not doing any approximation, and that teaching a
> computer to take advantage of the geometrical shape is much easier than
> it looks...
...
> Suppose now that we have to compute an isosurface f(x,y,z) = cos(x) +
> cos(y) + cos(z).
> If we define g(u) = cos(u), then our isosurface f can be changed to
> this: F(x, y, z, x', y', z') = x' + y' + z', where x' = g(x), y' = g(y)
> and z' = g(z).
>
> Isosurfaces f and F are exactly the same and have the same final shape,
> and there is no approximation in the final shape made by F. However, F
> is much faster to compute than f, because it needs only 2 additions of
> 3 variables, where f is a brute-force calculation of 3 cosines and 2
> additions for the ENTIRE 3D grid.
You ARE making approximations. You're not approximating the underlying
shape, but you are approximating the values of the cos() function.
In fact, this technique is extremely old, and was widely used as
recently as 10 years ago. Since then, computers have become so much
faster that most people are willing to live with the speed hit in order
to get the greater accuracy offered by using native CPU instructions.
Remember, POV-Ray is extremely dependent on accuracy. It has been shown
in the past that even the step down from double to single precision
produces unacceptable artifacts in the result, so I'm sure you can
imagine what a linear interpolation lookup table would do.
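To make that concrete, here's what such a table looks like and how far
off it is. This is a minimal sketch; the table size (1,000 entries over
one period) is an assumption for illustration, not anything POV-Ray does:

// Minimal sketch of a linearly interpolated cos() lookup table.
// The table size is an assumption for illustration only.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

int main()
{
    const int    N      = 1000;
    const double TWO_PI = 6.283185307179586;

    // N+1 entries so interpolation at the top end stays in bounds.
    std::vector<double> table(N + 1);
    for (int i = 0; i <= N; ++i)
        table[i] = std::cos(i * TWO_PI / N);

    // Linear interpolation between adjacent entries.
    auto lut_cos = [&](double x)
    {
        double t = std::fmod(x, TWO_PI) / TWO_PI * N;
        if (t < 0) t += N;
        int    i    = int(t);
        double frac = t - i;
        return table[i] + frac * (table[i + 1] - table[i]);
    };

    // Worst-case error is about (TWO_PI/N)^2 / 8, i.e. roughly 5e-6
    // for N = 1000 -- a far cry from double precision's ~1e-16.
    double max_err = 0.0;
    for (double x = 0.0; x < TWO_PI; x += 1e-4)
        max_err = std::max(max_err, std::fabs(lut_cos(x) - std::cos(x)));
    std::printf("max error: %g\n", max_err);
}

Shrinking that error toward double precision means growing the table,
which is exactly where the cache problem below comes in.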
Also, this kind of method can wreak havoc (and actually cause a speed
penalty) on multicore architectures, where each core has its own L1
cache. First, this data needs to be held in memory (and to be truly
effective, it *MUST* fit within the L1 cache; otherwise it will be
slower than the native function call). That's fine when you have a few
hundred values, even a few thousand. But how many values do you need for
the accuracy to be acceptable? 100,000? 1,000,000,000?
OK, let's assume for the moment that you only need 10,000 values for
this function. Each one is a double precision float, meaning it takes
up 64 bits or 8 bytes, so you're taking up 80,000 bytes with this lookup
table. That's 78k right there, and several CPUs still on the market
have a smaller L1 cache than that.
So, let's be extremely generous, and say that only 1,000 values will be
needed for accuracy. That's still nearly 8k *per* *function*, and how
many functions do you need to store this for? Then of course, you have
the 2D functions: at 1,000 samples per axis that's 1,000,000 entries, so
they still take up *7.6MB* *each*. You'd need a Xeon to have a cache
that large, and even then it's the L3 cache, which is considerably
slower than the L1.
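The arithmetic, for reference (entry counts as assumed above):

// Back-of-the-envelope table sizes for the figures above.
#include <cstdio>

int main()
{
    const double KB = 1024.0, MB = 1024.0 * 1024.0;
    std::printf("10,000 doubles (1D):   %.1f KB\n", 10000 * 8 / KB);         // ~78.1 KB
    std::printf(" 1,000 doubles (1D):   %.1f KB\n", 1000 * 8 / KB);          // ~7.8 KB
    std::printf(" 1,000^2 doubles (2D): %.1f MB\n", 1000.0 * 1000 * 8 / MB); // ~7.6 MB
}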
Of course, you're forgetting that the cache holds more than just your
lookup tables; by filling it up, you're evicting whatever else it would
normally be holding.
Now, please understand me. I'm not saying that you can't get a speed
increase with this method. I'm saying that if you want anything even
close to accurate enough for POV-Ray, your dataset would be too large to
be usable, and you would be better off with the original functions.
...Chambers