POV-Ray: Newsgroups: povray.beta-test: Radiosity Status: Giving Up...: Re: Radiosity Status: Giving Up...

POV-Ray : Newsgroups : povray.beta-test : Radiosity Status: Giving Up... : Re: Radiosity Status: Giving Up...		Server Time 3 Jul 2025 09:22:38 EDT (-0400)
From: andrel
Date: 31 Dec 2008 05:41:04
Message: <495B4CA0.2030302@hotmail.com>
On 31-Dec-08 4:55, clipka wrote:
> andrel <a_l### [at] hotmailcom> wrote:
>> sqrt has at worst N/2 iteration, with N the number of significant bits.
>> That is a naive implementation. I thought there are even faster
>> algorithms. Cheap calculators often have a button for sqrt, implying
>> probably that they have a (naive) hardware implementation. I don't know
>> if it is available also on more complex processors.
> 
> From the "Intel(R) 64 and IA-32 Architectures Software Developer's Manual,
> Volume 1: Basic Architecture":
> 
> ------------------------
> 8.3.5 Basic Arithmetic Instructions
> 
> The following floating-point instructions perform basic arithmetic operations on
> floating-point numbers. Where applicable, these instructions match IEEE Standard
> 754:
> 
> FADD/FADDP Add floating point
> FIADD Add integer to floating point
> FSUB/FSUBP Subtract floating point
> FISUB Subtract integer from floating point
> FSUBR/FSUBRP Reverse subtract floating point
> FISUBR Reverse subtract floating point from integer
> FMUL/FMULP Multiply floating point
> FIMUL Multiply integer by floating point
> FDIV/FDIVP Divide floating point
> FIDIV Divide floating point by integer
> FDIVR/FDIVRP Reverse divide
> FIDIVR Reverse divide integer by floating point
> FABS Absolute value
> FCHS Change sign
> FSQRT Square root                          <--
> FPREM Partial remainder
> FPREM1 IEEE partial remainder
> FRNDINT Round to integral value
> FXTRACT Extract exponent and significand
> 
> [...]
> 
> 8.3.7 Trigonometric Instructions
> 
> The following instructions perform four common trigonometric functions:
> 
> FSIN Sine
> FCOS Cosine
> FSINCOS Sine and cosine
> FPTAN Tangent
> FPATAN Arctangent
> 
> [...]
> 
> 8.3.9 Logarithmic, Exponential, and Scale
> 
> The following instructions provide two different logarithmic functions, an
> exponential function and a scale function:
> 
> FYL2X Logarithm
> FYL2XP1 Logarithm epsilon
> F2XM1 Exponential
> FSCALE Scale
> 
> [...]
> 
> ------------------------

I didn't know even sin and cos are now hardware supported, interesting. 
That will enable you to pinpoint the last time I did some assembler 
programming to within two years. Another source gives the 8087 as the 
first one to implement FSQRT and the 80387 as the first one for the FSIN 
etc. That makes sense.
If we could now also find the average timing of these instructions, we 
might even be able to answer Warp's question on how much faster SQRT is 
compared to e.g. sin.

> 
> Don't expect all these to be "naive hardware implementation" in the same sense
> as, say, an integer addition, shift, bit-wise AND/OR/XOR or whatever. Even a
> floating-point addition is a non-trivial thing. Rather consider the FPU
> (Floating Point Unit) a computer in the computer: You feed some data into it,
> give it an instruction what to do with that data, and then basically wait for
> the (fast and highly optimized) hard-wired FPU "programlets" to complete. Or do
> some other usefull stuff in the meantime.

Microcode may be the word you are looking for.

> To my knowledge, a lot of work is saved by using Look-up-tables (like people
> used in old times before pocket calculators had a "sin", "log" or "sqrt"
> button) to optimize away a few iterations of the algorithms.
> 
> BTW, your typical pocket calculator will do just about the same. AFAIK there is
> no non-iterative algorithm for computing the square root of an arbitrary
> floating point number.

It is an iterative procedure but you need the same hardware as for a 
divide. So when you build something that can do division in hardware 
with some shift registers and a comparator, there is bound to be some 
space in the chip left to add the few gates to do a SQRT also.
You can do a SQRT and a division in pure hardware without microcode.
I wouldn't be surprised if current FPUs do use some clever tricks in 
microcode to speed up the SQRT by using a more sophisticated algorithm. 
But, as I said, I am doing not much programming close to the machine 
anymore these days.
Hmm, I have not even read Dr Dobbs for a few years. Been thinking of 
renewing my subscription for almost as long, never came round to it. 
Let's spend some cash before the year ends.
Post a reply to this message