POV-Ray: Newsgroups: povray.beta-test: Radiosity Status: Giving Up...: Re: Radiosity Status: Giving Up...

From: clipka
Date: 31 Dec 2008 10:20:01
Message: <web.495b8d2bcd9d1e75483cfa400@news.povray.org>

Thorsten Froehlich <tho### [at] trfde> wrote:
> clipka wrote:
> > Don't expect all these to be "naive hardware implementation" in the same sense
> > as, say, an integer addition, shift, bit-wise AND/OR/XOR or whatever.
>
> Exactly that is why you ought to be looking at the SSE2/3 floating-point
> registers and associated hardware support. The x87 FPU is only there for
> legacy support and rather inefficient.

Hah! Say that again...

SSE3, SSSE3 and SSE4 are rather primitive compared to what the x87 FPU can do -
except when it comes to bulk add, subtract, multiply or divide. Which is what
they're designed for: vectors and matrices. That's why they're called Streaming
SIMD (= Single Instruction Multiple Data) Extensions.

Search for trigonometric or logarithmic functions - you'll not find any in the
SSE2 or SSE3 sections. You'll probably find that these still rely on good old
x87 FPU instructions.

> > Even a floating-point addition is a non-trivial thing.
>
> Actually, it is not more complex than integer addition and multiplication.

- Check for NaNs, infinities and other such things
- Shift the smaller number's mantissa right so its exponent matches the larger one's
- Add the mantissae
- Check for mantissa overflow, re-normalizing the number if necessary
- Check for number format overflow
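
To make those steps concrete, here is a rough software sketch of them for IEEE 754 single precision. It is only an illustration, not how any particular FPU does it: it assumes positive, normal inputs, simply drops the bits shifted out during alignment (real hardware rounds), and all names (`unpack`, `pack`, `add_positive`) are made up for this example.

```python
import math
import struct

MAN_BITS = 23          # stored mantissa bits in binary32
EXP_MAX = 255          # exponent field value reserved for inf/NaN

def unpack(x):
    """Split a float32 value into (sign, exponent field, mantissa field)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return bits >> 31, (bits >> MAN_BITS) & EXP_MAX, bits & ((1 << MAN_BITS) - 1)

def pack(sign, exp, man):
    """Reassemble the three fields into a float32 value."""
    bits = (sign << 31) | (exp << MAN_BITS) | man
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def add_positive(a, b):
    # Step 1: check for NaNs and infinities.
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if math.isinf(a) or math.isinf(b):
        return math.inf
    sa, ea, ma = unpack(a)
    sb, eb, mb = unpack(b)
    if sa or sb or ea == 0 or eb == 0:
        raise ValueError("sketch handles positive normal numbers only")
    # Restore the implicit leading 1 of normal numbers.
    ma |= 1 << MAN_BITS
    mb |= 1 << MAN_BITS
    # Step 2: align the smaller operand to the larger exponent.
    if ea < eb:
        ea, eb, ma, mb = eb, ea, mb, ma
    mb >>= ea - eb      # shifted-out bits are dropped (truncation)
    # Step 3: add the mantissas.
    m, e = ma + mb, ea
    # Step 4: mantissa overflow, re-normalize by one position.
    if m >> (MAN_BITS + 1):
        m >>= 1
        e += 1
    # Step 5: number format overflow.
    if e >= EXP_MAX:
        return math.inf
    return pack(0, e, m & ((1 << MAN_BITS) - 1))
```

For example, `add_positive(1.5, 2.25)` goes through the alignment shift and comes out as 3.75, while `add_positive(1.0, 1.0)` triggers the mantissa-overflow branch. Signed operands, subnormals and proper rounding would each add more cases on top of this.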

Doesn't sound as trivial to me as shoving two sets of bits into an array of
properly wired bit adders with carry in- and outputs, and then reading their
output lines.
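
For comparison, the integer case can be sketched as a ripple-carry adder: one full adder per bit, each passing its carry to the next. This is a software model of the wiring for illustration only; the function names and the default 8-bit width are assumptions of the example.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum bit, carry out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out

def ripple_add(x, y, width=8):
    """Add two unsigned integers by chaining `width` full adders.

    Returns (result truncated to `width` bits, final carry out).
    """
    carry = 0
    result = 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry
```

The loop exists only because this is software; in hardware all the adders sit side by side and the result settles once the carry has rippled through, with no special cases to check.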

IMUL is a different thing already. We're talking about something here that I'd
probably not want to implement in pure, non-clocked hardware.
