POV-Ray : Newsgroups : povray.beta-test : Radiosity Status: Giving Up...
  Radiosity Status: Giving Up... (Message 101 to 110 of 194)  
From: clipka
Subject: Re: Radiosity Status: Giving Up...
Date: 31 Dec 2008 20:20:01
Message: <web.495c1a3acd9d1e7530acaf600@news.povray.org>
Thorsten Froehlich <tho### [at] trfde> wrote:
> I have no intention to continue such an argument on semantics, this leads
> nowhere and does not change the fact that x87 FPU usage is deprecated.

... which in turn does not change the fact either that modern CPUs (still) have
a dedicated instruction to compute a square root (and not only that, but also for
trigonometrics and the like), which was a question previously raised, and
answered by me by referring to the x87 FPU instruction set, while at the same
time maintaining that they're not a "naive" hardware implementation.

I still have doubts whether using SSE2 for roots, trigonometrics etc. is really
faster than using the FPU, or whether the compiler really does not use the x87
instructions - at least on 32-bit systems. Note that the deprecation statement
relates to AMD64, not x86 in general as far as I can see.

So again, in what way do your statements invalidate mine?



From: clipka
Subject: Re: Radiosity Status: Giving Up...
Date: 31 Dec 2008 20:30:01
Message: <web.495c1b94cd9d1e7530acaf600@news.povray.org>
Thorsten Froehlich <tho### [at] trfde> wrote:
> In essence x86 is the only architecture where more than sqrt is still
> supported in microcode hardware. Doing this in software is much more
> desirable and efficient.

It may be more efficient in terms of microprocessor die use - a waste of
silicon, so to speak. But while it's there, I doubt that it is inefficient to
use it.

As it is yet another kind of execution unit, using it in *addition* to SSE2 may
even be another notch faster.



From: Warp
Subject: Re: Radiosity Status: Giving Up...
Date: 31 Dec 2008 20:30:47
Message: <495c1cc7@news.povray.org>
Thorsten Froehlich <tho### [at] trfde> wrote:
> Clearly you do not know much about floating-point units in modern processors
> then. You actually want to do it in software because that is more efficient
> (see my other post). x87 is pretty much the last architecture to still have
> microcode ops for more than sqrt.

  You are telling me that calculating trigonometric functions on 64-bit
floating point values in software is faster than using the FPU?

  I'm not sure that would make too much sense. It would mean that they
would *deliberately* make the FPU calculate those functions in a less
efficient way than you could do with the CPU.

-- 
                                                          - Warp



From: Warp
Subject: Re: Radiosity Status: Giving Up...
Date: 31 Dec 2008 20:37:04
Message: <495c1e40@news.povray.org>
clipka <nomail@nomail> wrote:
> Only base-2 logarithms? Hey, big deal :)

> Base-x logarithms (with constant x) are easily done computing the base-2
> logarithm and dividing the result by a constant (namely the base-2 logarithm of
> x).

  In fact, the opcode which calculates the base-2 logarithm is given a
factor. That factor is, rather obviously, the reciprocal of the base-2
logarithm of the real base you want it to calculate. So you get, in fact,
a logarithm in *any* base with one single opcode.
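
  A minimal sketch of the change-of-base idea from this exchange, written in
Python purely for illustration. The x87 FYL2X instruction computes
y * log2(x); the helper scaled_log2 below only models that behaviour and is a
hypothetical name, as is log_base. Note that the factor which yields a
base-b logarithm is 1 / log2(b), a constant that can be precomputed when the
base is fixed.

```python
import math


def scaled_log2(y, x):
    """Model of a 'y * log2(x)' primitive (hypothetical helper, not a real API)."""
    return y * math.log2(x)


def log_base(x, base):
    """Logarithm of x in an arbitrary base via one scaled base-2 logarithm."""
    # The factor depends only on the base, so it is a constant for a fixed base.
    factor = 1.0 / math.log2(base)
    return scaled_log2(factor, x)
```

  For example, log_base(1000.0, 10.0) should come out very close to 3, up to
floating-point rounding.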

  Btw, another advantage of using the FPU rather than calculating in
software is that you could, at least in theory, have the FPU calculating
your operation while the CPU does other (non-FPU) operations at the same
time. I don't know if any compiler is able to optimize like this, though.

  (Of course the same is probably true of the SSE unit as well.)

-- 
                                                          - Warp



From: Warp
Subject: Re: Radiosity Status: Giving Up...
Date: 31 Dec 2008 20:39:34
Message: <495c1ed6@news.povray.org>
Thorsten Froehlich <tho### [at] trfde> wrote:
> >   That's a really odd statement. I didn't know that the FPU or the x87
> > instruction set required OS support. (Can the OS even force programs to
> > not use x87 instructions if it wanted to?)

> Yes, by not saving and restoring the x87 "register" stack when switching 
> threads or making operating system calls. You need OS support for that.

  That would be a rather broken OS.

-- 
                                                          - Warp



From: Warp
Subject: Re: Radiosity Status: Giving Up...
Date: 31 Dec 2008 20:45:08
Message: <495c2024@news.povray.org>
Thorsten Froehlich <tho### [at] trfde> wrote:
> In essence x86 is the only architecture where more than sqrt is still 
> supported in microcode hardware. Doing this in software is much more 
> desirable and efficient.

  So they really are making the FPU deliberately less efficient than
it could be?

  How do you calculate the sine and cosine of a 64-bit floating point
value in 17-137 clock cycles in software?
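
  For readers wondering what "in software" typically looks like: math
libraries commonly reduce the argument to a small interval and then evaluate
a short polynomial. The following is only a sketch of that general technique
in Python, under loud assumptions: it uses plain Taylor coefficients (real
libraries use carefully tuned coefficients and more careful range reduction),
the name sin_cos_sketch is hypothetical, and it says nothing about actual
cycle counts on any CPU.

```python
import math


def sin_cos_sketch(x):
    """Approximate (sin x, cos x) by range reduction plus short polynomials."""
    # Range reduction: write x = k * (pi/2) + r with r roughly in [-pi/4, pi/4].
    k = round(x / (math.pi / 2))
    r = x - k * (math.pi / 2)
    r2 = r * r

    # Taylor polynomials in Horner form (assumption: illustrative coefficients,
    # not the tuned ones a production library would use).
    s = r * (1 + r2 * (-1 / 6 + r2 * (1 / 120 + r2 * (-1 / 5040
        + r2 * (1 / 362880 + r2 * (-1 / 39916800))))))
    c = 1 + r2 * (-1 / 2 + r2 * (1 / 24 + r2 * (-1 / 720
        + r2 * (1 / 40320 + r2 * (-1 / 3628800 + r2 * (1 / 479001600))))))

    # Undo the reduction: rotate by k quarter turns.
    q = k % 4
    if q == 0:
        return s, c
    if q == 1:
        return c, -s
    if q == 2:
        return -s, -c
    return -c, s
```

  Each polynomial is a handful of multiplications and additions with no
data-dependent loop, which is the kind of work that pipelines well. Whether
that ends up faster than the dedicated x87 instructions is exactly the
question under debate here, and this sketch does not settle it.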

-- 
                                                          - Warp



From: Warp
Subject: Re: Radiosity Status: Giving Up...
Date: 31 Dec 2008 20:46:55
Message: <495c208f@news.povray.org>
clipka <nomail@nomail> wrote:
> As it is yet another kind of execution unit, using it in *addition* to SSE2 may
> even be another notch faster.

  At least in the past Intel processors had the strange rule that you cannot
use the FPU and the SSE unit at the same time. I don't know if they have
fixed that limitation later.

-- 
                                                          - Warp



From: clipka
Subject: Re: Radiosity Status: Giving Up...
Date: 31 Dec 2008 21:00:01
Message: <web.495c2262cd9d1e7530acaf600@news.povray.org>
Warp <war### [at] tagpovrayorg> wrote:
>   You are telling me that calculating trigonometric functions on 64-bit
> floating point values in software is faster than using the FPU?
>
>   I'm not sure that would make too much sense. It would mean that they
> would *deliberately* make the FPU calculate those functions in a less
> efficient way than you could do with the CPU.

... or deliberately not spending so much effort on improving these functions,
while the rest of the CPU is constantly overhauled to make it faster.

In the good old days, when CPU manufacturers still published information like
the machine cycles per instruction, it was not too uncommon for a few
instructions to actually take more cycles on newer CPUs.

Actually, the complexity of the x87 FPU may make it harder to optimize than
other parts of the CPU, and there may also be issues regarding the
interoperation with other speedup mechanisms, like jump prediction or
what-have-you.

So while I still doubt whether SSE2 based software can achieve the same speed as
the x87 FPU, it is not *too* far-fetched either. And maybe there's even room for
parallelization of some computations in transcendental functions that I don't see
right now.


(BTW, did you know that during a single clock cycle of your CPU, a light ray
actually travels no more than about 10 cm? And electrons are typically still a
deal slower.)
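
The arithmetic behind that aside can be checked in a few lines. This is a
small sketch in Python; the 3 GHz clock frequency is an assumption chosen for
illustration (the post does not name one), and the function name is
hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre


def light_distance_per_cycle_cm(clock_hz):
    """Distance light travels in vacuum during one clock cycle, in centimetres."""
    return SPEED_OF_LIGHT_M_PER_S / clock_hz * 100.0


# Assumption: a 3 GHz clock, typical for desktop CPUs of that period.
distance_cm = light_distance_per_cycle_cm(3.0e9)
```

At the assumed 3 GHz this gives just under 10 cm per cycle, in line with the
figure quoted above; slower clocks give proportionally longer distances.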



From: clipka
Subject: Re: Radiosity Status: Giving Up...
Date: 31 Dec 2008 21:05:01
Message: <web.495c24a9cd9d1e7530acaf600@news.povray.org>
Warp <war### [at] tagpovrayorg> wrote:
>   In fact, the opcode which calculates the base-2 logarithm is given a
> factor. That factor is, rather obviously, the reciprocal of the base-2
> logarithm of the real base you want it to calculate. So you get, in fact,
> a logarithm in *any* base with one single opcode.

Hm.. I confess I didn't have such a close look at the instructions.

>   Btw, another advantage of using the FPU rather than calculating in
> software is that you could, at least in theory, have the FPU calculating
> your operation while the CPU does other (non-FPU) operations at the same
> time. I don't know if any compiler is able to optimize like this, though.

I guess most compilers do this. The x87 FPU has always been working in parallel
to the CPU, so the compiler architects had quite some years to figure out how
to achieve this.

>   (Of course the same is probably true of the SSE unit as well.)

Yup. But it will require more help from the CPU when doing complex tasks like
computing transcendental stuff, which will place some limitation on what you
can do at the same time.



From: clipka
Subject: Re: Radiosity Status: Giving Up...
Date: 31 Dec 2008 21:25:00
Message: <web.495c288dcd9d1e7530acaf600@news.povray.org>
Warp <war### [at] tagpovrayorg> wrote:
> > Yes, by not saving and restoring the x87 "register" stack when switching
> > threads or making operating system calls. You need OS support for that.
>
>   That would be a rather broken OS.

Not if this was part of the OS specification.

For compiler developers, this would be a major change. For most software, it
would be peanuts, compared to all the other stuff that usually needs to be
changed when porting to a different OS: Re-compile, and you have it. The basic
math stuff like sin, cos etc. is part of the C run-time library anyway, which
is basically part of the compiler.

The fact that new Windows versions typically support software compiled for the
previous version is a good marketing strategy, but not a necessity, and
expecting software compiled today to run on the Windows to come in 10 years
would be naive.




Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.