Thorsten Froehlich <tho### [at] trf de> wrote:
> I have no intention to continue such an argument on semantics, this leads
> nowhere and does not change the fact that x87 FPU usage is deprecated.
So trigonometric and logarithmic functions must be calculated in
software mode now? This is supposed to be a step forward?
--
- Warp
clipka wrote:
> Thorsten Froehlich <tho### [at] trf de> wrote:
>>> You are looking at the wrong manual. This manual does not tell you how
>>> to do something but what is available.
>> ...
>> I guess it would help if M$ would not call x86-64 "AMD64" in older documents...
>
> I'm not surprised that they did, because that's what the original implementation
> was named...
>
>> <http://download.microsoft.com/download/5/b/5/5b5bec17-ea71-4653-9539-204a672f11cf/AMD64_PortApp.doc>
>> <http://developer.amd.com/pages/62720069_4.aspx>
>
> I still don't get the point.
x87 is "bad" because the FPU stack architecture (effectively only having two
registers) makes it next to impossible to schedule instructions in a
compiler in any reasonable way. SSE provides eight directly addressable
registers instead (sixteen in 64-bit mode), reducing register pressure and
hence making floating-point computations more efficient.
That is why AMD/MS defined x86-64 to pass floating-point values in SSE
registers. This change enables all kinds of common optimizations the non-x86
world has enjoyed for decades. Of course, none of this really has anything to
do with x86-64 itself; it just happens that, since an ABI change was needed
anyway, this old issue was fixed while also adding 64-bit integer support.
> There is no mention that SSE2 could do the same transcendental functions without
> some additional piece of software, and I wonder whether that will do good to
> performance.
In essence x86 is the only architecture where more than sqrt is still
supported in microcode hardware. Doing this in software is much more
desirable and efficient.
Thorsten
Warp wrote:
> Thorsten Froehlich <tho### [at] trf de> wrote:
>> Warp wrote:
>>> clipka <nomail@nomail> wrote:
>>>> Duh. Any mention of such things as sin, cos, log or such? I don't see any.
>>> I don't see them either, eg. here:
>>>
>>> http://en.wikipedia.org/wiki/X86_instruction_listings#SIMD_instructions
>
>> Why do you *expect* to see them?
>
> Because you said that, according to Intel, x87 is obsolete and all code
> is recommended to use SSE instead. If that's true, then it would be a
> rather large setback for programs requiring trigonometric and logarithmic
> calculations, as they would have to be made in software.
Clearly you do not know much about floating-point units in modern processors
then. You actually want to do it in software because that is more efficient
(see my other post). x87 is pretty much the last architecture to still have
microcode ops for more than sqrt.
Thorsten
Warp wrote:
> clipka <nomail@nomail> wrote:
>> "x87 arithmetic is deprecated in 64-bit mode" - but not because it would
>> be inferior, but simply because "the extent to which operating systems will
>> continue to support x87 in the future is unknown."
>
> That's a really odd statement. I didn't know that the FPU or the x87
> instruction set required OS support. (Can the OS even force programs to
> not use x87 instructions if it wanted to?)
Yes, by not saving and restoring the x87 "register" stack when switching
threads or making operating system calls. You need OS support for that.
Thorsten
> Besides the basic operations there's sqrt, but no trigonometric nor
> logarithmic functions. x87 has those (although only base-2 logarithms
> are supported directly).
Only base-2 logarithms? Hey, big deal :)
Base-x logarithms (with constant x) are easily done computing the base-2
logarithm and dividing the result by a constant (namely the base-2 logarithm of
x).
But I guess I'm not telling you any news here.
(I was a bit surprised to find that the supported base is 2, not e - but
thinking about it again, it's straightforward of course, and I wouldn't be
surprised if computing log2 were actually a piece of cake on a binary
computer, maybe even easier than a sqrt.)
Warp <war### [at] tag povray org> wrote:
> That's a really odd statement. I didn't know that the FPU or the x87
> instruction set required OS support. (Can the OS even force programs to
> not use x87 instructions if it wanted to? Even if it could, it would have
> to be deliberate, and I can't really understand *why* it would want to
> do that. It would certainly break like 99% of software out there, and for
> what reason?)
Faster task switching, and lower memory requirements for task information.
Currently, a task switch needs to save x87 FPU registers.
Thorsten Froehlich <tho### [at] trf de> wrote:
> I have no intention to continue such an argument on semantics, this leads
> nowhere and does not change the fact that x87 FPU usage is deprecated.
... which in turn does not change the fact either that modern CPUs (still) have
a dedicated command to compute a square root (and not only that, but also for
trigonometrics and the like), which was a question previously raised, and
answered by me by referring to the x87 FPU instruction set, while at the same
time maintaining that they're not a "naive" hardware implementation.
I still have doubts whether using SSE2 roots/trigonometrics etc. is really
faster than using the FPU, or whether the compiler really does not use them -
at least on 32-bit systems. Note that the deprecation statement relates to
AMD64, not x86 in general as far as I can see.
So again, in what way do your statements invalidate mine?
Thorsten Froehlich <tho### [at] trf de> wrote:
> In essence x86 is the only architecture where more than sqrt is still
> supported in microcode hardware. Doing this in software is much more
> desirable and efficient.
It may be more efficient in terms of microprocessor die use - a waste of
silicon, so to speak. But while it's there, I doubt that it is inefficient to
use it.
As it is yet another kind of execution unit, using it in *addition* to SSE2 may
even be another notch faster.
Thorsten Froehlich <tho### [at] trf de> wrote:
> Clearly you do not know much about floating-point units in modern processors
> then. You actually want to do it in software because that is more efficient
> (see my other post). x87 is pretty much the last architecture to still have
> microcode ops for more than sqrt.
You are telling me that calculating trigonometric functions on 64-bit
floating point values in software is faster than using the FPU?
I'm not sure that would make too much sense. It would mean that they
would *deliberately* make the FPU calculate those functions in a less
efficient way than you could do with the CPU.
--
- Warp
clipka <nomail@nomail> wrote:
> Only base-2 logarithms? Hey, big deal :)
> Base-x logarithms (with constant x) are easily done computing the base-2
> logarithm and dividing the result by a constant (namely the base-2 logarithm of
> x).
In fact, the opcode which calculates the base-2 logarithm (FYL2X) is given a
factor and computes factor * log2(x). That factor is the reciprocal of the
base-2 logarithm of the real base you want it to calculate. So you get, in
fact, a logarithm in *any* base with one single opcode.
Btw, another advantage of using the FPU rather than calculating in
software is that you could, at least in theory, have the FPU calculating
your operation while the CPU does other (non-FPU) operations at the same
time. I don't know if any compiler is able to optimize like this, though.
(Of course the same is probably true of the SSE unit as well.)
--
- Warp