> They should be compiled with all the same options, except for SSE2
> optimizations in order for the measurement to be reliable.
How about that one? http://pov4grasp.free.fr/articles/fastpov1/
(Gee, that's already 4 years old... Yet I never found the time to actually
write down the second part that heavily deals with profile-guided optimization
*sigh*)
- NC
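A minimal sketch of the kind of comparison being asked for, assuming GCC on 32-bit x86: the same source is built twice with identical options except for the floating-point unit, and the two binaries are timed. The workload, the function names and the file names are placeholders, not anything taken from POV-Ray or from the article linked above.

```c
#include <time.h>

/*
 * Build twice, changing only the FP code generation (GCC options):
 *   gcc -O2 -m32 -mfpmath=387        bench.c -o bench_x87
 *   gcc -O2 -m32 -msse2 -mfpmath=sse bench.c -o bench_sse2
 * bench.c, bench_x87 and bench_sse2 are hypothetical names.
 */

/* Hypothetical floating-point workload: sums 1/i^2 for i = 1..n. */
double fp_workload(long n)
{
    double acc = 0.0;
    for (long i = 1; i <= n; ++i) {
        double d = (double)i;
        acc += 1.0 / (d * d);
    }
    return acc;
}

/* Returns the CPU time in seconds spent on `repeats` runs of the workload. */
double time_workload(long n, int repeats)
{
    volatile double sink = 0.0;  /* keeps the results live so the loop is not removed */
    clock_t start = clock();
    for (int r = 0; r < repeats; ++r)
        sink += fp_workload(n + r);  /* vary n so the call cannot be hoisted */
    clock_t stop = clock();
    (void)sink;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling `time_workload` with the same arguments from both binaries gives two numbers that differ only by the FP code generation, which is the point of keeping every other option the same.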
Thorsten Froehlich wrote:
> You are looking at the wrong manual. This manual does not tell you how
> to do something but what is available. I admit the Intel documentation
> is not clear, but x87 usage is deprecated. This is documented for x86-64
> mode in OS vendor information because the x86-64 ABIs even use the SSE
> registers for argument passing (no more x87 FPU stack or memory mapped
> argument passing). But googling for that information is difficult. One
> of the top-most useful links I found was
> <http://msdn.microsoft.com/en-us/library/bb147385.aspx#ID0EBEAA> - you
> will have to look up the remaining information yourself. I guess AMD
> might have more info, as they came up with x86-64.....
I guess it would help if M$ would not call x86-64 "AMD64" in older documents...
<http://download.microsoft.com/download/5/b/5/5b5bec17-ea71-4653-9539-204a672f11cf/AMD64_PortApp.doc>
<http://developer.amd.com/pages/62720069_4.aspx>
Similar ABI changes exist on Linux and Mac OS X. Thus, it is SSE all the way
everywhere.
Thorsten
clipka wrote:
> "nemesis" <nam### [at] gmail com> wrote:
>> Damn! Isn't it exciting to see this much talk about actual povray code and
>> improvement rather than just read Orchid's blog posts all day? No
>> offense, Andrew! :D
>
> Who is Orchid? Have I missed something...?
Have a quick look at povray.off-topic group. (but don't stay in there for
too long, it's bad for your sanity; get back to radiosity! :D)
Nicolas Calimet <pov### [at] free fr> wrote:
> > They should be compiled with all the same options, except for SSE2
> > optimizations in order for the measurement to be reliable.
> How about that one? http://pov4grasp.free.fr/articles/fastpov1/
On a related note, it would be cool if it were updated to include results for
the latest gcc 4.x line, which AFAIK has many additional optimization tricks
up its sleeve.
--
- Warp
Thorsten Froehlich <tho### [at] trf de> wrote:
> Please check out the current Intel information what SSE2 is actually
> supposed to be used for. I am not talking about vectorized code or
> auto-vectorization.
So - what IS it actually supposed to be used for?
Looking at Intel's "[...] Basic Architecture" manual, I find that SSE2 is useful for:
- SIMD (for processing bulk data)
- loading data without involvement of the cache (again for bulk data)
That's about it, basically.
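As an illustration of the SIMD use case listed above, here is a small sketch using the SSE2 intrinsics from `<emmintrin.h>`. The function names `add_arrays` and `add_scalar` are made up for this example. It also shows that SSE2 has scalar double-precision instructions, which is the part relevant to the "SSE vs. x87" side of this thread. When `__SSE2__` is not defined, the code falls back to plain C.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Packed use: adds two arrays of doubles, two elements per instruction. */
void add_arrays(const double *a, const double *b, double *out, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);   /* unaligned load of 2 doubles */
        __m128d vb = _mm_loadu_pd(b + i);
        _mm_storeu_pd(out + i, _mm_add_pd(va, vb));
    }
#endif
    for (; i < n; ++i)                      /* leftover element, or non-SSE2 build */
        out[i] = a[i] + b[i];
}

/* Scalar use: a single double addition done in an SSE register. */
double add_scalar(double x, double y)
{
#if defined(__SSE2__)
    return _mm_cvtsd_f64(_mm_add_sd(_mm_set_sd(x), _mm_set_sd(y)));
#else
    return x + y;
#endif
}
```

In practice nobody writes `add_scalar` by hand; a compiler targeting SSE2 math would be expected to emit the scalar instruction for an ordinary `x + y`.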
So? What's the news I have missed you expect me to find in "the current Intel
information what SSE2 is actually supposed to be used for" (whatever document
that is supposed to be)?
The same document also mentions that SSE2 supports basically the same arithmetic
operations as MMX, about which the doc has to say: "The arithmetic instructions
perform addition, subtraction, multiplication, and
multiply/add operations on packed data types"
Duh. Any mention of such things as sin, cos, log or such? I don't see any.
So: Back to good old x87 FPU instructions for these.
Thorsten Froehlich <tho### [at] trf de> wrote:
> Warp, could you stop theorizing and actually *use* the information already
> out there, supplied by Intel and plenty of other sources? SSE != SIMD
Have you checked lately what "SSE" stands for? ;)
--
- Warp
Thorsten Froehlich <tho### [at] trf de> wrote:
> > You are looking at the wrong manual. This manual does not tell you how
> > to do something but what is available.
> ...
> I guess it would help if M$ would not call x86-64 "AMD64" in older documents...
I'm not surprised that they did, because that's what the original implementation
was named...
> <http://download.microsoft.com/download/5/b/5/5b5bec17-ea71-4653-9539-204a672f11cf/AMD64_PortApp.doc>
> <http://developer.amd.com/pages/62720069_4.aspx>
I still don't get the point. The first document is about AMD64; it does mention
SSE2, but not to any great extent. Again, vectorization is the most prominent
keyword here.
The second document mentions that "The x87 FPU can perform numerous advanced
arithmetic operations on values stored in the x87 registers, such as
trigonometric and logarithmic functions, with a single instruction.", and also
that "x87 arithmetic is deprecated in 64-bit mode", not because it would
be inferior, but simply because "the extent to which operating systems will
continue to support x87 in the future is unknown."
There is no mention that SSE2 could do the same transcendental functions without
some additional piece of software, and I wonder whether that will do performance
any good.
Nicolas Alvarez <nic### [at] gmail com> wrote:
> Have a quick look at povray.off-topic group. (but don't stay in there for
> too long, it's bad for your sanity; get back to radiosity! :D)
The deeper I get into the code and all its quirks, the more I wonder whether
radiosity is any better for my sanity :)
clipka <nomail@nomail> wrote:
> Duh. Any mention of such things as sin, cos, log or such? I don't see any.
I don't see them either, e.g. here:
http://en.wikipedia.org/wiki/X86_instruction_listings#SIMD_instructions
Besides the basic operations there's sqrt, but no trigonometric or
logarithmic functions. x87 has those (although only base-2 logarithms
are supported directly).
--
- Warp
clipka wrote:
> So? What's the news I have missed you expect me to find in "the current Intel
> information what SSE2 is actually supposed to be used for" (whatever document
> that is supposed to be)?
The only current information is in the compiler manuals that mention the ABI
changes in 64-bit mode. The other relevant information is unfortunately a
mess of documents from compiler and OS vendors, AMD, and the "Intel 64"
developer web page.
> The same document also mentions that SSE2 supports basically the same arithmetic
The document you are referring to specifies the whole instruction set
available in all Intel x86 processors. If you look at it closely, you will
even find sections about 286 compatibility and 16 bit mode. - Just because
all instructions are in that document does not mean you should use them all...
> Duh. Any mention of such things as sin, cos, log or such? I don't see any.
>
> So: Back to good old x87 FPU instructions for these.
No, you do them in software with SSE2, that is much faster. Or actually, you
don't care because the compiler does the right thing anyway.
Thorsten
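To make "you do them in software" concrete, here is a rough sketch of a sine built only from additions, multiplications and one division, i.e. operations a compiler can map onto scalar SSE2 instructions. It is an illustration only: `poly_sin` is a made-up name, the Taylor polynomial and the naive range reduction are assumptions chosen for brevity, and a real math library would use carefully tuned polynomials and a far more robust reduction for large arguments.

```c
static const double PI = 3.14159265358979323846;

/* Approximates sin(x) for moderate |x| without any x87 transcendental instruction. */
double poly_sin(double x)
{
    /* Range reduction: write x = k*pi + r with r in about [-pi/2, pi/2]. */
    double q = x / PI;
    long k = (long)(q + (q >= 0.0 ? 0.5 : -0.5));  /* round to nearest integer */
    double r = x - (double)k * PI;
    double r2 = r * r;

    /* Taylor series of sin(r) through the r^13 term, in Horner form. */
    double p = r * (1.0 + r2 * (-1.0 / 6.0
                 + r2 * (1.0 / 120.0
                 + r2 * (-1.0 / 5040.0
                 + r2 * (1.0 / 362880.0
                 + r2 * (-1.0 / 39916800.0
                 + r2 * (1.0 / 6227020800.0)))))));

    /* sin(k*pi + r) = (-1)^k * sin(r) */
    return (k % 2 != 0) ? -p : p;
}
```

On the reduced interval the first omitted term, r^15/15!, is below 1e-9, so the result is good to roughly nine digits for small inputs. Whether this kind of code beats the x87 instructions on a given CPU is exactly what would have to be measured.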