The SSE2 optimizations found in the Windows port of POVRay don't seem to
show up in the Linux port. Why is this or am I missing something?
Thanks,
Augustus
On Wed, 4 Sep 2002 10:58:45 EDT, Augustus <nomail@nomail> wrote:
> The SSE2 optimizations found in the Windows port of POVRay don't seem to
> show up in the Linux port. Why is this or am I missing something?
Did the compiler used to build the Windows binary generate SSE2 code, or
does the Windows version include hand-coded asm optimizations?
At least with ICC 6, only one out of 10000000 executed instructions
was an MMX one when I tested it some time ago (icc -tpp6 -xM).
Unfortunately I can't test the SSE2 stuff.
--
Safari - y7p### [at] sneakemailcomgovinvalid
"Talk is cheap. Show me the code." - Linus Torvalds
Augustus wrote:
> The SSE2 optimizations found in the Windows port of POVRay don't seem to
> show up in the Linux port. Why is this or am I missing something?
>
So far only the Pentium 4 supports SSE2. I can prepare a Linux binary that
uses SSE2 instructions, but so far I have found nobody with a P4 to test it.
If you have such a beast and would like to do the testing, I will make a
binary for it.
- Micha
--
http://objects.povworld.org - the POV-Ray Objects Collection
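For anyone unsure whether a machine can run such a binary: on Linux the kernel lists CPU features on the "flags" line of /proc/cpuinfo, and SSE2 appears there as sse2. The following is a minimal C sketch under that assumption. It is not part of POV-Ray, and the names has_cpu_flag and cpu_has_sse2_linux are made up for illustration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: true if `flag` occurs as a whole word in `flags`,
 * so that "sse" does not match inside "sse2". */
bool has_cpu_flag(const char *flags, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = flags;

    if (len == 0)
        return false;

    while ((p = strstr(p, flag)) != NULL) {
        bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t' || p[-1] == ':';
        bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\t' || p[len] == '\n';
        if (starts && ends)
            return true;
        p++; /* keep searching after a partial match */
    }
    return false;
}

/* Hypothetical helper: scans /proc/cpuinfo for a "flags" line that
 * contains sse2. Returns false if the file cannot be opened, for
 * example on a non-Linux system. Lines longer than the buffer are
 * read in pieces, so only the first piece of such a line is checked. */
bool cpu_has_sse2_linux(void)
{
    char line[8192];
    bool found = false;
    FILE *f = fopen("/proc/cpuinfo", "r");

    if (f == NULL)
        return false;

    while (!found && fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "flags", 5) == 0)
            found = has_cpu_flag(line, "sse2");
    }
    fclose(f);
    return found;
}
```

A quick manual equivalent is to look for sse2 in the output of `cat /proc/cpuinfo`.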
Safari wrote:
>
> Did the compiler used to build the Windows binary generate SSE2 code, or
> does the Windows version include hand-coded asm optimizations?
I just heard that Intel has provided some P4 optimisation of the noise
function for the Windows build.
>
> At least with ICC 6, only one out of 10000000 executed instructions
> was an MMX one when I tested it some time ago (icc -tpp6 -xM).
>
Yes, the problem is that MMX works on packed integers only, not on floating
point, so it is of little use here. SSE would allow operating on a vector of
four 32-bit floats simultaneously, so it could be used for the colour stuff
only. Now most colour operations are on 3 components only, so icc will not
vectorize them. Additionally, you have to use explicit loops to make icc
use it.
> Unfortunately I can't test the SSE2 stuff.
>
SSE2 allows operations on vectors of two 64-bit floats. This could be helpful
for all the vector operations. But again it needs loops, e.g. in vector.h,
to make icc use it. Unfortunately I can't test it either.
- Micha
--
http://objects.povworld.org - the POV-Ray Objects Collection
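To illustrate the point about loops: an auto-vectorizer generally looks for a counted loop over array elements, whereas three hand-written component statements give it no loop to work on. The sketch below contrasts the two shapes in C. It is not the actual vector.h code; vec_add_unrolled and vec_add_loop are hypothetical names, and whether a given compiler emits SSE2 for the loop depends on its version and flags.

```c
#include <stddef.h>

/* Unrolled shape: one scalar statement per component.
 * There is no loop here for a vectorizer to transform. */
void vec_add_unrolled(double out[3], const double a[3], const double b[3])
{
    out[0] = a[0] + b[0];
    out[1] = a[1] + b[1];
    out[2] = a[2] + b[2];
}

/* Loop shape: the same arithmetic written as a counted loop over
 * array elements, which is the form a vectorizer can recognise. */
void vec_add_loop(double *out, const double *a, const double *b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}
```

Both functions compute the same result. Note that an SSE2 register holds two doubles, so for a 3-component vector even a vectorized loop leaves one element to be handled as a scalar, which limits the possible gain.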
Let's go for it. I'm working on a review on LinuxHardware.org of a dual
Xeon system vs. a dual Athlon MP system and would love to see the advantage
of SSE2 instructions. If you can get me the binary, or tell me how to
compile it myself (or both), that would be great. You can reach me directly at
aug### [at] linuxhardwareorg.
Thanks
Micha Riser wrote:
>Augustus wrote:
>
>> The SSE2 optimizations found in the Windows port of POVRay don't seem to
>> show up in the Linux port. Why is this or am I missing something?
>>
>
>So far only the Pentium 4 supports SSE2. I can prepare a Linux binary that
>uses SSE2 instructions, but so far I have found nobody with a P4 to test it.
>If you have such a beast and would like to do the testing, I will make a
>binary for it.
>
>- Micha
>
>http://objects.povworld.org - the POV-Ray Objects Collection
>
In article <web.3d765724e7aba3891c1896010@news.povray.org> , "Augustus"
<nomail@nomail> wrote:
> I'm working on a review on LinuxHardware.org of a dual
> Xeon system vs. a dual Athlon MP system
You are aware that POV-Ray 3.5 is single-threaded, right?
Thorsten
____________________________________________________
Thorsten Froehlich, Duisburg, Germany
e-mail: tho### [at] trfde
Visit POV-Ray on the web: http://mac.povray.org
>
> So far only the Pentium 4 supports SSE2. I can prepare a Linux binary that
> uses SSE2 instructions, but so far I have found nobody with a P4 to test it.
I can test it.
P4 1.8GHz
> - Micha
--
________
_/ __/ __/ Support wild life - vote for an orgy.
\__ \__ \_______________________________________________________________
Yeah I'm aware. It's still a good benchmark for the CPU itself though.
Augustus
Thorsten Froehlich wrote:
>In article <web.3d765724e7aba3891c1896010[at]news.povray.org> , "Augustus"
><nomail[at]nomail> wrote:
>
>> I'm working on a review on LinuxHardware.org of a dual
>> Xeon system vs. a dual Athlon MP system
>
>You are aware that POV-Ray 3.5 is single-threaded, right?
>
> Thorsten
>
>
>____________________________________________________
>Thorsten Froehlich, Duisburg, Germany
>e-mail: tho### [at] trfde
>
>Visit POV-Ray on the web: http://mac.povray.org
>
Sławomir Szczyrba wrote:
> I can test it.
>
OK, binaries are available from http://www.povworld.org/povray/binaries.html
Happy testing :) (Hope they do not all crash)
- Micha
--
http://objects.povworld.org - the POV-Ray Objects Collection
>
> OK, binaries are available from http://www.povworld.org/povray/binaries.html
> Happy testing :) (Hope they do not all crash)
>
They work well :)
Here are some results. Very interesting, IMHO :)
"outpost.pov" is my own scene from my last IRTC entry.
The rest are randomly chosen scenes from the distribution.
Red Hat Linux release 7.2 (Enigma) Kernel 2.4.18 (i686)
Intel(R) Pentium(R) 4 CPU 1.80GHz 512kB cache (3604.48 bogomips)
512MB RAM(DDR)
Render times in seconds:
File                  Vanilla  P4.nosse2  P4.sse2  P4.noiseopt
benchmark.pov            8440       3447     3799         4034
outpost.pov              1005        388      420          496
micro.pov                 494        215      395          394
crystal.pov                14          5        7            6
caustic2.pov                4          1        2            1
chesmsh.pov                 4          2        2            2
chess2.pov                275        134      151          151
crystal.pov                14          6        6            6
interior_texture.pov        7          3        3            3
media1.pov                 21         14       12           12
media2.pov                101         56       52           57
media3.pov                139         98      108          107
media4.pov                 14         16       19           18
media5.pov                 11         11       13           12
noise_generator.pov         9          3        3            4
norm_acc.pov               14          5        6            6
radiosity2.pov             46         25       31           28
radiosity3.pov             36         20       24           24
radiosity.pov              45         20       24           23
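To put the timings above on a common scale, they can be turned into ratios. The following is a small C sketch for that arithmetic; the function names speedup and percent_slower are made up for illustration and are not part of any benchmark tool.

```c
/* Speed-up of a variant relative to a baseline build.
 * Values above 1.0 mean the variant rendered faster.
 * Returns 0.0 if the variant time is not positive. */
double speedup(double baseline_seconds, double variant_seconds)
{
    if (variant_seconds <= 0.0)
        return 0.0;
    return baseline_seconds / variant_seconds;
}

/* Change in render time from `before` to `after`, in percent.
 * Positive values mean `after` took longer.
 * Returns 0.0 if the `before` time is not positive. */
double percent_slower(double before_seconds, double after_seconds)
{
    if (before_seconds <= 0.0)
        return 0.0;
    return (after_seconds - before_seconds) / before_seconds * 100.0;
}
```

For benchmark.pov, speedup(8440, 3447) is about 2.45 for P4.nosse2 and speedup(8440, 3799) is about 2.22 for P4.sse2, and percent_slower(3447, 3799) is about 10, so on this machine the SSE2 build comes out roughly 10% slower than the non-SSE2 P4 build. The scenes that finish in a few seconds are timed too coarsely for such ratios to mean much.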
> - Micha
--
\__ \__ \_______________________________________________________________