Out of curiosity, I wanted to see exactly how the different optimization
flags in gcc affect the speed of povray (3.6.1).
Because I didn't want to wait for 30-60 minutes for the standard benchmark,
I just used scenes/advanced/abyss.pov instead.
Here are the results:
System: Pentium 4 3.4GHz, 1GB RAM, SUSE Linux 9.3, gcc 3.3.5
Rendering options: abyss.pov -w800 -h300 +a +am2 -f -p -x +d
Compiler optimization options (compilation time, stripped binary size):
* No optimization (55 secs, 1 645 644):
Total Time: 0 hours 11 minutes 40 seconds (700 seconds)
* -Os (1 min 8 secs, 1 368 780):
Total Time: 0 hours 7 minutes 31 seconds (451 seconds)
* -O1 (1 min 2 secs, 1 436 716):
Total Time: 0 hours 7 minutes 8 seconds (428 seconds)
* -O2 (1 min 14 secs, 1 482 892):
Total Time: 0 hours 6 minutes 48 seconds (408 seconds)
* -O3 (1 min 22 secs, 1 665 612):
Total Time: 0 hours 6 minutes 43 seconds (403 seconds)
* -O3 -march=pentium4 (1 min 24 secs, 1 613 452):
Total Time: 0 hours 6 minutes 2 seconds (362 seconds)
* -O3 -march=pentium4 -ffast-math (1 min 23 secs, 1 577 068):
Total Time: 0 hours 5 minutes 24 seconds (324 seconds)
* -O3 -march=pentium4 -ffast-math -malign-double (1 min 22 secs, 1 576 780):
Total Time: 0 hours 5 minutes 25 seconds (325 seconds)
* -O3 -march=pentium4 -ffast-math -mfpmath=sse -msse2
(1 min 26 secs, 1 661 452):
Total Time: 0 hours 5 minutes 13 seconds (313 seconds)
* -O3 -march=pentium4 -ffast-math -mfpmath=sse -msse2 -minline-all-stringops
(1 min 23 secs, 1 662 284):
Total Time: 0 hours 5 minutes 18 seconds (318 seconds)
So the winner, on a Pentium 4, seems to be:
-O3 -march=pentium4 -ffast-math -mfpmath=sse -msse2
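
To make the comparison easier to read, the render times listed above can be turned into speedup factors relative to the unoptimized build. This is only a small sketch: the numbers are copied from the results in this post, while the names `render_seconds`, `speedups` and `fastest` are made up for illustration.

```python
# Render times in seconds, copied from the results listed in this post.
render_seconds = {
    "(no optimization)": 700,
    "-Os": 451,
    "-O1": 428,
    "-O2": 408,
    "-O3": 403,
    "-O3 -march=pentium4": 362,
    "-O3 -march=pentium4 -ffast-math": 324,
    "-O3 -march=pentium4 -ffast-math -malign-double": 325,
    "-O3 -march=pentium4 -ffast-math -mfpmath=sse -msse2": 313,
    "-O3 -march=pentium4 -ffast-math -mfpmath=sse -msse2 "
    "-minline-all-stringops": 318,
}


def speedups(times, baseline="(no optimization)"):
    """Return baseline_time / time for every flag set (higher is faster)."""
    base = times[baseline]
    return {flags: base / secs for flags, secs in times.items()}


def fastest(times):
    """Return the flag set with the shortest render time."""
    return min(times, key=times.get)


# Print the flag sets from slowest to fastest with their speedup factor.
for flags, factor in sorted(speedups(render_seconds).items(), key=lambda kv: kv[1]):
    print(f"{factor:5.2f}x  {render_seconds[flags]:4d} s  {flags}")
print("fastest:", fastest(render_seconds))
```

Each figure comes from a single run, so gaps of a second or so (324 vs 325 seconds, for instance) are probably too small to read much into.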
I noticed that the configure script of unix-pov3.6.1 did not add
the "-ffast-math" option to the Makefiles. That is worth noting, because
in this test the option cut the rendering time from 362 to 324 seconds.
Note also how -minline-all-stringops (which the configure script adds)
actually *slows* down the rendering a tiny bit (318 vs 313 seconds).
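
For anyone who wants to repeat this on their own machine, here is a rough sketch of a timing helper. It is not what I used for the numbers above (those are the "Total Time" lines povray prints). The rendering options are the ones from this post; the helper names `time_command` and `benchmark` and the binary path are assumptions for illustration only.

```python
import subprocess
import sys
import time

# Rendering options copied from this post.
RENDER_ARGS = ["abyss.pov", "-w800", "-h300", "+a", "+am2", "-f", "-p", "-x", "+d"]


def time_command(argv, runs=1):
    """Run argv `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command fails.
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def benchmark(binary, runs=3):
    """Time the abyss.pov render with the given povray binary.

    `binary` is a path to a build made with one set of flags (hypothetical,
    e.g. "./povray-O3"). Returns the best (shortest) of `runs` timings.
    """
    return min(time_command([binary] + RENDER_ARGS, runs=runs))
```

Calling something like `benchmark("./povray-O3")` once per build, from the directory containing abyss.pov, would give one number per flag set. Taking the best of several runs should help separate real differences from noise when two builds are only a few seconds apart.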
--
- Warp