  Requesting ideas/opinions for RNG seeding syntax (Message 61 to 70 of 106)  
From: "Jérôme M. Berger"
Subject: Re: Requesting ideas/opinions for RNG seeding syntax
Date: 24 May 2009 04:20:47
Message: <4a19035f$1@news.povray.org>
Warp wrote:
> "Jérôme M. Berger" <jeb### [at] freefr> wrote:
>>         By simple curiosity, I've also added Warp's implementation of Isaac
>> and here is the result (see attached code):
> 
>>  > g++ -O3 -lm -o random random.cc IsaacRand.cc
> 
>   I suggest adding -march=native in order for gcc to fully optimize it for
> your platform. It might make a big difference, especially with this type
> of code.
	It doesn't change anything here. Remember that the gcc shipped with 
most 32-bit Linux distros is configured to generate code for the 
original Pentium! In that case, forcing the compiler to optimize for 
your machine is usually a big win. On 64-bit platforms, however, 
there is not that much difference between the first-generation 
Athlon64 and the current generation.

>   (Also -lm is probably not needed. At least I don't need it here.)
> 
	I originally left it out, but I added it because it was needed 
for the C version. Now that the code is compiled as C++, you're 
right, it's not needed any more.

>>  > ./random
>> Empty loop:          0ms
>> Kiss64 (int):     6623ms
>> Kiss64 (dbl):     5003ms
>> Alvo (floor):    21539ms
>> Alvo (cast):     14608ms
>> Alvo (tmp+cast): 14664ms
>> Isaac:           10540ms
> 
>   I tried running the program on my Pentium4 (with g++ -O3 -march=native)
> and got the following results:
> 
> Empty loop:          0ms
> Kiss64 (int):     2669ms
> Kiss64 (dbl):     2643ms
> Alvo (floor):     2098ms
> Alvo (cast):      1767ms
> Alvo (tmp+cast):  1766ms
> Isaac:             690ms
> 
>   The speed seems to be pretty dependent on the CPU type being used...
> 
	So it seems...

>   Out of curiosity, I also tried adding -mfpmath=sse to the compiler options,
> and the results changed a bit:
> 
> Empty loop:          0ms
> Kiss64 (int):     2687ms
> Kiss64 (dbl):     2640ms
> Alvo (floor):     2424ms
> Alvo (cast):      1294ms
> Alvo (tmp+cast):  1335ms
> Isaac:             685ms
> 


-- 
mailto:jeb### [at] freefr
http://jeberger.free.fr
Jabber: jeb### [at] jabberfr




Attachments:
Download 'us-ascii' (1 KB)

From: "Jérôme M. Berger"
Subject: Re: Requesting ideas/opinions for RNG seeding syntax
Date: 24 May 2009 05:01:38
Message: <4a190cf2$1@news.povray.org>
Warp wrote:
> "Jérôme M. Berger" <jeb### [at] freefr> wrote:
>> So here are the timings with no optimization (-O0):
> 
>   I think comparing speed of unoptimized code is rather absurd and
> useless. :)
> 
	In a way, that's true. However, it's not as useless as you imply. 
For example, in hindsight it is now obvious to me that the compiler 
removed the floating point division from Kiss64 (dbl), which 
explains why it was as fast as the integer version. So I've attached 
a new version of the benchmark that's specifically designed to foil 
this kind of optimization, which wouldn't happen in real life. And 
here are the timings (with -O3):

Empty loop:       3387ms	(was     0ms)
Kiss64 (int):     7348ms	(was  6623ms)
Kiss64 (dbl):    16045ms	(was  5003ms)
Alvo (floor):    22209ms	(was 21539ms)
Alvo (cast):     16258ms	(was 14608ms)
Alvo (tmp+cast): 15054ms	(was 14664ms)
Isaac:           10701ms	(was 10540ms)

	Notice how all tests now take longer, although Isaac barely 
changes (about 1.5%). I believe that the reason why Isaac is hardly 
affected is that it calls a function that's in another file, which 
already prevented gcc from optimizing as aggressively.

	This just goes to show that designing benchmarks is not as easy as 
it appears...
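
	To illustrate the kind of trap described above, here is a minimal sketch of a timing loop whose result the compiler cannot simply throw away. It is not the attached benchmark: the XorShift64 generator and the names sum_of_doubles and timed_run_ms are hypothetical stand-ins chosen for the example. The idea is that every generated value feeds a running sum, and the sum is stored in a volatile variable, so the integer-to-double conversion has an observable effect.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-in generator (Marsaglia-style 64-bit xorshift);
// it is NOT the Kiss64, Alvo or Isaac code from the attachment.
struct XorShift64 {
    std::uint64_t state;  // must be non-zero
    std::uint64_t next() {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        return state;
    }
};

// Draws `iterations` values, converts each to a double in [0, 1) and adds it
// to a running sum. Because the sum is returned, the conversion feeds a value
// the caller can observe, so it is not dead code.
double sum_of_doubles(std::uint64_t seed, std::uint64_t iterations) {
    XorShift64 rng{seed};
    double sum = 0.0;
    for (std::uint64_t i = 0; i < iterations; ++i) {
        // Keep the top 53 bits and scale by 2^-53.
        sum += static_cast<double>(rng.next() >> 11) / 9007199254740992.0;
    }
    return sum;
}

// Times one run in milliseconds and stores the result in a volatile sink,
// which forces the compiler to actually produce it.
long long timed_run_ms(std::uint64_t seed, std::uint64_t iterations) {
    const auto start = std::chrono::steady_clock::now();
    volatile double sink = sum_of_doubles(seed, iterations);
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

	Calling timed_run_ms(1, 100000000) from main and printing the result gives one line comparable to the tables in this thread. A volatile sink is only one way to keep the work alive; putting the generator in a separate translation unit, as happens with Isaac here, has a similar effect.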

		Jerome
-- 
mailto:jeb### [at] freefr
http://jeberger.free.fr
Jabber: jeb### [at] jabberfr




Attachments:
Download 'us-ascii' (3 KB)

From: "Jérôme M. Berger"
Subject: Re: Requesting ideas/opinions for RNG seeding syntax
Date: 24 May 2009 05:10:51
Message: <4a190f1b$1@news.povray.org>
Jérôme M. Berger wrote:
> Empty loop:       3387ms    (was     0ms)
> Kiss64 (int):     7348ms    (was  6623ms)
> Kiss64 (dbl):    16045ms    (was  5003ms)
> Alvo (floor):    22209ms    (was 21539ms)
> Alvo (cast):     16258ms    (was 14608ms)
> Alvo (tmp+cast): 15054ms    (was 14664ms)
> Isaac:           10701ms    (was 10540ms)
> 

	I just ran the tests on another computer with a Core2 duo and the 
results are rather interesting:

	With the same binary as above:
Empty loop:       2173ms
Kiss64 (int):     7541ms
Kiss64 (dbl):    15844ms
Alvo (floor):    24778ms
Alvo (cast):      8469ms
Alvo (tmp+cast):  8473ms
Isaac:            7810ms

	Notice how the last three are much faster?

	Recompiled with g++ 4.4 and -O3:
Empty loop:       1154ms
Kiss64 (int):     5944ms
Kiss64 (dbl):    12590ms
Alvo (floor):    22433ms
Alvo (cast):      7636ms
Alvo (tmp+cast):  8076ms
Isaac:            6428ms

	Slight improvement (roughly 5-20%) all around, and the empty loop 
runs in about half the time.

	Recompiled with g++ 4.4 and -O3 -ftree-vectorize:
Empty loop:       1085ms
Kiss64 (int):     5504ms
Kiss64 (dbl):    11663ms
Alvo (floor):    22039ms
Alvo (cast):      7730ms
Alvo (tmp+cast):  7730ms
Isaac:            5918ms

	Again, slight improvement all around except for Alvo. I'm not sure 
why Kiss64 gets improved though...

	And here is the system info:
 > uname -a
Linux rover 2.6.29-ARCH #1 SMP PREEMPT Sat May 9 14:09:36 CEST 2009 
x86_64 Intel(R) Core(TM)2 Duo CPU T5670 @ 1.80GHz GenuineIntel GNU/Linux

  > gcc --version
gcc (GCC) 4.4.0
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

		Jerome
-- 
mailto:jeb### [at] freefr
http://jeberger.free.fr
Jabber: jeb### [at] jabberfr




Attachments:
Download 'us-ascii' (1 KB)

From: Warp
Subject: Re: Requesting ideas/opinions for RNG seeding syntax
Date: 24 May 2009 06:08:11
Message: <4a191c8b@news.povray.org>

>         I just ran the tests on another computer with a Core2 duo and the 
> results are rather interesting:

>         With the same binary as above:
> Empty loop:       2173ms
> Kiss64 (int):     7541ms
> Kiss64 (dbl):    15844ms
> Alvo (floor):    24778ms
> Alvo (cast):      8469ms
> Alvo (tmp+cast):  8473ms
> Isaac:            7810ms

  One thing I can't understand: Why are the results *significantly* faster
on my Pentium4 (e.g. the Isaac RNG seems to be 10 times faster, which is
a HUGE difference in speed), even though a Core2 duo should be faster
than a Pentium4? Something does not compute here.

-- 
                                                          - Warp



From: clipka
Subject: Re: Requesting ideas/opinions for RNG seeding syntax
Date: 24 May 2009 08:30:00
Message: <web.4a193cee38187d7e819d05910@news.povray.org>
Warp <war### [at] tagpovrayorg> wrote:
> http://warp.povusers.org/IsaacRand.zip

Just out of curiosity: Is that the same RNG that lives in MCPov?

I think I read something about you having supplied some RNG know-how or even
code to the montecarlo renderer.



From: Warp
Subject: Re: Requesting ideas/opinions for RNG seeding syntax
Date: 24 May 2009 08:32:00
Message: <4a193e40@news.povray.org>
clipka <nomail@nomail> wrote:
> Warp <war### [at] tagpovrayorg> wrote:
> > http://warp.povusers.org/IsaacRand.zip

> Just out of curiosity: Is that the same RNG that lives in MCPov?

> I think I read something about you having supplied some RNG know-how or even
> code to the montecarlo renderer.

  Yes, I think the author used it when I suggested that he could use
something of higher quality than std::rand().

-- 
                                                          - Warp



From: clipka
Subject: Re: Requesting ideas/opinions for RNG seeding syntax
Date: 24 May 2009 09:00:00
Message: <web.4a1944b138187d7e819d05910@news.povray.org>
Warp <war### [at] tagpovrayorg> wrote:

> >         I just ran the tests on another computer with a Core2 duo and the
> > results are rather interesting:
>
> >         With the same binary as above:
> > Empty loop:       2173ms
> > Kiss64 (int):     7541ms
> > Kiss64 (dbl):    15844ms
> > Alvo (floor):    24778ms
> > Alvo (cast):      8469ms
> > Alvo (tmp+cast):  8473ms
> > Isaac:            7810ms
>
>   One thing I can't understand: Why are the results *significantly* faster
> on my Pentium4 (e.g. the Isaac RNG seems to be 10 times faster, which is
> a HUGE difference in speed), even though a Core2 duo should be faster
> than a Pentium4? Something does not compute here.

What's the processor clock of your P4?
What's the typical processor clock of a modern multi-core CPU?

My Windows machine is an old, darn slow 3.4 GHz P4.
My Linux machine is a new, speedy 2.3 GHz AMD Phenom X4.

The modern Multi-Core CPUs win mainly due to...

- optimizations regarding caches, pipelining etc.
- providing multiple cores to run programs on

With a PRNG test suite, however, none of these bring any advantage. There are no
difficult-to-predict branch instructions; virtually no additional data to fetch
into the caches; no parallel threads to distribute the workload; all the CPU
has to do is to braindeadly execute one single thread of pure arithmetic
instructions.

I'm not too surprised that in this case the bottleneck is sheer GHz power -
the only thing that hasn't been improved at all since the P4 (to the contrary!).



From: clipka
Subject: Re: Requesting ideas/opinions for RNG seeding syntax
Date: 24 May 2009 09:10:00
Message: <web.4a19469838187d7e819d05910@news.povray.org>
Warp <war### [at] tagpovrayorg> wrote:
>   Yes, I think the author used it when I suggested that he could use
> something of higher quality than std::rand().

So one can say that it has been well-proven by now in a POV-Ray context (albeit
not as an SDL element).



From: Warp
Subject: Re: Requesting ideas/opinions for RNG seeding syntax
Date: 24 May 2009 09:46:05
Message: <4a194f9d@news.povray.org>
clipka <nomail@nomail> wrote:
> What's the processor clock of your P4?

  3.4 GHz.

> What's the typical processor clock of a modern multi-core CPU?

  Certainly not 340 MHz.

> The modern Multi-Core CPUs win mainly due to...

> - optimizations regarding caches, pipelining etc.
> - providing multiple cores to run programs on

> With a PRNG test suite, however, none of these bring any advantage. There are no
> difficult-to-predict branch instructions; virtually no additional data to fetch
> into the caches; no parallel threads to distribute the workload; all the CPU
> has to do is to braindeadly execute one single thread of pure arithmetic
> instructions.

  That doesn't really explain an entire order of magnitude of difference in
speed. This is especially so because the Pentium4 line is notoriously crappy
at things like bit shifting (because they removed the fast barrel shifter from
the P4 line for whatever reason, and reintroduced it later in the Core line,
AFAIK).

  If a Core2 Duo runs simple integer operations at 1/10th the speed of
my Pentium4, I really don't want a Core2 Duo.

  However, I still suspect there's something else going on here.

> I'm not too much surprised that in this case the bottleneck is sheer GHz power -
> the only thing that hasn't been improved at all since the P4 (to the contrary!).

  I would find it rather disappointing if processors had gone 10 years
backwards in integer speed (and in fact floating point speed as well,
because the floating point tests were also significantly faster in my
computer).

-- 
                                                          - Warp



From: "Jérôme M. Berger"
Subject: Re: Requesting ideas/opinions for RNG seeding syntax
Date: 24 May 2009 10:55:53
Message: <4a195ff9$1@news.povray.org>
Warp wrote:
> clipka <nomail@nomail> wrote:
>> What's the processor clock of your P4?
> 
>   3.4 GHz.
> 
	The Core2 I used for these tests runs at 1.8GHz. So we already have 
a factor of two here.
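
	As a rough sanity check on that factor of two, the sketch below divides the Isaac timings quoted in this thread (690ms on the Pentium4, 7810ms on the Core2) and compares the result with the clock ratio (3.4GHz against 1.8GHz). The function names are illustrative only, and the two timings may well come from different versions of the benchmark source, so the comparison is indicative at best.

```cpp
#include <iostream>

// How many times slower `slow_ms` is than `fast_ms`.
double slowdown(double slow_ms, double fast_ms) {
    return slow_ms / fast_ms;
}

// Part of a measured slowdown that the clock ratio alone does not account for.
double unexplained_factor(double measured_slowdown, double fast_ghz, double slow_ghz) {
    return measured_slowdown / (fast_ghz / slow_ghz);
}

// Prints the comparison for the Isaac timings and clock speeds quoted in the thread.
void print_isaac_comparison() {
    const double measured = slowdown(7810.0, 690.0);  // Core2 time / Pentium4 time
    const double residual = unexplained_factor(measured, 3.4, 1.8);
    std::cout << "measured slowdown: " << measured << "x\n"
              << "clock ratio:       " << 3.4 / 1.8 << "x\n"
              << "left unexplained:  " << residual << "x\n";
}
```

	Under those assumptions the clock difference accounts for a bit less than 2x of the roughly 11x gap, which leaves about 6x to be explained by something else, such as a difference in what the compiler optimized away.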

	What version of gcc are you using? If it's more recent than 4.2.3, 
it might be optimizing some computations away. You should retry with 
the latest source code I posted (at 11am French time), which is the 
one I used on the Core2. I'll try the old code on the Core2 later 
this evening when it's cooler (currently at 27°C and rising, I don't 
want to start my laptop in these conditions...)

		Jerome
-- 
mailto:jeb### [at] freefr
http://jeberger.free.fr
Jabber: jeb### [at] jabberfr




Attachments:
Download 'us-ascii' (1 KB)


Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.