Warp wrote:
> Fast enough, in most cases. I think memory and I/O speed will be the
> bottleneck at some point, after which the extra cores will be worth
> about as much as a paperweight.
>
> (It's surprising how much effect memory bus speed has, e.g. on video
> capturing.)
And this is the other Fun Thing. Given enough CPU cores, you will
eventually reach a point where the memory subsystem can't actually keep
up. The result is that the more cores you have, the more time they can
waste sitting idle.
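To put rough, illustrative numbers on it: four cores each demanding,
say, 8 bytes per cycle at 3 GHz works out to 96 GB/s, while a
dual-channel DDR2-800 bus tops out around 12.8 GB/s. The cores can ask
for nearly an order of magnitude more data than the memory can deliver.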
(As somebody once said, "a supercomputer is a device for turning
compute-bound problems into I/O-bound problems". It seems apt here.)
Long ago, when RAM was actually faster than the CPU, having several
CPUs seemed like a good idea. These days the CPU goes way, way faster
than RAM anyway. Having "more CPU" just makes the problem worse...
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
Orchid XP v8 <voi### [at] dev null> wrote:
> And this is the other Fun Thing. Given enough CPU cores, you will
> eventually reach a point where the memory subsystem can't actually keep
> up. The result is that the more cores you have, the more time they can
> waste sitting idle.
In theory you could still get an advantage if each core has its own
L1 cache and they perform heavy calculations on bunches of data which
fit those caches (and likewise the routine itself must obviously also
fit in the L1 cache).
In other words, if the algorithm can be constructed to be of the type
"crunch 4 kB of data for several seconds, write the results to RAM and
read new 4 kB of data, repeat", then additional cores will give an
advantage.
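A minimal sketch of that pattern in C (the block size, pass count and
thread count here are illustrative, not tuned for any particular chip):

    /* Each worker grabs one 4 kB block at a time, crunches it
       repeatedly (so after the first pass it's working entirely out
       of its own L1 cache), and only then lets the results drift
       back to RAM. Compile with -pthread (C11 for stdatomic). */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stddef.h>

    #define BLOCK   4096        /* bytes per chunk: sized to fit in L1 */
    #define NBLOCKS 1024
    #define NCORES  4

    static unsigned char data[NBLOCKS][BLOCK];
    static atomic_size_t next_block;

    static void *worker(void *arg)
    {
        (void)arg;
        for (;;) {
            size_t b = atomic_fetch_add(&next_block, 1);
            if (b >= NBLOCKS)
                break;
            for (int pass = 0; pass < 10000; pass++)  /* "several seconds" */
                for (size_t i = 0; i < BLOCK; i++)
                    data[b][i] = (unsigned char)(data[b][i] * 31u + 7u);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t[NCORES];
        for (int i = 0; i < NCORES; i++)
            pthread_create(&t[i], NULL, worker, NULL);
        for (int i = 0; i < NCORES; i++)
            pthread_join(t[i], NULL);
        return 0;
    }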
Of course many algorithms deal with a lot more data at a time than will
nicely fit in L1 cache, especially if the cache is shared among the cores,
so efficiency problems will start happening.
(Ironically, with shared L1 cache systems you might end up in situations
where a single-threaded version of the algorithm actually runs faster than
a multithreaded version, or where the multithreaded one doesn't run any
faster than the single-threaded one.)
--
- Warp
Warp wrote:
> The main reason is that the GPU sets the limit, not the CPU. If the CPU
> gets fast enough, it will just sit idle while the GPU renders a frame.
> Adding more cores is not going to help that.
That's only part of the story.
The other part is "AI", physics simulation and stuff like that. If
you've maxed out the GPU and have no other way to add visual incentives
to buy, you can start focusing on making the gameplay more complex.
If the gaming industry can't find any other way to keep additional CPU
cores busy, they'll go on smartening the AIs, adding more physics
effects, getting rid of level transitions, and what have you.
And ultimately they might even start inventing innovative game
concepts ;-)
> The main reason is that the GPU sets the limit, not the CPU. If the CPU
> gets fast enough, it will just sit idle while the GPU renders a frame.
> Adding more cores is not going to help that.
More realistic physics and AI!
>> But, for reasons unknown, desktop motherboards never support multiple
>> CPUs...
>
> I dunno. I go down to Fry's Electronics and they have a bunch. They're
> all hundreds of dollars more than the single-CPU boards...
Yeah, I imagine it's pretty expensive to wire up several hundred extra
tracks on the board. SLI boards are all way more expensive too. But at
least shops *sell* those...
Darren New wrote:
> My boss bought a quad-core Mac with SSDs. A few weeks later I asked how
> it worked out. He said "I never wait for anything." :-)
And that's the difference. With Windoze, just closing the CD drive is
enough to lock the entire Explorer shell for ten minutes while it
attempts to determine whether there's a disk in there. (Um,
multitasking? Anyone?)
> With several SATA drives, it's nice to be able to copy at 80MBps between
> two different pairs of drives at once. Way nicer than lame-ass IDE.
My PC at home is all SATA too. It still takes forever for TF2 to start
up. :-P
> And my net here is nicely peppy. At 12Mbps, I'm almost always maxing out
> someone else's connection before my own. I can suck down an entire CD
> worth of data in about 15 minutes, faster than driving into work to pick
> it up.
Where in the name of God can you get 12 Mbit/sec? I thought 8 was the
maximum that ADSL supports...
>> And this is the other Fun Thing. Given enough CPU cores, you will
>> eventually reach a point where the memory subsystem can't actually keep
>> up. The result is that the more cores you have, the more time they can
>> waste sitting idle.
>
> In theory you could still get an advantage if each core has its own
> L1 cache and they perform heavy calculations on bunches of data which
> fit those caches (and likewise the routine itself must obviously also
> fit in the L1 cache).
>
> In other words, if the algorithm can be constructed to be of the type
> "crunch 4 kB of data for several seconds, write the results to RAM and
> read new 4 kB of data, repeat", then additional cores will give an
> advantage.
>
> Of course many algorithms deal with a lot more data at a time than will
> nicely fit in L1 cache, especially if the cache is shared among the cores,
> so efficiency problems will start happening.
Typically the L1 cache is per-core, and the L2 cache is shared. But the
point still stands: if two cores try to write to the same region of
memory, they tend to constantly trip over each other with cache
coherency issues. Oh, and if your algorithm needs random access to a
large block of RAM, forget it.
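To make that failure mode concrete, here's an illustrative C sketch of
false sharing (the layouts and iteration count are made up; time each
pair to see the effect):

    /* Two threads bump adjacent counters. If the counters share one
       64-byte cache line, every write invalidates the other core's
       copy and the line ping-pongs between cores; padding them onto
       separate lines removes the contention. Compile with -pthread. */
    #include <pthread.h>
    #include <stdint.h>

    #define ITERS 100000000L

    static struct { volatile uint64_t a, b; } same_line;
    static struct { volatile uint64_t a; char pad[56];
                    volatile uint64_t b; } separate_lines;

    static void *bump(void *counter)
    {
        volatile uint64_t *c = counter;
        for (long i = 0; i < ITERS; i++)
            (*c)++;
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;

        /* Contended pair: a and b almost certainly share a line. */
        pthread_create(&t1, NULL, bump, (void *)&same_line.a);
        pthread_create(&t2, NULL, bump, (void *)&same_line.b);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);

        /* Padded pair: same work, typically several times faster. */
        pthread_create(&t1, NULL, bump, (void *)&separate_lines.a);
        pthread_create(&t2, NULL, bump, (void *)&separate_lines.b);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }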
> (Ironically, with shared L1 cache systems you might end up in situations
> where a single-threaded version of the algorithm actually runs faster than
> a multithreaded version, or where the multithreaded one doesn't run any
> faster than the single-threaded one.)
This is quite common. I've been reading documentation about Haskell's GC
engine. They found that when running in parallel, it's sometimes faster
to turn *off* load-balancing, because that way each GC thread processes
data which already happens to be in its own cache. If you migrate the
work to another core, the data has to be pumped out of one cache, into
memory, and into the other cache before it can be processed.
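The same effect shows up in any parallel setup. A generic sketch of the
"keep the work where its data is warm" idea (this has nothing to do
with GHC's actual implementation; pthread_setaffinity_np is a Linux
extension, and the sizes are made up):

    /* Each worker is pinned to one core and only ever touches its own
       slice of the data, so the slice stays in that core's cache
       instead of migrating through RAM to another core. */
    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>

    #define NWORKERS 4
    #define SLICE    (1 << 15)   /* 32 K doubles = 256 kB per worker */

    static double heap_part[NWORKERS][SLICE];

    static void *scan_own_slice(void *arg)
    {
        long id = (long)arg;
        double sum = 0.0;
        /* Repeated passes over the same slice: cheap once this core
           has pulled it into its own cache. */
        for (int pass = 0; pass < 1000; pass++)
            for (long i = 0; i < SLICE; i++)
                sum += heap_part[id][i];
        heap_part[id][0] = sum;  /* keep the result so the loop isn't elided */
        return NULL;
    }

    int main(void)
    {
        pthread_t t[NWORKERS];
        for (long i = 0; i < NWORKERS; i++) {
            pthread_create(&t[i], NULL, scan_own_slice, (void *)i);
            cpu_set_t set;
            CPU_ZERO(&set);
            CPU_SET((int)i, &set);   /* pin worker i to core i; pinning
                                        just after creation is fine for
                                        a sketch */
            pthread_setaffinity_np(t[i], sizeof(set), &set);
        }
        for (int i = 0; i < NWORKERS; i++)
            pthread_join(t[i], NULL);
        return 0;
    }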
If CPUs didn't need caches in the first place (i.e., RAM was faster than
the CPU) then this would be a total non-issue. But here in the real
world, it's sometimes faster to leave cores idling rather than risk
upsetting the Almighty Cache. How sad...
> If CPUs didn't need caches in the first place (i.e., RAM was faster than
> the CPU) then this would be a total non-issue. But here in the real world,
> it's sometimes faster to leave cores idling rather than risk upsetting the
> Almighty Cache. How sad...
Yes, if we upset the Almighty Cache, *shock*, we might drop back to the
performance levels of the fastest RAM available. The cache is there to
*speed up* stuff. I have no idea why you'd want a machine with a CPU
running at the same speed as the fastest RAM available; you'd then get
the same level of performance as if you upset the cache continuously!
scott wrote:
>> If CPUs didn't need caches in the first place (i.e., RAM was faster
>> than the CPU) then this would be a total non-issue. But here in the
>> real world, it's sometimes faster to leave cores idling rather than
>> risk upsetting the Almighty Cache. How sad...
>
> Yes, if we upset the Almight Cache, *shock* we might drop back to the
> performance levels of the fastest RAM available.
I meant that it's sad that we don't have RAM that can perform as fast as
the CPU itself.
[Or rather... I guess we do, since that must be what they make the L1
cache out of. But the L1 cache is tiny, so...]
> I meant that it's sad that we don't have RAM that can perform as fast as
> the CPU itself.
>
> [Or rather... I guess we do, since that must be what they make the L1
> cache out of. But the L1 cache is tiny, so...]
And, most importantly, it is very close to the CPU.
It's a cost/benefit thing: for $X, how do you make the fastest computer?
The answer is to have a big slab of slow RAM and progressively smaller
bits of faster RAM. Trying to do it any other way will not make the
fastest machine for a given amount of money.
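For a sense of scale, ballpark access latencies (they vary a lot from
chip to chip, but the shape is always the same):

    L1 cache:  ~1 ns  (a few cycles)
    L2 cache:  ~3-5 ns
    main RAM:  ~60-100 ns

Each level trades speed for size, and the fast stuff is expensive per
byte, so the pyramid really is the cost-optimal answer.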