POV-Ray : Newsgroups : povray.off-topic : Idle dreams : Re: Idle dreams
  Re: Idle dreams  
From: Invisible
Date: 28 Aug 2009 04:29:28
Message: <4a979568$1@news.povray.org>
>> And this is the other Fun Thing. Given enough CPU cores, you will 
>> eventually reach a point where the memory subsystem can't actually keep 
>> up. The result is that the more cores you have, the more time they can 
>> waste sitting idle.
> 
>   In theory you could still get an advantage if each core has its own
> L1 cache and they perform heavy calculations on bunches of data which
> fit those caches (and likewise the routine itself must obviously also
> fit in the L1 cache).
> 
>   In other words, if the algorithm can be constructed to be of the type
> "crunch 4 kB of data for several seconds, write the results to RAM and
> read new 4 kB of data, repeat", then additional cores will give an
> advantage.
> 
>   Of course many algorithms deal with a lot more data at a time than will
> nicely fit in L1 cache, especially if the cache is shared among the cores,
> so efficiency problems will start happening.

Typically the L1 cache is per-core, and the L2 cache is shared. But the 
point still stands: if two cores try to write to the same region of 
memory, they tend to constantly trip over each other with cache 
coherency issues. Oh, and if your algorithm needs random access to a 
large block of RAM, forget it.

>   (Ironically, with shared L1 cache systems you might end up in situations
> where a single-threaded version of the algorithm actually runs faster than
> a multithreaded version, or where the multithreaded one doesn't run any
> faster than the single-threaded one.)

This is quite common. I've been reading documentation about Haskell's GC 
engine. They found that when running in parallel, it's sometimes faster 
to turn *off* load-balancing, because then each GC thread processes data 
that already happens to be in its own cache. If you migrate the work to 
another core instead, the data has to be pumped out of one cache, 
through memory, and into the other cache before it can be processed.

If CPUs didn't need caches in the first place (i.e., if RAM could keep 
pace with the CPU) then this would be a total non-issue. But here in the 
real world, it's sometimes faster to leave cores idling rather than risk 
upsetting the Almighty Cache. How sad...



Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.