>> And this is the other Fun Thing. Given enough CPU cores, you will
>> eventually reach a point where the memory subsystem can't actually keep
>> up. The result is that the more cores you have, the more time they can
>> waste sitting idle.
>
> In theory you could still get an advantage if each core has its own
> L1 cache and they perform heavy calculations on bunches of data which
> fit those caches (and likewise the routine itself must obviously also
> fit in the L1 cache).
>
> In other words, if the algorithm can be constructed to be of the type
> "crunch 4 kB of data for several seconds, write the results to RAM and
> read new 4 kB of data, repeat", then additional cores will give an
> advantage.
>
> Of course many algorithms deal with a lot more data at a time than will
> nicely fit in L1 cache, especially if the cache is shared among the cores,
> so efficiency problems will start happening.
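That "crunch a few kB, write it back, fetch the next few kB" structure
is basically loop blocking. A rough sketch of the shape (the 4 kB block
size and the work function are just placeholders, nothing tuned):

#include <algorithm>
#include <cstddef>
#include <vector>

constexpr std::size_t BLOCK = 4096 / sizeof(double);  // ~4 kB of doubles

double expensive_update(double x) {
    // stand-in for "crunch this block for a while"
    for (int i = 0; i < 1000; ++i) x = x * 1.0000001 + 0.5;
    return x;
}

void process(std::vector<double>& data) {
    for (std::size_t base = 0; base < data.size(); base += BLOCK) {
        std::size_t end = std::min(base + BLOCK, data.size());
        // everything in [base, end) stays hot in L1 for the whole
        // inner pass; a parallel version would hand each block to a
        // different core
        for (std::size_t i = base; i < end; ++i)
            data[i] = expensive_update(data[i]);
    }
}

As long as the per-block work dominates, each core mostly stays out of
the memory subsystem's way.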
Typically the L1 cache is per-core, and the L2 cache is shared. But the
point still stands: if two cores try to write to the same region of
memory, they tend to constantly trip over each other with cache
coherency issues. Oh, and if your algorithm needs random access to a
large block of RAM, forget it.
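Here's a rough illustration of that write-contention problem (the loop
counts and the 64-byte line size are guesses, nothing measured): two
counters that happen to share a cache line ping-pong between the cores,
while padding them onto separate lines makes the contention go away.

#include <atomic>
#include <cstdio>
#include <thread>

struct Shared {
    std::atomic<long> a{0};   // a and b share a cache line -> contention
    std::atomic<long> b{0};
};

struct Padded {
    alignas(64) std::atomic<long> a{0};   // forced onto its own line
    alignas(64) std::atomic<long> b{0};
};

template <typename T>
void hammer(T& s) {
    std::thread t1([&] { for (long i = 0; i < 10000000; ++i) s.a++; });
    std::thread t2([&] { for (long i = 0; i < 10000000; ++i) s.b++; });
    t1.join();
    t2.join();
}

int main() {
    Shared s;
    Padded p;
    hammer(s);   // typically much slower: the line bounces between cores
    hammer(p);   // typically faster: each core keeps its own line
    std::printf("%ld %ld %ld %ld\n", s.a.load(), s.b.load(),
                p.a.load(), p.b.load());
}

Both versions compute exactly the same thing; only the memory layout
differs, and that alone is usually enough to change the timing quite
noticeably.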
> (Ironically, with shared L1 cache systems you might end up in situations
> where a single-threaded version of the algorithm actually runs faster than
> a multithreaded version, or where the multithreaded one doesn't run any
> faster than the single-threaded one.)
This is quite common. I've been reading the documentation for Haskell's
GC engine. They found that when running in parallel, it's sometimes
faster to turn *off* load-balancing, because that way each GC thread is
processing data which already happens to be in its own cache. If you
migrate the work to another core, the data has to be pumped out of one
cache, into memory, and into the other cache before it can be
processed.
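A toy version of the same effect (not the GHC code, obviously; the
sizes and the amount of per-element work are made up): two threads warm
up their own halves of an array, then either revisit their own half or
swap halves for a second pass. Swapping is the "migrate the work to
another core" case.

#include <chrono>
#include <cstddef>
#include <cstdio>
#include <thread>
#include <vector>

// One pass over a half-open range; just enough arithmetic that the
// memory traffic matters.
void touch(std::vector<double>& v, std::size_t lo, std::size_t hi) {
    for (std::size_t i = lo; i < hi; ++i)
        v[i] = v[i] * 1.0001 + 1.0;
}

// First pass: each thread warms its own half.  Second pass: either
// revisit the same half ("affine", the data may still be in that
// core's cache) or swap halves ("migrated", the data has to travel
// between caches first).  std::thread doesn't pin threads to cores,
// so on a real machine you'd also set CPU affinity to make the
// effect reliable; that part is omitted here.
double run(std::vector<double>& v, bool migrate) {
    std::size_t mid = v.size() / 2;
    auto t0 = std::chrono::steady_clock::now();
    std::thread a([&] { touch(v, 0, mid); });
    std::thread b([&] { touch(v, mid, v.size()); });
    a.join(); b.join();
    std::thread c([&] { if (migrate) touch(v, mid, v.size()); else touch(v, 0, mid); });
    std::thread d([&] { if (migrate) touch(v, 0, mid); else touch(v, mid, v.size()); });
    c.join(); d.join();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

int main() {
    // a guess: small enough that each half can plausibly sit in a
    // per-core cache, big enough to notice; tune for your machine
    std::vector<double> v(1 << 16, 1.0);
    std::printf("affine:   %.3f ms\n", run(v, false));
    std::printf("migrated: %.3f ms\n", run(v, true));
}

Without pinning the threads the difference won't always show up, but
it's the same shape of problem the GC people ran into.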
If CPUs didn't need caches in the first place (i.e., if RAM were faster
than the CPU), then this would be a total non-issue. But here in the
real world, it's sometimes faster to leave cores idling rather than
risk upsetting the Almighty Cache. How sad...