I was watching this video and the part of Amdahl's Law got me thinking. It felt counterintuitive but it explains many benchmarks I've seen of multicore programs. In this blogpost I'll very briefly explain what it is and why it's implications are important (not to mention; very unfortunate) when you are attempting parallel programming.

The Maths

Not all algorithms can be written in parallel form. Algorithms often need some amount of time to sync between parts. One might be able to design a parallel map step, but you'll have idle CPUs during the reduce step. It turns out that even if your program cannot use all CPU's for a small amount of time that this has huge effects on resource utility.

The maths turns out to be relatively easy to derive. If you consider that $p$ percent of the time we're only effectively using 1 core and $(1-p)$ percent of the time $k$ cores can be used effectively then the speedup is defined by the following function;

$$ f(p, k) = \frac{\text{speedup 1 core}}{\text{speedup } k \text{ cores}} = \frac{1}{\frac{p}{1} + \frac{(1-p)}{k}} $$

The Intuition

The maths doesn't make everything intuitive though so to make it more intuitive though, let's chart this and play with the numbers.

  • a probability of of needing to sync [blue line] has an effectiveness of about /16, about /32 and about /64.
  • a probability of of needing to sync [red line] has an effectiveness of about /16, about /32 and about /64.

The first time I saw this it seemed counter intuitive; but the results are fairly drastic. Even if you have a 95% parallel program then you can still expect to have very poor resource utilisation. Cloud CPUs tend to scale linearly so it can be very hard to escape this problem by throwing more money at it. The effectiveness decreases asymptotically.

The Lesson

You'll probably have noticed the phenomenon before. When programming multicore programs the effectiveness of more CPUs becomes marginal as you add more and more of them. Instead of hoping for a speedup, it might be better to consinder costs instead. If you're willing to wait, it is much more effective to pay for a smaller machine.