The more CPUs you add, the worse it gets.
I was watching this video and the part about Amdahl’s Law got me thinking. It felt counterintuitive, but it explains many benchmarks I’ve seen of multicore programs. In this blogpost I’ll very briefly explain what it is and why its implications are important (not to mention very unfortunate) when you are attempting parallel programming.
Not all algorithms can be written in parallel form. Algorithms often need some amount of time to sync between parts. One might be able to design a parallel map step, but you’ll have idle CPUs during the reduce step. It turns out that even if your program is unable to use all CPUs for only a small fraction of the time, this has a huge effect on resource utilisation.
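To make the map/reduce shape concrete, here is a minimal Python sketch: the map step fans out over all available cores, but the reduce step runs on a single core while the others sit idle. The work functions are made up purely for illustration.

```python
# Parallel map, serial reduce: the pattern described above.
from multiprocessing import Pool


def slow_square(x: int) -> int:
    # Stand-in for a chunk of CPU-bound work that parallelises cleanly.
    return sum(i * i for i in range(x))


def reduce_serial(parts: list) -> int:
    # Stand-in for the sync point: one core combines all partial results.
    return sum(parts)


if __name__ == "__main__":
    with Pool() as pool:                              # uses all available cores
        parts = pool.map(slow_square, range(10_000))  # parallel map step
    total = reduce_serial(parts)                      # serial reduce step, other CPUs idle
    print(total)
```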
The maths turns out to be relatively easy to derive. If you consider that a fraction \(p\) of the time only 1 core is used effectively, and the remaining \((1-p)\) of the time all \(k\) cores can be used effectively, then the speedup is given by the following function:
\[ f(p, k) = \frac{\text{time on 1 core}}{\text{time on } k \text{ cores}} = \frac{1}{p + \frac{1-p}{k}} \]
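To get a feel for the numbers, here is the formula as a small Python function together with a couple of example values; the 5% serial fraction is just an illustration.

```python
def speedup(p: float, k: int) -> float:
    """Amdahl's law: p is the serial fraction, k the number of cores."""
    return 1.0 / (p + (1.0 - p) / k)


# A program that is only 5% serial still falls far short of linear scaling.
print(speedup(0.05, 8))    # ~5.9x  on  8 cores
print(speedup(0.05, 64))   # ~15.4x on 64 cores
print(1 / 0.05)            # upper bound as k grows: 20x, no matter how many cores
```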
The maths alone doesn’t make the consequences intuitive, so let’s chart this and play with the numbers.
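If you want to reproduce a chart like this yourself, a small matplotlib sketch along these lines will do; the particular values of \(p\) are my own illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt

k = np.arange(1, 129)                        # number of cores
for p in (0.01, 0.05, 0.10, 0.25):           # serial fraction, illustrative values
    plt.plot(k, 1.0 / (p + (1.0 - p) / k), label=f"p = {p:.0%} serial")
plt.plot(k, k, "k--", label="linear speedup")  # the ideal we never reach
plt.xlabel("number of cores")
plt.ylabel("speedup")
plt.legend()
plt.show()
```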
The first time I saw this it seemed counterintuitive, but the results are fairly drastic. Even if your program is 95% parallel, you can still expect very poor resource utilisation. Cloud prices tend to scale linearly with the number of CPUs, so it can be very hard to escape this problem by throwing more money at it: the speedup approaches an asymptote of \(1/p\), and each extra CPU contributes less than the one before.
You’ll probably have noticed the phenomenon before: when writing multicore programs, the effectiveness of extra CPUs becomes marginal as you add more and more of them. Instead of hoping for a speedup, it might be better to think about costs. If you’re willing to wait, it is much more cost-effective to pay for a smaller machine.
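To make that cost argument concrete, here is a rough back-of-the-envelope calculation, under the assumption that the hourly price scales linearly with the core count; the serial fraction and the prices are hypothetical.

```python
def speedup(p: float, k: int) -> float:
    return 1.0 / (p + (1.0 - p) / k)


p = 0.05                      # serial fraction, illustrative
for k in (1, 8, 64):
    cost = k                  # assume price scales linearly with cores
    gain = speedup(p, k)
    # cost per unit of work, relative to a single core
    print(f"{k:>3} cores: {gain:4.1f}x faster, {cost / gain:4.1f}x the cost per job")
```

With these assumptions the 64-core machine finishes a job about 15x faster but at roughly 4x the cost, which is exactly the trade-off between waiting longer and paying more.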