The underappreciated scaling of memory access costs hidden by the cult of Big-O

In any CS-101 textbook, you learn about the scaling of certain algorithms.  I’ve had a few conversations with several colleagues recently related to this topic, and how badly the “cult of complexity” can lead one astray when trying to scale real systems rather than ones in textbooks.  I wanted to write a bit about the topic, and share a simple program that demonstrates the realities.

This post sadly has nothing in particular to do with the giant fighting mecha anime, “Big O” from 1999… Actually, now that I think about it, scaling is bound to be a concern when making giant robots. But the square-cube law is a bit of a different topic than memory time complexity, so we’ll have to leave it for another day.

In the textbook, one algorithm might scale linearly — O(n).  Another might do better at O(log(n)), or have a constant number of operations regardless of working set size and be O(1), etc.  It’s a pretty straightforward way of keeping track of the terribleness of an algorithm.  But the key is that Big O notation only estimates the number of operations as n grows large, rather than than the amount of time taken to run it.  There is a pretty natural and intuitive implicit assumption that makes Big O useful.  The assumption is that one read operation from memory is about as slow as any other.  So you just need to count up the number of read operations to get an idea of how much time you’ll spend waiting on memory.

This is, of course, pants-on-fire nonsense.

Continue reading