Why %CPU is a misleading metric

Your friendly neighborhood ops troll is back!

Here’s a good article on why %CPU (CPU utilization) is a misleading metric when you look at it in tools such as top. The normal inclination is to think that a 90% utilization report means the CPU is actually being used 90% of the time, right? You already know from the title I’m gonna tell you that’s wrong.

The tl;dr is that main memory access is a bottleneck, so applications spend a lot of cycles stalled waiting on memory, and those stall cycles are falsely reported as the CPU being “utilized”.

IPC (instructions per cycle) is a more accurate measurement of utilization and this article points to a few good examples of how to check that and tune apps that might be CPU-bound or memory-bound.
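To illustrate the idea behind the metric (a sketch with made-up counter values of the kind a tool like Linux `perf stat` reports), IPC is just retired instructions divided by elapsed CPU cycles:

```python
# Hypothetical hardware-counter totals over a sampling interval
# (the numbers here are invented for illustration).
instructions = 2_800_000_000   # instructions retired
cycles       = 4_000_000_000   # CPU cycles elapsed

ipc = instructions / cycles
print(f"IPC: {ipc:.2f}")  # 0.70

# Rough rule of thumb from the article: IPC well under 1 on a
# superscalar core suggests memory stalls; a high IPC suggests the
# workload is genuinely instruction/CPU-bound.
if ipc < 1.0:
    print("likely memory-stalled")
else:
    print("likely CPU/instruction-bound")
```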



Don’t you mean instructions per second? Because instructions per cycle are probably maxed most of the time, since that number would be rather low, depending on the parallelization capabilities of the CPU. A simple RISC processor would have something like 5 instructions per cycle, while it may have millions of instructions per second. I’d say the instructions per cycle don’t really matter as long as you know the max number of instructions per second and the actual instructions per second. Then again, the max instructions per second is something that would never be achieved, except when you have a perfect set of threads running, each running machine code in such a way that the optimization has maximum potential. This is what I understood about processors, but I might be wrong entirely.
Perhaps the most useful measurement would be: instructions per cycle per second.


No, I actually do mean Instructions Per Cycle. IPC is measured as the number of instructions that can be retired per tick (on a 4-wide system, the max is 4). In his real-world sampling, he had an average of 0.7 IPC on a system capable of 4. Why would that be? Instructions aren’t getting retired because they’re waiting on something.
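To make that concrete (a sketch using the numbers above): 0.7 retired instructions per cycle on a core that can retire 4 per cycle means the pipeline is doing useful work for only a small fraction of the time that top reports as busy:

```python
ipc = 0.7       # measured average IPC, from the article's sampling
max_ipc = 4.0   # retirement width of the core (4-wide)

# Fraction of peak retirement capacity actually used.
effective_pct = ipc / max_ipc * 100
print(f"effective utilization: {effective_pct:.1f}%")  # 17.5%
```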


If I correctly understood the article, it actually measures IPC over an interval (in one example, 10 seconds) and then just gives you the average.
Or I’m wrong; then ignore me…


The big problem is that the IPC unit you talk about is too advanced for the regular computer user to understand ¯\\\_(ツ)\_/¯

Well, they could just divide it by the max IPC and then do * 100%.

Almost by definition, if you’re doing performance tuning on a production system, you’re not a regular user. You’re an advanced user and almost certainly a professional.

I do, but when you have, say, 2 out of a max of 5, that’s 0.4, times 100% (times 1, I know) = 40% in that sense.
That was a pretty bad necrobump btw :smirk: