CPU utilization is wrong

Everyone uses %CPU to measure performance, but everyone is wrong, says Netflix's Brendan Gregg in his UpSCALE Lightning Talk.

Opensource.com

CPU utilization is the metric everyone uses to measure a processor's performance. But %CPU is a misleading measure of how busy your processor really is, says Brendan Gregg, senior performance architect at Netflix, in what he calls a "five-minute public service announcement" at the 16th annual Southern California Linux Expo (SCALE).

In his Lightning Talk, "CPU Utilization is Wrong," Brendan explains what CPU utilization means—and doesn't mean—about performance and shares the open source tools he uses to identify reasons for bottlenecks and tune Netflix's systems. He also includes a mysterious case study that's relevant to everyone in 2018.

Watch Brendan's talk to learn how you can use Netflix's methods to determine what your CPUs are really doing to impact performance.

During the UpSCALE Lightning Talks hosted by Opensource.com at the 16th annual Southern California Linux Expo (SCALE) in March 2018, eight presenters shared quick takes on interesting open source topics, projects, and ideas. Watch all of the UpSCALE Lightning Talks on the Opensource.com YouTube channel.


3 Comments

The way Spectre/Meltdown is handled in kernel fixes is stupid. The fixes are meant to address a timing attack, but the issue is somewhere else: the precision of performance timers is too high and allows monitoring what happens in separate processes/VMs.
There's only a single solution: make mission-critical processes not reply as fast as possible (as permitted by the fast CPU), but add a random response time that will make measurement impossible. This requires excellent randomizers, but very weak PRNG algorithms are used because our processors lack a device capable of providing a fast enough bitrate of random entropy. Such a device is available only in military-grade equipment. But given the scale of modern processors, and the speed at which they need to operate, there are a LOT of very fast entropy sources available to create secure RNGs and deprecate the use of PRNGs. This is true because these processors are integrated at scales very near the level of fully random behavior (gates now use just a few atoms, and quantum physics generates a lot of true random noise that the hardware constantly tries to filter out); processors do not do anything with the noise that is filtered out of the digital gates. It's time to think about the introduction of "AI" processors for neural networks: they don't need to make any binary decision to converge to a solution, just decisions based on differences of probabilities. Here comes the solution: quantum processing with neural networks that can be integrated at a much higher scale than stupid binary gates. The noise captured by binary filters can be captured everywhere in a chip and amplified, and then will become an excellent and very fast source of entropy, allowing us to secure mission-critical processes by erasing the predictable differences in processing time. No more need to flush the TLB! Still faster processing! We have to build new algorithms that take probabilistic rather than binary decisions, with variable processing time (independently of the final binary decision that will be taken).
Such thinking is possible, notably within GPUs (which are now integrated with CPUs in single-chip "APUs"). It's time to reconsider the old Von Neumann processor model: it has lived and should die. We need new languages using quantum processing and neural networks: I'm sure they can even converge to a solution (any one, true or false) on their output in a short time that is randomized the same way independently of the solution. But for now, as long as we don't have quantum processing in CPUs, we need to really randomize the response time at all levels: in the software, on buses, in memory decoding and addressing. Let's forget about counting "cycles"; what we really need is just an averaged speed with still a lot of variability. This will make all time-based attacks ineffective: goodbye Spectre/Meltdown and similar. Doing that in the hardware will greatly simplify the software design, but will not prohibit making even faster hardware. We can also go above the current limits of integration of binary gates: forget binary processing at the lowest level; we can do many more things and even produce excellent results with some noise and randomness in responses. All we need is a different way to converge to a solution: no dependency on the stabilization of a single binary gate; all gates compete equally to produce, possibly in several very fast successive steps with feedback, a very suitable solution, with even less energy spent in stabilizers and less heat to evacuate. We'll save a lot of money too. There will be far fewer rejections when manufacturing dies: all dies, even those with spots of defects, will be usable, and future computers will operate under less strict environmental temperature conditions and will no longer need active cooling. On the "macro" view, we can still operate a binary decision system on top of it, but its performance will no longer be measured in terms of fixed "cycles": forget the clocks as well!
Time to think about reintroducing analog processing and finally abandoning binary gates! And this time we'll use the best of what nature has produced: randomness, high variability, extreme resistance and recovery after failures/defects, very small levels of energy, and maximum parallelization everywhere. (This is the magic of true "life": it is efficient and resistant and costs the minimum as long as we don't attempt to "standardize" its forms.) We need an Earth with diversity, and we need to improve that diversity; the same applies to computer processing. We should no longer depend on exactly reproducible conditions, only on averaged conditions that are statistically and probabilistically significant. That domain of study is only just bootstrapping. AI and neural networks will become more and more usable and will certainly outperform our existing binary decision systems, which are extremely likely to fail or are easy to target with security attacks (which are now occurring at massive scale and are very devastating). Completely rethink all concepts of computers!

Interesting blog post; too bad it's off topic.

In reply to Philippe Verdy (not verified)

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.