Sunday, February 25, 2007

Performance Visualization

Perf viz for computers (PVC) is both an abiding interest and a pet peeve of mine. I first wrote about this topic in a 1992 paper entitled "On the Application of Barycentric Coordinates to the Prompt and Visually Efficient Display of Multiprocessor Performance Data." Along with that paper, I also produced what IMHO is an interesting example of an efficient performance tool for visualizing multiprocessor CPU utilization. It is called barry (for barycentric, or triangular, coordinates; get it?) and is written in C using the ncurses library.
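As a rough illustration of the barycentric idea (this is not the barry source, which is C with ncurses), here is a minimal Python sketch. It assumes the three quantities being plotted are user, system, and idle CPU time, which always sum to 100%, so each sample can be drawn as a single point inside a triangle. The vertex layout and the function name barycentric_to_xy are hypothetical choices made for this example.

```python
import math

# Hypothetical layout: an equilateral triangle with unit side.
# Each vertex stands for "100% of this component" (an assumption
# for this sketch, not taken from the barry tool itself).
VERTICES = {
    "user": (0.0, 0.0),
    "system": (1.0, 0.0),
    "idle": (0.5, math.sqrt(3) / 2),
}


def barycentric_to_xy(user, system, idle):
    """Map three non-negative utilization components to a 2-D point.

    The components are normalized to sum to 1 and used as weights
    on the triangle's vertices (barycentric coordinates).
    """
    if min(user, system, idle) < 0:
        raise ValueError("components must be non-negative")
    total = user + system + idle
    if total == 0:
        raise ValueError("at least one component must be positive")
    weights = {
        "user": user / total,
        "system": system / total,
        "idle": idle / total,
    }
    x = sum(weights[name] * VERTICES[name][0] for name in VERTICES)
    y = sum(weights[name] * VERTICES[name][1] for name in VERTICES)
    return x, y


if __name__ == "__main__":
    # A mostly idle CPU lands near the "idle" vertex at the top.
    print(barycentric_to_xy(user=10, system=5, idle=85))
```

With one point per processor, a whole multiprocessor can be shown in one triangle: a cluster near a vertex means the processors are dominated by that component.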

RESEARCH: My role models for PVC are scientific visualization and statistical data visualization. But why should physicists and statisticians have all the fun!? I maintain that similar CHI techniques could be applied to help people do a better job of performance analysis and capacity planning. I consider a central goal to be finding visual paradigms that offer the best impedance match between the digital-electronic computer under test and the cognitive computer doing the analysis (aka your brain). This is a very deep problem because we have a relatively poor understanding of how our visual system works (both the optical and the neural circuits), although this is improving all the time. So it's very difficult to quantify "best match" in general, and even more difficult to quantify it for individuals. One thing we do know is that vision is based on neural computation, which also explains why the moon looks bigger on the horizon than it does when you take a photograph of it in that same location. (How then was the following shot taken?)

Michal Migurski has a nice blog entry on this topic which supports some remarks I made in a 2005 CMG Measure-IT article. I discovered that the company he works for also did this novel visualization for blog backtracking. Is it too much to ask for this kind of creativity to be applied to performance data?

An important goal for this subject is to try to define a set of principles for PVC. In this sense, I do not find anything especially useful in the work of Edward Tufte, whose approach seems to be based more on art and aesthetics than science. When I look at his visual renderings, I can conclude that they are a good choice, but how do I do it for myself? Tufte never tells us. For my money, I much prefer the scientific work of John Tukey (of FFT fame). His Box and Whisker representation is based on the principle of needing at least 5 indicators to distinguish between data distributions.
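To make the five indicators concrete: a box-and-whisker plot is usually drawn from the minimum, lower hinge, median, upper hinge, and maximum of the sample. The Python sketch below computes those five numbers using one common definition of the hinges (the median of each half of the sorted data, with the middle value shared by both halves when the sample size is odd). Other quartile conventions exist and give slightly different values, and the function name five_number_summary is just a label chosen for this example.

```python
from statistics import median


def five_number_summary(samples):
    """Return (minimum, lower hinge, median, upper hinge, maximum).

    Hinges are computed as the medians of the lower and upper halves
    of the sorted data; for an odd sample size the middle value is
    included in both halves (an assumed convention for this sketch).
    """
    data = sorted(samples)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one sample")
    lower_half = data[: (n + 1) // 2]
    upper_half = data[n // 2:]
    return (
        data[0],
        median(lower_half),
        median(data),
        median(upper_half),
        data[-1],
    )


if __name__ == "__main__":
    # Hypothetical CPU-busy percentages, made up for illustration.
    busy = [12, 15, 22, 35, 40, 41, 58, 77, 93]
    print(five_number_summary(busy))
```

Two sets of measurements with the same mean can produce very different five-number summaries, which is why the box plot can tell apart distributions that a single average cannot.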

PEEVE: As far as I can tell, nothing is being done by performance tool vendors to enable better PVC because they don't want to spend the money on development (even if they did understand the technical issues), and the end users (that would probably be you) don't know that the visual interfaces for the current crop of tools could be vastly improved. In other words: Nothing demanded, nothing new. As a small effort to kick-start PVC at CMG 2007 in San Diego, Mario Jauvin and I are thinking of submitting a paper on this topic. We'll see!


Giordano Beretta said...

Yes, visualization of multidimensional data is difficult and requires a lot of experience with the data to visualize AND in descriptive statistics. I like the suggestion in the linked CMG Measure-IT article to learn from the evolution of the human's visual system.

steve jenkin said...

John Mashey used to quote the input bandwidth of the human vision system as around 10 or 100 Mbps... Our visual cortices are impressive - and much underused/undervalued in general I.T.

One of my biggest surprises from running a high-profile website was giving the managers access to system performance graphs (rrdtool) - and having them use it, a lot.

Even though they didn't know exactly what the figures 'meant' and couldn't manage the site, they learnt pretty quickly what was 'good' and were able to see traffic patterns and have a 'feel' for their site.

They felt more in control and better able to manage their mission. More so, because the system as originally delivered gave them nothing - and one day died in spectacular fashion.

They'd had no inkling it could be coming; it wasn't even a blip in their Risk Management meetings.

Visualisations may be helpful for analysis, tuning and even fault diagnosis - but their usefulness is much wider, even to untrained non-I.T. users.

Expect the good ones to get used more widely than you might imagine.