Sunday, February 25, 2007
Perf viz for computers (PVC) is both an abiding interest and a pet peeve of mine. I first wrote about this topic in a 1992 paper entitled "On the Application of Barycentric Coordinates to the Prompt and Visually Efficient Display of Multiprocessor Performance Data." Along with that paper, I also produced what IMHO is an interesting example of an efficient performance tool for visualizing multiprocessor CPU utilization. It is called barry (for barycentric, or triangular, coordinates; get it?) and is written in C using the ncurses library.
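To make the barycentric idea concrete, here is a minimal sketch in Python. It is not the code from barry (which is written in C with ncurses). It assumes, purely for illustration, that CPU utilization is split into three components (user, system, idle) that sum to a constant, so each sample can be drawn as a single point inside a triangle. The function name, the vertex layout, and the three-way split are all my assumptions here.

```python
import math

# Hypothetical layout: one triangle vertex per utilization component.
# An equilateral triangle with unit sides is assumed for illustration.
VERTICES = {
    "user": (0.0, 0.0),
    "system": (1.0, 0.0),
    "idle": (0.5, math.sqrt(3) / 2),
}


def barycentric_to_xy(user, system, idle):
    """Map a (user, system, idle) utilization triple to a point in the triangle.

    The three values are normalized to weights that sum to 1, and the point
    is the weighted average of the triangle's vertices.
    """
    total = user + system + idle
    if total <= 0:
        raise ValueError("utilization components must sum to a positive value")
    weights = {
        "user": user / total,
        "system": system / total,
        "idle": idle / total,
    }
    x = sum(w * VERTICES[name][0] for name, w in weights.items())
    y = sum(w * VERTICES[name][1] for name, w in weights.items())
    return x, y
```

A fully idle CPU lands on the idle vertex, a CPU pegged in user mode lands on the user vertex, and an even three-way split lands at the centroid. With one point per processor, a whole multiprocessor can be read at a glance from where the points cluster.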
RESEARCH: My role models for PVC are scientific visualization and statistical data visualization. But why should physicists and statisticians have all the fun!? I maintain that similar CHI techniques could be applied to help people do a better job of performance analysis and capacity planning. I consider a central goal to be finding visual paradigms that offer the best impedance match between the digital-electronic computer under test and the cognitive computer doing the analysis (aka your brain). This is a very deep problem because we have a relatively poor understanding of how our visual system works (both the optical and the neural circuits), although this is improving all the time. So it's very difficult to quantify "best match" in general, and even more difficult to quantify it for individuals. One thing we do know is that vision is based on neural computation, which also explains why the moon looks bigger on the horizon than it does when you take a photograph of it in that same location. (How then was the following shot taken?)
Michal Migurski has a nice blog entry on this topic, which supports some remarks I made in a 2005 CMG Measure-IT article. I discovered that the company he works for also did this novel visualization for blog backtracking. Is it too much to ask for this kind of creativity to be applied to performance data?
An important goal for this subject is to try to define a set of principles for PVC. In this sense, I do not find anything especially useful in the work of Edward Tufte, whose approach seems to be based more on art and aesthetics than science. When I look at his visual renderings, I can conclude that they are a good choice, but how do I make such a choice for myself? Tufte never tells us. For my money, I much prefer the scientific work of John Tukey (of FFT fame). His box-and-whisker representation is based on the principle that at least 5 indicators (the two extremes, the two quartiles, and the median) are needed to distinguish between data distributions.
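Here is a rough sketch of the five indicators behind a box-and-whisker plot, again in Python. It assumes one common convention for the quartiles, the hinges (the median of each half of the sorted data, with the middle value counted in both halves when the sample size is odd). Other quartile definitions give slightly different numbers. The function name and the sample data are made up for illustration.

```python
from statistics import median


def five_number_summary(data):
    """Return (minimum, lower hinge, median, upper hinge, maximum)."""
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one data point")
    # Size of each half; for odd n the middle value belongs to both halves.
    half = (n + 1) // 2
    lower_half = xs[:half]
    upper_half = xs[n - half:]
    return (xs[0], median(lower_half), median(xs), median(upper_half), xs[-1])


# Two made-up samples with the same minimum, median, and maximum.
tight = [10, 48, 49, 50, 51, 52, 90]
spread = [10, 20, 30, 50, 70, 80, 90]
```

For these two samples the minimum, median, and maximum agree, so three indicators cannot tell them apart. Only the two hinges differ (48.5 and 51.5 for tight, versus 25 and 75 for spread), and that is the difference the box makes visible.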
PEEVE: As far as I can tell, nothing is being done by performance tool vendors to enable better PVC because they don't want to spend the money on development (even if they did understand the technical issues), and the end users (that would probably be you) don't know that the visual interfaces for the current crop of tools could be vastly improved. In other words: Nothing demanded, nothing new. As a small effort to kick-start PVC at CMG 2007 in San Diego, Mario Jauvin and I are thinking of submitting a paper on this topic. We'll see!