The Pith of Performance: February 2007

Tuesday, February 27, 2007

Moore's Law II: More or Less?

For the past few years, Intel, AMD, IBM, Sun, et al. have been promoting the concept of multicores, i.e., no more single CPUs. A month ago, however, Intel and IBM made a joint announcement that they will produce single-CPU parts using 45 nanometer (nm) technology. Intel says it is converting all its fab lines and will produce 45 nm parts (code-named "Penryn") by the end of this year. What's going on here?

We fell off the Moore's law curve, not because photolithography collided with limitations due to quantum physics or anything else exotic, but more mundanely because it ran into a largely unanticipated thermodynamic barrier. In other words, Moore's law was stopped dead in its tracks by old-fashioned 19th-century physics.

CMG 2007 Hot Topics Call

I am the Session Area Chairperson for Hot Topics at CMG 2007. Proposed topics include, but are not limited to: SOA, Web Services, Virtualization, RFID, Server Consolidation, Gaming Performance, Blade Servers, Grid Computing, Clustering, Performance Visualization, and Emerging Technologies.

If you have a hot topic you'd like to present, or know of someone else who might, please let me know either by posting here or by contacting me via email. Thank you.

Monday, February 26, 2007

Helping Amazon's Mechanical Turk Search for Jim Gray's Yacht

Regrettably, Jim Gray is still missing, but I thought Amazon.com deserved more kudos than they got in the press for their extraordinary effort to help in the search for Gray's yacht. Google got a lot of press coverage for talking up the idea of using satellite image sources, but Amazon actually did it. Why is that? One reason is that Amazon has a lot of experience operating and maintaining very large distributed databases (VLDBs). Another reason is that it's not just Google that has been developing interesting Internet tools. Amazon (somewhat quietly, by comparison) has also developed its own Internet tools, like the Mechanical Turk. These two strengths combined at Amazon and enabled them to load a huge number of satellite images of the Pacific into the Turk database, thereby allowing anyone (many eyes) to scan them via the Turk interface, and all that in very short order. Jim would be impressed.

I spent several hours on Sunday, Feb 4th, using Amazon's Mechanical Turk to help look for Gray's yacht. The images above (here about one quarter the size displayed by the Turk) show one example where I thought there might have been an interesting object, possibly a yacht. Image A was captured by the satellite a short time before image B (t₁ < t₂). You can think of the satellite as sweeping down this page. Things like whitecaps on the ocean surface tend to dissipate and thus change pixels between successive frames, whereas a solid object like a ship will tend to remain invariant. The red circle (which I added) marks such a stable set of pixels, which also has approximately the correct dimensions for a yacht, i.e., about 10 pixels long (as explained by Amazon). Unfortunately, what appears to be an interesting object here has not led to the discovery of Gray's whereabouts.
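The frame-to-frame comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Turk or the expert reviewers processed the images. It assumes two grayscale frames that are already aligned pixel for pixel (a strong assumption for a sweeping satellite), and the function name stable_bright_pixels, the brightness threshold of 200, and the tiny synthetic frames are all hypothetical.

```python
def stable_bright_pixels(frame_a, frame_b, threshold=200):
    """Return (row, col) positions that are bright in BOTH frames.

    Frames are lists of rows of 0-255 grayscale values and are assumed
    to be aligned. The threshold is an arbitrary illustrative value.
    """
    stable = []
    for r, (row_a, row_b) in enumerate(zip(frame_a, frame_b)):
        for c, (pa, pb) in enumerate(zip(row_a, row_b)):
            # A whitecap is bright in only one frame; a solid object
            # stays bright at the same position in both.
            if pa >= threshold and pb >= threshold:
                stable.append((r, c))
    return stable


# Synthetic 4x6 frames (made-up data, not real satellite imagery).
frame_a = [
    [10, 10, 250, 10, 10, 10],    # whitecap, gone by frame B
    [10, 10, 10, 10, 10, 10],
    [10, 240, 245, 250, 10, 10],  # persistent object, 3 pixels long
    [10, 10, 10, 10, 10, 10],
]
frame_b = [
    [10, 10, 10, 10, 10, 10],
    [10, 10, 10, 10, 255, 10],    # new whitecap, absent in frame A
    [10, 238, 247, 249, 10, 10],  # same object, still there
    [10, 10, 10, 10, 10, 10],
]

candidates = stable_bright_pixels(frame_a, frame_b)
print(candidates)  # [(2, 1), (2, 2), (2, 3)]
```

A real version would also have to check that a candidate run has roughly the expected length (about 10 pixels, per Amazon's guidance) before flagging it.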

Use of the Turk satellite images was hampered by the lack of any way to reference the images (about 5 per web page) by number, and there was no coordinate system within each image to express the location of any interesting objects. These limitations could have led to ambiguities in the follow-up human expert search of flagged images. However, given that time was of the essence for any possible rescue effort, omitting these niceties was a completely understandable decision.

Sunday, February 25, 2007

PDQ e-Biz Code

The complete Perl code for the 3-tier e-commerce example described in Chapter 10 of my Perl::PDQ book is as follows.

ITIL and Beyond in 2008

ITIL for Guerrillas is the title of Chapter 2 in my new book Guerrilla Capacity Planning (GCaP). That chapter attempts to give some perspective on where GCaP methodologies fit into the ITIL framework.

I am thinking about presenting a new Guerrilla class on this topic for 2008, which would go well beyond the material in Chapter 2 to compare what ITIL offers with what is actually needed to provide proper IT service and capacity planning. I'm working with Steve Jenkin, an IT consultant in Australia, who is currently being ITIL certified. Check out his blog ITIL Utopia - Or Not? as he develops his own unique perspective on ITIL.

Please post your comments and suggestions here so we can get some idea of the level of interest in this topic (possible title: Going Guerrilla on ITIL). Assuming there is interest, I will provide more information about the course content and requirements, as things progress.

PDQ Version 4.0

Version 4.0 of PDQ (Pretty Damn Quick) is in the pipeline; it's been there for quite some time, actually (blush). The current holdup is related to getting both the Perl and the new Java version through the QA suite designed by Peter Harding. As soon as that is sorted out, we'll release it, probably in two pieces, keeping the Java PDQ separate initially. Also included will be updates to PyDQ (Python) and a new PHP version.

If you would like to be notified by email, please fill out this form.

Guerrilla Certification?

During some recent discussions with a large TelCo client, the issue of certification for the various Guerrilla training classes came up. If I understood correctly, the idea would be to augment each class with some kind of ranking (e.g., Levels I, II, III) to denote the proficiency level achieved by a student taking a particular class. This would be useful to managers who would like to better categorize the level of competency of each employee they send to Guerrilla training. The Guerrilla classes are not currently organized along those lines, but they could be.

There are some possible complicating factors that could creep in. Questions that immediately spring to mind are:

Would the Guerrilla levels just be an internal ranking provided by Performance Dynamics?

Is there any need to have such levels certified by an outside institution, e.g., a university?

Is there a need to have such levels associated with continuing education (CEU) credits?

I would be interested to hear from both managers and Guerrilla alumni on this idea.

Virtualization Spectrum

My CMG 2006 paper on virtualization was recently blogged at HP Labs in the context of hyperthreading being considered harmful to processor performance. The paper actually provides a general unified framework in which to understand hyperthreading, hypervisors (e.g., VMware and Xen), and hyperservices (e.g., P2P virtual networks like BitTorrent); the latter being an outgrowth of something I wrote in response to an online analysis of Gnutella.

The VM-spectrum concept is based on my observations that: (i) disparate types of virtual machines lie on a discrete spectrum bounded by hyperthreading at one extreme and hyperservices at the other, and (ii) poll-based scheduling is the common architectural element in most VM implementations. The associated polling frequency (from GHz to μHz) positions each VM into a region of the VM-spectrum. Several case studies are analyzed to illustrate how this framework could make VMs more visible to performance management tools.

Performance Visualization

Perf viz for computers (PVC) is both an abiding interest and a pet peeve of mine. I first wrote about this topic in a 1992 paper entitled: "On the Application of Barycentric Coordinates to the Prompt and Visually Efficient Display of Multiprocessor Performance Data." Along with that paper, I also produced what IMHO is an interesting example of an efficient performance tool for the visualization of multiprocessor CPU utilization called barry (for barycentric or triangular coordinates; get it?), written in C using the ncurses library.
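For readers unfamiliar with barycentric coordinates, here is a minimal Python sketch of the underlying idea: three quantities that always sum to a constant can be drawn as a single point inside a triangle. It is not taken from the barry source. The choice of user, system, and idle time as the three components, the function name barycentric_to_xy, and the placement of the triangle's vertices are all assumptions made for illustration.

```python
import math


def barycentric_to_xy(user, system, idle):
    """Map a (user, system, idle) CPU breakdown to a point in a triangle.

    Assumed layout (illustrative only): an equilateral triangle with
    unit sides, idle at (0, 0), user at (1, 0) and system at
    (0.5, sqrt(3)/2). A fully idle CPU sits on the idle vertex, and so on.
    """
    total = user + system + idle
    if total <= 0:
        raise ValueError("utilization components must sum to a positive value")
    # Normalize so the three weights sum to 1.
    u, s = user / total, system / total
    # Weighted average of the vertex positions; the idle vertex is the
    # origin, so it contributes nothing to either coordinate.
    x = u * 1.0 + s * 0.5
    y = s * (math.sqrt(3) / 2)
    return x, y


# One point per CPU; the sample percentages are made up.
samples = {"cpu0": (70, 20, 10), "cpu1": (5, 5, 90), "cpu2": (30, 60, 10)}
for name, (us, sy, idl) in samples.items():
    x, y = barycentric_to_xy(us, sy, idl)
    print(f"{name}: x={x:.3f} y={y:.3f}")
```

Plotting every processor this way puts the whole machine in one picture: a cluster near one vertex shows at a glance which kind of time dominates.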

RESEARCH: My role models for PVC are scientific visualization and statistical data visualization. But why should physicists and statisticians have all the fun!? I maintain that similar CHI techniques could be applied to help people do a better job of performance analysis and capacity planning. I consider a central goal to be finding visual paradigms that offer the best impedance match between the digital-electronic computer under test and the cognitive computer doing the analysis (aka your brain). This is a very deep problem because we have a relatively poor understanding of how our visual system works (both the optical and the neural circuits), although this is improving all the time. So it's very difficult to quantify "best match" in general, and even more difficult to quantify it for individuals. One thing we do know is that vision is based on neural computation, which also explains why the moon looks bigger on the horizon than it does when you take a photograph of it in that same location. (How then was the following shot taken?)

Scalability Parameters

In a recent Guerrilla training class given at Capital Group in Los Angeles, Denny Chen (a GCaP alumnus) suggested a way to prove my Conjecture 4.1 (p.65) that the two parameters α and β are both necessary and sufficient for the scalability model:

C(N) = N / [1 + α(N − 1) + βN(N − 1)]

developed in Section 4.4 of the Guerrilla Capacity Planning book.

Basically, Denny observes that two parameters (a and b) are needed to define an extremum in a quadratic function y = ax² + bx + c (e.g., a parabola passing through the origin with c = 0), so a similar constraint should hold (somehow) for a rational function with a quadratic denominator. This is both plausible and cool. I don't know why I didn't think of it myself.
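To see how the two parameters together fix the extremum, here is a small Python sketch. It assumes the model has the form C(N) = N / [1 + α(N − 1) + βN(N − 1)]. Setting dC/dN = 0 for that form gives 1 − α − βN² = 0, so the maximum sits at N* = √((1 − α)/β). The function names and the sample values α = 0.05 and β = 0.001 are made up for illustration. This shows where the peak comes from; it is not a proof of the conjecture.

```python
import math


def capacity(n, alpha, beta):
    """Relative capacity C(N) for the assumed two-parameter model."""
    return n / (1 + alpha * (n - 1) + beta * n * (n - 1))


def peak_load(alpha, beta):
    """Load N* at which C(N) is maximal, from dC/dN = 0.

    With beta == 0 the quadratic term vanishes and C(N) has no interior
    maximum, so infinity is returned.
    """
    if beta == 0:
        return math.inf
    return math.sqrt((1 - alpha) / beta)


alpha, beta = 0.05, 0.001  # hypothetical parameter values
n_star = peak_load(alpha, beta)
print(f"peak at N* = {n_star:.2f}")
for n in (1, 10, 31, 60):
    print(f"C({n}) = {capacity(n, alpha, beta):.2f}")
```

Dropping β removes the peak altogether, while α shifts where it falls, so the location of the maximum depends on both values.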