Tuesday, December 1, 2009

Guerrilla Capacity Planning: The Movie

For those of you who weren't able to attend the recent Guerrilla Capacity Planning training live in California, here's a small sampler of what you missed (not shown are the high quality lunches we provide):

Guerrilla Capacity Planning class Nov 2009
Instructor: Dr. Neil Gunther

Thomas Crosman did an outstanding job of getting the entire 5-day class (that's more than 30 hours!) recorded as digital bits, all on very short notice, I might add. This ain't no YouTube vid. The plan is to make this GCaP class available online. Stay tuned to this blog for announcements about when it will appear at a theater near you.

But there's nothing like live! So, the 2010 training schedule has now been posted. The dates are tentative until we finalize the contracts with the hotel, but you may as well start harassing your management to cut that P/O now. :-)

Oh! And if they need a little extra convincing, they can check out the testimonials.

Season's Greetings!

Tuesday, November 24, 2009

GCaP Class: Sawzall Optimum

In a side discussion during last week's class, Greg S., now a Guerrilla alumnus (who used to work at Google a few years ago), informed me that Sawzall preprocessing-setup times typically lie in the range from around 500 ms to about 10 seconds, depending on such factors as cluster location, GFS chunkserver hit rate, borglet affinity hits, etc. This is the information that was missing from the original Google paper, and its absence prevented me from finding the optimal machine configuration in my previous post.

To see how these new numbers can be applied to estimating the corresponding optimal configuration of Sawzall machines, let's take the worst-case estimate of 10 seconds for the preprocessing time. First, we convert 10 s to 10/60 = 0.1666667 min (original units) and plot that constant as the horizontal line (gray) in the lower part of the figure at left (click to enlarge). Next, we extend the PDQ elapsed-time model (blue curve) until it intersects the horizontal line. That point is the optimum, as I explained in class, and it occurs at p = 18,600 machines (vertical line).

That's more than thirty times the number of machines reported in the original Google paper—those data points appear on the left side of the plot. Because of the huge scale involved, it is difficult to see the actual intersection, so the figure on the right shows a zoomed-in view of the encircled area. Increasing the number of parallel machines beyond the vertical line means that the elapsed time curve (blue) goes into the region below the horizontal line. The horizontal line represents the fixed preprocessing time, so it becomes the system bottleneck as the degree of parallelism is increased. Since the elapsed times in that region would always be less than the bottleneck service time, they can never be realized. Therefore, adding more parallel machines will not improve response time performance.

Conversely, a shorter preprocessing time of 500 ms (i.e., a shorter bottleneck service time) should permit a higher degree of parallelism.
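
To make the scaling with preprocessing time concrete, here is a minimal Python sketch. It assumes, purely for illustration, an idealized elapsed-time curve T(p) = T1/p; the PDQ model behind the plots may well differ. The single-machine time T1 = 3100 min is not a measurement: it is a hypothetical value back-calculated so that the 10 s case lands on p = 18,600.

```python
def optimal_machines(t1_minutes, preprocess_seconds):
    """Degree of parallelism p at which T1/p meets the fixed preprocessing time.

    Assumes an idealized elapsed-time curve T(p) = T1 / p. This is an
    illustrative assumption, not the PDQ model used for the plots.
    """
    preprocess_minutes = preprocess_seconds / 60.0  # same units as the plot
    return t1_minutes / preprocess_minutes


# Hypothetical single-machine elapsed time, chosen so that the
# 10 s case reproduces the optimum quoted in the text.
T1_MINUTES = 3100.0

worst_case = optimal_machines(T1_MINUTES, 10.0)  # 10 s preprocessing
best_case = optimal_machines(T1_MINUTES, 0.5)    # 500 ms preprocessing

print(f"10 s preprocessing:   p ~ {worst_case:,.0f} machines")
print(f"500 ms preprocessing: p ~ {best_case:,.0f} machines")
```

Under this idealization the optimum is inversely proportional to the preprocessing time, so cutting the bottleneck from 10 s to 500 ms would allow about twenty times as many machines.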

Friday, November 20, 2009

GCaP Class: Odds and Sods

Some interesting side discussions came up in class:

  • Cat Brain†: IBM Almaden announced a supercomputer brain-simulation (called C2) in which the number of simulated neurons and synapses exceeds those in a cat brain (from Josh B)
  • MARS: A MapReduce Framework on Graphics Processors (from Josh B)
  • MapReduce ported to R (from Josh B)
  • Tsung: FOSS distributed load-testing harness written in Erlang (from Greg S)
  • GPUs are the new CPUs (from NJG)
  • SmokePing: Another Tobi tool (from Josh B)

Today, we wrap up and Guerrilla Level-2 certificates will be awarded to all attendees (as proof of purchase).

† The claim has since elicited a clawing missive from researchers working on EPFL's BlueBrain project. Quote from the project leader: "[it's] not even close to an ant's brain." (from GCaP 2008 alumnus, Stefan P.)

Wednesday, November 18, 2009

GCaP Class: 3-Tier Queueing Model

Guerrilla Capacity Planning class attendee, Greg S., raised an interesting question during the section where I present a PDQ model of a 3-tier client/server system. The assumptions used to develop the model are summarized on this slide:

In the baseline configuration there are 125 desktop clients generating 3 types of database transactions corresponding to 3 different workload classes, or streams in PDQ parlance. This could be represented as either:

  1. 125 × 3 workload streams or
  2. 1 × 3 streams with 124 × 3 aggregated streams.

The respective arrival rates for the second case look like this:

Friday, November 13, 2009

Scalability of Sawzall, MapReduce and Hadoop

This is a follow-up to a reader comment by Paul P. on my previous post about MapReduce and Hadoop. Specifically, Paul pointed me at the 2005 Google paper entitled "Parallel Analysis with Sawzall," which states:

"The set of aggregations is limited but the query phase can involve more general computations, which we express in a new interpreted, procedural programming language called Sawzall"

Not related to the portable reciprocating power saw manufactured by the Milwaukee Electric Tool Corporation.

More important, for our purposes, is Section 12 Performance. It includes the following plot, which tells us something about Sawzall scalability; but not everything.

Figure 1.

Tuesday, November 10, 2009

EU Queries MySQL in Sun-Oracle Merger

The European Union's statement of objections expresses concerns that businesses might have fewer choices and see higher prices if Oracle (already the world's largest proprietary database vendor) ends up with MySQL by default.

In case you're getting a bit confused by all these fish eating each other, the Wikipedia entry for MySQL reminds us:
The project has made its source code available under the terms of the GNU General Public License, as well as under a variety of proprietary agreements. MySQL is owned and sponsored by a single for-profit firm, the Swedish company MySQL AB, now a subsidiary of Sun Microsystems. As of 2009 Oracle Corporation began the process of acquiring Sun Microsystems; Oracle holds the copyright to most of the MySQL codebase.
Oracle Corp. has stated that the commission's objection "reveals a profound misunderstanding of both database competition and open source dynamics," but some FOSS developers have a different take on that.

Monday, November 9, 2009

Last 2009 Guerrilla Class Next Week

Good news! You can still pile into the last Guerrilla Capacity Planning class for 2009 at the Early Bird rate. Since this class will be professionally videotaped for later distribution on the web, the more the merrier. It's also your chance to be digitally immortalized.

Entrance Larkspur Landing hotel Pleasanton California

As usual, it will be held at our lovely Larkspur Landing location. Click on the image for booking information.

Registered attendees please bring your laptops, as course materials will only be provided on CD or flash drive, this time. We will be distributing free notepads so you can also take hand-written notes. The venue also has free Wi-Fi to the internet.

Tuesday, November 3, 2009

Len Kleinrock Reflects on Booting The Inter-(ARPA)-net

Len Kleinrock (Mr. Queueing Theory) discussed his role in the innovation of packet-switching for the ARPAnet at NPR last week.
Forty years ago this week, the first information was transmitted across the ARPANET, a test message routed from UCLA to the Stanford Research Institute. Though the message sent on the evening of Wednesday, Oct. 29, 1969 was incomplete (the system crashed after the 'L' and 'O' of 'LOGIN' were transmitted to SRI), that packet-switched transmission became the basis of much of our modern era of communications. In this segment, Ira talks with internet pioneer Leonard Kleinrock about that first transmission and what networked computing has become.
Here's the podcast (mp3).

Lucasian Litotes

Well, it wasn't a woman (Gee! I'm shocked), although it could have been but the committee wimped out. And it won't be long. At 63, Mike Green is the oldest appointment yet, and if the retirement rules are applied consistently (which they haven't always been), he only gets four years in the prestigious Chair once held by such luminaries as Newton (at 26) and Dirac (at 30). Conventional wisdom has it that theoreticians are past their "sell-by" date in their late twenties, but Newton didn't write The Principia until his mid forties and Hawking is still publishing in his sixties.

Friday, October 30, 2009

Parallelism in PDQ

All so-called "analytic solvers" for queueing models, including PDQ, assume that the queueing system being modeled is in steady state. Steady state means that in the long run, the number of arrivals into a service facility, e.g., customers arriving at a grocery checkout, will be identical to the number of customers departing. Why is this important?
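
As a rough illustration of what steady state means, the following Python sketch simulates a single-server queue and counts arrivals and departures over a long run. It is not PDQ code; the exponential interarrival and service times, the rates, and the function name `simulate_queue` are all assumptions made for this example.

```python
import random


def simulate_queue(arrival_rate, service_rate, n_arrivals, seed=1):
    """Simulate a single-server FIFO queue (illustrative assumptions only).

    Returns (arrivals, departures completed by the time of the last
    arrival, elapsed time).
    """
    rng = random.Random(seed)
    clock = 0.0            # time of the current arrival
    server_free_at = 0.0   # time the server finishes its current backlog
    departure_times = []
    for _ in range(n_arrivals):
        clock += rng.expovariate(arrival_rate)      # next arrival
        start = max(clock, server_free_at)          # wait if server is busy
        server_free_at = start + rng.expovariate(service_rate)
        departure_times.append(server_free_at)
    departed = sum(1 for d in departure_times if d <= clock)
    return n_arrivals, departed, clock


# Hypothetical rates: 0.5 arrivals per second against 1.0 services per second.
arrived, departed, elapsed = simulate_queue(0.5, 1.0, 100_000)
print(f"arrivals:   {arrived}")
print(f"departures: {departed}")
print(f"departures / arrivals = {departed / arrived:.4f}")
```

When the arrival rate is below the service rate, the two counts differ only by the handful of customers still in the system at the end of the run, so their ratio approaches 1 as the run gets longer.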

Wednesday, October 28, 2009

Googling Google + Linux

Honorary Guerrilla alumnus, GB, sent me a link to "How Google uses Linux" which, although it provides an interesting view inside The Goog's datacenter management, looks like it's supposed to be available only to LWN subscribers:
"The following subscription-only content has been made available to you by an LWN subscriber." Eh? 
Not wishing to let any cats out of their subscription bag, I checked with the editor and he said it's ok to blog the link.

It's a BRisk Wind That Blows No Good

OK, I admit it. I can't resist this "I told you so" moment:
  1. BRisk Management
  2. When BRisk Goes Bust
But that doesn't mean I'm happy about it. In fact, I'd much prefer it had not happened. In case you missed what I'm talking about, a 5000 lb chunk of the San Francisco Bay Bridge (not the Golden Gate) fell onto the upper deck last night. Several drivers narrowly avoided potentially fatal injuries as pieces fell onto their vehicles. That's 2.5 tons or 2268 kg of falling steel, folks!

San Francisco Bay Bridge closure parts on deck
Click for video at KTVU Channel 2

Apparently, it was from the same cracked steel "I-bar" (a hinged strut shaped like the link in a bicycle chain) that delayed traffic flow while it was repaired less than 2 months ago. This failure was supposedly due to vibration from the high winds that prevailed yesterday, not an earthquake. No comment on the quality of the steel.

Monday, October 26, 2009

This Apple Does Fall Far From The B-Tree

That old adage: the apple doesn't fall far from the tree, doesn't apply to Apple the corporation. According to ArsTechnica today, Apple abruptly abandoned its open-source project to port Sun's ZFS as the filesystem for Mac OS X, on October 23rd.

The speculation is that Sun licensing fees may have been viewed as a roadblock to adoption or possibly there are growing concerns that Oracle's acquisition of Sun could cause other problems. In the meantime, Apple is hiring engineers to build its own advanced filesystem, instead of adopting either ZFS or its Linux derivative BtrFS.

Monday, October 19, 2009

Is SCO Waiting for Godot?

Bankruptcy Judge Kevin Gross remarked in his recent ruling that ongoing SCO Group litigation attempts were like a bad version of Samuel Beckett's play, Waiting for Godot. The almost decade-long legal saga gained publicity in the FOSS community for targeting Linux as illegally containing licensed AT&T UNIX System V source code.

Thursday, October 15, 2009

Hadoop, MAA, ML, MR and Performance Data

Over the past few months, I've been attending a series of talks on machine learning (ML), sponsored by ACM.org at the NASA Ames Research Center, with an eye to applying such things to gobs of computer performance data. Two pieces of technology that kept cropping up were Google MapReduce and Apache Hadoop.

Thursday, October 1, 2009

Final Guerrilla Class for 2009 in November

Seats are still available for the final Guerrilla Capacity Planning class of 2009 during November 16-20. All classes are held at our Larkspur Landing location. All 5 days of this class will be professionally videotaped for later online distribution. So, if you want to be digitally immortalized, better get on it.

Entrance Larkspur Landing hotel Pleasanton California
(Click on the image for details)

Who Will Succeed Hawking?

Now that he is 67 years old, it is Cambridge University policy that Stephen Hawking relinquish his title of Lucasian Professor of Mathematics and so, he resigned that post yesterday but he did not retire from Cambridge University. This event raises the question "Who will succeed him?"

Thursday, September 24, 2009

New Performance Papers from VMware

Two new performance whitepapers from VMware:
  1. VMware vSphere 4: The CPU Scheduler in VMware ESX 4
  2. Understanding Memory Resource Management in VMware ESX Server
Both apply controlled measurements ("Experimental Environment"), as described in my latest Linux Technical Review article.

Monday, September 21, 2009

Performance und Virtualisierung

Author: Neil J. Gunther
Published: September 15, 2009
Length: 13 pages

"Virtualization in the Enterprise from the Performance Management Perspective"

When you see the word virtualization, what do you think of? Hypervisors like Citrix XenServer or VMware ESX? Perhaps you thought of virtualized services like cloud computing? What about hyperthreaded multicores that facilitate virtual processors? Rather than thinking of all these forms of virtualization as being completely different from one another, this article explains why it's better to think of them as being pieces of the same performance management puzzle. The importance of doing controlled performance measurements is also emphasized.
Linux Technical Review

Saturday, September 19, 2009

Can Haz Guy: USA CIO Vivek Kundra

Columnist Chris O'Brien's interview with Vivek Kundra appears in the San Jose Mercury News under the title "How new CIO has brought innovation to government" . . . "a quiet revolution in the way the federal bureaucracy works that may change our view of government for the better."

Monday, September 14, 2009

Anti Log Plots

A sure sign that somebody doesn't know what they're doing is when they plot their data on logarithmic axes. Logarithmic plots are almost always the wrong thing to use. The motivation to use log axes often arises from misguided aesthetic considerations, rather than any attempt to enhance technical understanding, e.g., trying to "compress" a lot of data points into the available plot width on a page. The temptation to use log plots is also too easily facilitated by tools like Excel, where re-scaling the plot axes is just a button-click away.

Wednesday, September 9, 2009

This is Your Measurements on Models

When it comes to analyzing scalability data, I've stressed the importance of bringing measurements and models together. Some recent conversations with people who are just beginning to model their scalability data using the Universal Scalability Law (USL), have led me to realize that I have not made the steps behind the procedure as clear as I might have. So, let me address that here.

Sunday, August 30, 2009

Seeing Molecules: Kekulé's Dream Writ Large

As a chemist in a former life, I can't help but comment on this watershed moment in science, even though it's probably been blogged to death. Nanotechnologists at IBM Zürich have imaged the naturally occurring organic molecule pentacene (essentially, 5 benzene ring-molecules bolted together in a row). Why is this a big deal?

Saturday, August 29, 2009

Block The Emergency Exit for Faster Evacuations

NewScientist reports that Japanese physicists timed a crowd of 50 women(?) as they exited as fast as possible through a door. They then repeated the experiment with a 20 cm (8 inch) wide pillar placed 65 cm (2 feet) in front of the exit to the left-hand side. The obstacle increased the exit throughput by an extra seven people per minute.
"Usually, the exit becomes clogged by people competing for the small space, and the crowd is slowed. The pillar blocks pedestrians arriving at the exit from the left so effectively that the number of people attempting to occupy the space just in front of the exit is reduced, says Yanagisawa. With reduced crowding there are fewer conflicts and the outflow rate increases."

How does this apply to computer performance? Think polling systems, where there are multiple waiting lines or buffers but only one service facility, like a grocery store with the usual checkout lanes but only one cashier running between them! Would you want to shop in that store? In the physics experiment, the exit is the single server and the lines are the streams of people (women?) approaching the exit from all angles. The asymmetric placement of the pillar effectively reduces the number of exit streams that can form (I'm guessing).

Sunday, August 23, 2009

SPAD Quantum Camera: The Owner's Manual

For those of you following my travails in quantum information processing, our most recent work just appeared in the prestigious open-access journal Optics Express, published by the Optical Society of America, under the title: "On The Application Of A Monolithic Array For Detecting Intensity-Correlated Photons Emitted By Different Source Types." (PDF)

Saturday, August 22, 2009

Bandwidth and Latency are Related Like Overclocked Chocolates

Prior to the appearance of special relativity theory (SRT) in 1905, physicists were under the impression that space and time are completely independent aspects of reality described by Newton's equations of motion. Einstein's great insight, that led to SRT, was that space and time are intimately related through the properties of light.

Space and time are related

Instead of objects simply being located at some arbitrary position x at some arbitrary time t, everything moves on a world-line given by the space-time pair (x, ct), where c is the universal speed of light. Notice that x has the engineering dimensions of length and so does the new variable ct: a speed multiplied by time. In Einstein's picture, everything is a length; there is no separate time metric. Time is now part of what has become known as space-time—because nobody came up with a better word.

Tuesday, August 18, 2009

Response Time Knees and Queues

How do you determine where the response-time "knee" occurs? This is a question one commonly hears with reference to characterizing the performance of an application. Calculating where the response time suddenly begins to climb dramatically is considered, by many, to be an important determinant for such things as load testing, scalability analysis, and setting application service targets.

In a previous blog post, I pointed out that such a "knee" is actually an optical illusion. Nonetheless, this same question arose in last month's CMG MeasureIT, as a kind of survey entitled "Does the Knee in a Queuing Curve Exist or is it just a Myth?" Although that author concludes (correctly) that the existence of a "knee" (as it is usually meant) is bogus, the panoply of responses was quite astounding—especially coming from professionals who ought to know better. In this month's MeasureIT, I examine the same question in a rigorous but unconventional way under the title "Mind Your Knees and Queues: Responding to Hyperbole with Hyperbolæ."

Thursday, August 13, 2009

A Funny Thing Happened on the Way to the Priority Queue

Suppose two workloads $W_a$ and $W_b$ access a common resource, e.g., the CPU. They have response times $R_a$ and $R_b$, respectively. The response time $R_a$ is longer than you would like. A common way to try to improve $R_a$ is to give $W_a$ a higher priority at the CPU. That's what the Unix nice command is all about. For example, if $W_a$ were a Unix process then: nice -15 Wa would give it a higher priority than the default already assigned by the Unix scheduler. But how much better will its response time be? Nice can't tell you that.

Tuesday, August 11, 2009

Towards a Cloud Capacity-Cost Formula

One of the (unscheduled) plenary sessions at Velocity 2009 was entitled "Why elasticity, performance, and analytics will change how Webops is judged" (PDF), given by Alistair Croll. An earlier version of Alistair's ideas can be read on his blog. As I understand it, he's attempting to tie together the capacity-on-demand concept of cloud computing with the way a user is charged for resource consumption and how the provider counts revenue; a kind of dynamic capacity planning and chargeback association. Currently, for example, Amazon EC2, Google App Engine and Salesforce all do this differently. This looks like a very important point, which I would like to understand more thoroughly. By slide 3 in his presentation, he refers to a simple capacity formula and that's what I want to discuss here, because that's what suddenly caught my attention.

Monday, August 3, 2009

Starbucks Discovers Performance Analysis!

According to the WSJ, Starbucks "vice president of lean" (and apparently mean), has discovered performance analysis.

Heeellooooo! That would be The Principles of Scientific Management, developed by Frederick Winslow Taylor almost a century ago. Of course, it's uncool to be a prophet in your own land, so more notice was eventually taken in Japan than in the USA, after WW-II. Baristas will probably be less than bullish on it, but they can take heart that this genius idea by the VP of Lean is totally pre-Toyota.

Wednesday, July 29, 2009

Remembering Mr. Erlang as a Unit

Not to be confused with Frank Zappa's daughter or a Coneheads spousal unit, is the Erlang unit. The number of Erlangs (E) is defined as:

$$E = \lambda S, \qquad (1)$$

where λ is the mean arrival rate and S is the mean service time. It is well known in the teletraffic industry, where it was first used as a measure of call capacity. For this reason, it's also called the traffic intensity.
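
As a small worked example of the definition, the Python sketch below computes the traffic intensity for a made-up telephone workload. The call rate, holding time and trunk count are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
def erlangs(arrival_rate, service_time):
    """Traffic intensity E = lambda * S.

    Both arguments must use the same time unit, e.g. calls per hour
    and hours per call.
    """
    return arrival_rate * service_time


# Hypothetical workload: 120 calls per hour, each holding a line for 3 minutes.
calls_per_hour = 120.0
hours_per_call = 3.0 / 60.0

traffic = erlangs(calls_per_hour, hours_per_call)
print(f"Offered traffic: {traffic:.2f} Erlangs")

# Spread over a hypothetical group of 8 trunks, this is the per-trunk utilization.
trunks = 8
print(f"Utilization per trunk: {traffic / trunks:.2%}")
```

One way to read the result: an offered load of E Erlangs keeps E lines busy on average, which is why the same number also appears as the mean number of busy servers.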

Tuesday, July 28, 2009

Scalability in a Spreadsheet - Google Style

Speaking of spreadsheets, it's always nice when someone, who uses your ideas, takes the time to write to you about it. Case in point, Scott Roberts sent me the following email, telling me how he'd set up my USL model in Google Docs spreadsheets.

Ignite! San Jose 2009: The Afterburn

It's been over a month since I did the Ignite event in San Jose, but I have simply had so many things to do since then, that I'm only catching up on blogging about it now.

The title of my Ignite talk was "Scalability for Quantheads: How to Do It Between Spreadsheets" (a pitch for applying my Universal Scalability Law to Web 2.0 applications using Excel spreadsheets). Since scalability is about sustainable size, I used the theme of giants as a hook. Why are there no 30 ft (10 m) giants like the one in the "Jack and the Beanstalk" fairytale? Officially, there have been no human giants taller than 10 feet (3 m); and even they need leg supports. The reasons are given in Chap. 4 of my Guerrilla Capacity Planning book.

Friday, July 3, 2009

A Thomas Jefferson Enigma for July 4th

This post isn't about computer performance per se, although see end, and I certainly have better things to do with my time, but when I read this slashdot item about a 200 year old cipher, I couldn't help wondering what it would look like as a modern computer algorithm.

Monday, June 29, 2009

PDQ 5.0 Test Suite or ... How I Spent My Weekend

I was planning to blog about the amazing time I had at Velocity 2009 last week, when this landed in my mailbox (edited for space and privacy):

Subject: Seeking help with PDQ-R ...
Date: Thu, 25 Jun 2009 15:51:21 -0500

My name is James and I've been trying to learn to properly use PDQ after reading two of your books, "Guerrilla Capacity Planning" and "Analyzing Computer System Performance with Perl::PDQ." I'm still getting a grip on PDQ-R. ... I decided to set about re-creating the queue circuit in the study with PDQ-R as an exercise. ...
The output of my code yields:
[1] "Manual response time for class 1 is 0.864179 seconds"
[1] "PDQ-R response time for class 1 is 0.313637 seconds"
[1] "Manual response time for class 2 is 6.105397 seconds"
[1] "PDQ-R response time for class 2 is 3.552873 seconds"
[1] "Manual response time for class 3 is 4.535833 seconds"
[1] "PDQ-R response time for class 3 is 4.535833 seconds"
If you could give my code a look over and give me some hints I would really appreciate it.

August Guerrilla Class: Using R for Performance Analysis

Registrations are still open for the Guerrilla Data Analysis Techniques (GDAT) class being held August 10-14, 2009. The focus will be on using R and the new release of PDQ-R for performance analysis and capacity planning.

All Guerrilla classes are held at our Larkspur Landing location in Pleasanton, California; a 45-minute BART ride to downtown San Francisco.

(Click on the image for details)

For those of you coming from international locations, here is a table of currency exchange rates. We look forward to seeing all of you in August!

Tuesday, June 16, 2009

Ignite! Velocity (San Jose): I'm Stoked!

Just found this in my e-mailbox:
PLEASE READ: Ignite @ Velocity
Tuesday, June 16, 2009 1:41 PM
From.... oreilly.com

You are accepted to speak at Ignite Velocity.

Monday, June 8, 2009

Bridges, Booms, Busts, Banks, Bailouts, ... Who Needs Capacity Planning?

Given that Wall Street management has proven once again that there are black swans, this time on a global scale, why would anyone be crazy enough to contemplate capacity management in the middle of such a mess?

See what Wall Street IT managers Simple CIO and Sal Viati have to say about it in "Let the Bridge Fall—As Long as It Falls on Time" (with apologies to Galileo).

"The commonly held idea that it's cheaper to over-engineer the hardware architecture to ensure adequate capacity is patently false. Here's the simple counter-example. If performance testing is skipped in order to meet the release schedule (and who knows if that's really valid?), and the deployed application ends up running single-threaded with lousy performance, a boat-load of the cheapest servers from China won't improve that."

"The bottom line is not really new. The sagacity of looking beyond the end of your nose is a truism, but incredibly that truth has been lost in the irrational exuberance of false Wall Street economics. A robust economy and IT customer satisfaction both come from foresight, not just eyesight. In fact, it's the second word in capacity planning."

Lest you think I'm being too hard on Wall St., listen to Peter Day of the BBC interviewing Philip Delves-Broughton about his new book entitled, What They Teach You at Harvard Business School: My Two Years in the Cauldron of Capitalism. Some points to listen for:
  • MBAs are not taught to get their hands dirty with such sleazy activities as sales. That's sales as in: salesforce, sales people, the Fuller Brush Man.
  • Neither Steve Jobs nor Bill Gates has an MBA.
  • Too much devotion to spreadsheet calculations and powerpoint presentations. This is why Robert McNamara (Harvard MBA) mis-managed the Vietnam War. Too much faith in (manipulated) uncorroborated numbers.
  • Total disconnect between teaching abstract business models and the business of business which is, err... like... selling stuff.
  • These are the people running things! (Let's read that again)
  • The Economist rankings: French model beats Anglo-Saxon model (of which Wall St. is obviously a subset).
Update (Tue, Jun 9, 2009): George Soros (hedge-funder extraordinaire) estimates this black swan could be 3-5 times bigger than the one seen in 1929.
Update (Wed, Jun 10, 2009): Nobel economist, Joseph Stiglitz, slams Wall St. for tarnishing the reputation of American-style capitalism, which may pose new threats to global stability and U.S. security.
Update (Mon, Jul 20, 2009): Associate director of M.I.T. Media Lab considers institutional monocultures in his Boston Globe op-ed piece: "What can failures teach us?"

Monday, June 1, 2009

Data + Models == Insight

Al Bundy, of the TV show Married with Children, understood it and performance engineers should too. What am I talking about? The theme music for that show is the tune "Love and Marriage" as sung by Frank Sinatra. Just like the song says about love and marriage, so it is with measurements and models ... You can't have one without the other.

Sunday, May 31, 2009

Top 10 Killer Apps of All Time (so far)

Here, "killer" doesn't necessarily mean just first or just best implementation, but rather it was also considered meritorious if it made a truck-load of money.
  1. Oracle database
  2. PGP (why didn't this catch on more; especially for email?)
  3. Apache
  4. Microsoft Office
  5. Antivirus Toolkit
  6. Adobe Photoshop
  7. SNDMSG (? Me neither)
  8. Lotus 1-2-3
  9. Quark Xpress
  10. Mosaic
Judges' reasoning is presented in iTnews.

Saturday, May 23, 2009

Gunther Interview - Part II

As a consequence of winning the A.A. Michelson Award at CMG'08, I was interviewed for CMG MeasureIT e-zine. The second installment appears in this month's issue. Free access, but requires sign-up if you're not already registered.

As you can see, I still have my Golden Book of Chemistry Experiments, although it's showing signs of wear by now. It's open at the one experiment I could never get to work; making rayon. I'm now inclined to think there might be a bug in the recipe, but that never occurred to me back then. I just wanted to make it in the worst way.

Friday, May 15, 2009

WolframAlpha Performance Degradation

Surprise, surprise! After the big wind-up, it turned out that WolframAlpha wasn't really ready for prime time. In an LA Times interview today, Stephen Wolfram, the creator of the site -- five years in the making -- sheepishly explained that a large-scale traffic simulation test had failed. Oops!

Cloudy Web 2.0: So Much for the 5th Utility

A real utility, like water, gas and POTS, implies that it's always there, with only very rare and explainable exceptions. Which reminds me, did they ever figure out who hacked (as in "chopped") the major phone cables in Santa Clara County, last month?

Tuesday, May 12, 2009

Negative Scalability Coefficients in Excel

Recently, several performance engineers, who have been applying my universal scalability law (USL) to their throughput measurements, reported a problem whereby their Excel spreadsheet calculations produced a negative value for the coherency parameter (β < 0) on what otherwise appears to be an application that scales extremely well. You can download the Excel spreadsheet sscalc.xls from the Guerrilla class materials. Negative USL parameters are verboten.
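
For readers who want to check a fit outside Excel, here is a minimal Python sketch. It assumes the usual USL form C(N) = N / (1 + α(N − 1) + βN(N − 1)) and a quadratic regression through the origin on the transformed variables x = N − 1, y = N/C − 1, so that β = a and α = b − a. The throughput numbers are synthetic, generated from hypothetical parameter values, and the function names are made up for this example; none of it is taken from sscalc.xls.

```python
def usl(n, alpha, beta):
    """Relative capacity C(N) = N / (1 + alpha*(N-1) + beta*N*(N-1))."""
    return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1))


def fit_usl(loads, throughputs):
    """Fit alpha and beta by least squares on y = a*x^2 + b*x (no intercept).

    Here x = N - 1 and y = N/C(N) - 1, with C(N) = X(N)/X(1).
    Requires loads[0] == 1 so that X(1) is available for normalization.
    """
    if loads[0] != 1:
        raise ValueError("first measurement must be at N = 1")
    x1 = throughputs[0]
    xs = [n - 1 for n in loads]
    ys = [n / (x / x1) - 1.0 for n, x in zip(loads, throughputs)]
    # Sums for the 2x2 normal equations of the zero-intercept quadratic.
    s4 = sum(x ** 4 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s2 = sum(x ** 2 for x in xs)
    s2y = sum(x ** 2 * y for x, y in zip(xs, ys))
    s1y = sum(x * y for x, y in zip(xs, ys))
    det = s4 * s2 - s3 * s3
    a = (s2y * s2 - s3 * s1y) / det
    b = (s4 * s1y - s3 * s2y) / det
    return b - a, a  # (alpha, beta)


# Synthetic data from hypothetical parameters alpha = 0.05, beta = 0.001
# and a hypothetical single-user throughput of 100 transactions per second.
loads = [1, 2, 4, 8, 16, 32]
throughputs = [100.0 * usl(n, 0.05, 0.001) for n in loads]

alpha_fit, beta_fit = fit_usl(loads, throughputs)
print(f"alpha = {alpha_fit:.4f}, beta = {beta_fit:.4f}")
if alpha_fit < 0 or beta_fit < 0:
    print("Warning: negative USL parameter; the fit is not physically meaningful.")
```

With clean synthetic data the fit returns the parameters that generated it. On real measurements, noise or too few data points can push the unconstrained regression toward a negative coefficient, which is the symptom described above.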

Tuesday, May 5, 2009

Swine Flu and Conficker: Parallel Worlds Collide

Sick of hearing about "swine flu"? Good, then read this blog instead. It strikes me that there are more than the usual uncanny parallels between these tiny molecular machines, aka viruses, and the tiny digital machines, aka viruses or worms; not the least of which is people's reaction to them, viz., out of sight, out of mind.

Queues, Schedulers and the Multicore Wall

The other day, I came across a blog post entitled "Server utilization: Joel on queuing", so naturally I had to stop and take a squiz. The blogger, John D. Cook, is an applied mathematician by training and a cancer researcher by trade, so he's quite capable of understanding a little queueing theory. What he had not heard of before was a rule-of-thumb (ROT) that was quoted in a podcast (skip to 00:26:35) by NYC software developer and raconteur, Joel Spolsky. Although rather garbled, as I think any Guerrilla graduate would agree, what Spolsky says is this:
If you have a line of people (e.g., in a bank or a coffee shop) and the utilization of the people serving them gets above 80%, things start to go wrong because the lines get very long and the average amount of time people spend waiting tends to get really, really bad.

No news to us performance weenies, and the way I've sometimes heard it expressed at CMG is:
No resource should be more than 75% busy.
Personally, I don't like this kind of statement because it is very misleading. Let me explain why.
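
As background to both quotes, here is a minimal Python sketch of the textbook single-server (M/M/1) result that such rules of thumb presumably lean on, R = S/(1 − ρ). The choice of M/M/1 and the function name are assumptions for illustration only; a multiserver queue has its knee at a different utilization.

```python
def mm1_stretch(rho):
    """Residence time in units of service time, R/S = 1/(1 - rho), for an M/M/1 queue."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("utilization must satisfy 0 <= rho < 1")
    return 1.0 / (1.0 - rho)

# Tabulate how the stretch factor grows as the server gets busier.
for rho in (0.50, 0.75, 0.80, 0.90, 0.95):
    print(f"rho = {rho:.2f}  R/S = {mm1_stretch(rho):5.1f}")
```

There is no sharp threshold in this curve: the stretch factor grows smoothly with utilization, so whether 75% or 80% counts as "too busy" depends on how much delay you can tolerate.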

Thursday, April 30, 2009

GCaP Seats Still Available

Although the GBoot class on May 7-8 will now be closed, seats are still available for the Guerrilla Capacity Planning class on May 11-15, but it's getting down to the wire to book for that one too.

All Guerrilla classes are held at our Larkspur Landing location.

Entrance Larkspur Landing hotel Pleasanton California
(Click on the image for details)

For those of you coming from international locations, here is a table of currency exchange rates. We look forward to seeing all of you here!

Friday, April 24, 2009

Performance Short Course in Switzerland

On June 25 and 26 2009, I will be presenting a 2-day short course on performance analysis and capacity management at Trivadis AG in Zürich, Switzerland.


This class is especially accessible if you are located in central Europe. Since it will come hot on the heels of the TrivadisOPEN conference (June 23-24, 2009), it should also be of particular interest if you are responsible for ORACLE database performance management.

Monday, April 20, 2009

Oracle Buys Sun Microsystems (Really!?)

I just read it (7am) and ... I'm speechless.

Thinks ....
  • Larry doesn't do hardware.
  • Decimation à la PeopleSoft?
  • Oracle still runs on IBM, and HP, et al.
  • Wherefore MySQL? Just a cheap shoehorn for the Oracle RDBMS?
  • Solaris (vs. Linux, which Oracle Corp has been pushing)? Ah! SMP scalability
  • And Java? (that made sense for IBM but...) Ah ha! Larry also owns Weblogic!
  • Can't think... Need coffee ...
  • Wait! What about OpenOffice? Oh oh!
Post café noir, this EETimes article seems to hit the salient points (modulo my JVM/Weblogic/J2EE observation). Update (April 24): The Oracle @cringely weighs in on the Sunset. [ He needs to read my blog. :) But he does have the IBM memo ]

Saturday, April 18, 2009

Google's New CAPTCHA: Orient This!

The abstract in this project report (PDF) summarizes it well:
Abstract: We present a new CAPTCHA which is based on identifying an image’s upright orientation. This task requires analysis of the often complex contents of an image, a task which humans usually perform well and machines generally do not. Given a large repository of images, such as those from a web search result, we use a suite of automated orientation detectors to prune those images that can be automatically set upright easily. We then apply a social feedback mechanism to verify that the remaining images have a human-recognizable upright orientation. The main advantages of our CAPTCHA technique over the traditional text recognition techniques are that it is language-independent, does not require text-entry (e.g. for a mobile device), and employs another domain for CAPTCHA generation beyond character obfuscation. This CAPTCHA lends itself to rapid implementation and has an almost limitless supply of images. We conducted extensive experiments to measure the viability of this technique.

The performance of image retrieval is an important subject because it can be so dependent on how images are classified and keyed (e.g., human text description or automated feature extraction). The heavy-duty word is ontology. A colleague at HP Labs and I created a benchmark (called BIRDS-I) to measure the performance of content-based image retrieval (CBIR). See HP Labs Technical Report HPL-2000-162.

And, at the intersection of security and AI:
"Software that can solve any text-based CAPTCHA will be as much a milestone for artificial intelligence as it will be a problem for online security."

Intel Goes Giddy-up on GPUs

Parallel computing is not just about CPUs, anymore. Think GPUs and (perhaps more importantly) the dev tools to utilize them. From a piece by Michael Feldman at HPCwire:

"With the advent of general-purpose GPUs, the Cell BE processor, and the upcoming Larrabee chip from Intel, data parallel computing has become the hot new supermodel in HPC. And even though NVIDIA took the lead in this area with its CUDA development environment, Intel has been busy working on Ct, its own data parallel computing environment." ...

"Ct is a high-level software environment that applies data parallelism to current multicore and future manycore architectures. ... Ct allows scientists and mathematicians to construct algorithms in familiar-looking algebraic notation. Best of all, the programmer does not need to be concerned with mapping data structures and operations onto cores or vector units; Ct's high level of abstraction performs the mappings transparently. 'The two big challenges in parallel computing is getting it correct and getting it to scale, and Ct directly takes aim at both' ..."

Ct stands for C/C++ for throughput computing.

Friday, April 17, 2009

Guerrilla CaP Training in May

Seats are still available for my GCaP class, May 11-15.

Because of the pesky econo-crunch, I'm aware of quite a few people trying to do CaP on their own, directly from my book. You have my deepest admiration and I would love to think my books are that well written, but the reality is DIY is a very tough slog for such a complex subject. You wouldn't try to fly an aircraft without training.

Coming to my class is like providing you with the proverbial hot knife through butter. I say this simply on the basis of the questions students ask me in class. Each point of confusion can be addressed with a few minutes' discussion and then we move on to the next item. Of course, we all learn from your particular question. After that, the GCaP book, together with the class notes, becomes a reference source, which you can read at your own pace. This kind of personal Q&A, to get you over any conceptual hurdles, is not something that can be accomplished through (so-called) video-learning either. 'Nuff said about that ... or I'll start pushing my own buttons.

I'm also aware that people who want to attend cannot, because their employer has zipped up the corporate travel budget. That's understandable and not something I can change. However, I can come to you. Sometimes, this is a more attractive option to management. Ask them.

Barry3-Apdex Also Lives in R

As a by-product of my presentation on the Apdex Index at the NorCal CMG meeting, back in February, Guerrilla graduate, Stephen O'Connell, went off and did an implementation in R. You can read about it in this month's CMG MeasureIT and download his R-script. Free access, but requires sign-up if you're not already a member.

Gunther Interview - Part I

As a consequence of winning the A.A. Michelson Award at CMG'08, I was interviewed for CMG MeasureIT e-zine. The first installment appears in this month's issue. Free access, but requires sign-up if you're not already a member.

Thursday, April 16, 2009

I Really Don’t Know Clouds at All (video)

As the well-known cloud-architect, Joni Mitchell, said so presciently:
"I’ve looked at clouds from both sides now
From up and down, and still somehow
It’s cloud illusions I recall
I really don’t know clouds at all"
Or, as Larry Ellison put it more succinctly, "What The Hell Is Cloud Computing?"

Like all things Web 2.0, there's an overabundance of fascination with what can be done vs. how fast it can be done or how many things can be done, before the system might fail to scale.

Saturday, April 11, 2009

A Change is as Good as a Holiday

Not only is it Easter, it's spring! So, after staring at colored dots and white text on a dark background for more than 2 yrs (the default Courier font was particularly illegible), I decided to call in the painters and redo the whole place in new colors (and fonts).

Oh, and I changed the title too, for good measure (not to mention alliteration).

Thursday, April 9, 2009

Assessing USL Scalability with Mixed Business Functions

Professional capacity planner, Raja C., has been applying my Universal Scalability Law (USL) in some fascinating and progressive ways. By that I mean, fascinating to me, because I hadn't thought about applying the USL model in the same way he has; I don't have a real job, you understand. On the other hand, this may well represent the situation that many of you are faced with on a day-to-day basis, so I'd like to present and discuss Raja's question here in some detail.

Sunday, April 5, 2009

The End of Computer Performance Modeling?

I haven't had time to digest all the details, but there's been a big song and dance this week about a supercomputer program doing in a day what took scientific minds centuries to accomplish: extrapolating Newton's laws of motion from the recorded motion of a pendulum. This diagram says it all; phenomenon in, model out:

[Source: Wired magazine]

From another perspective, this is also the holy grail of computer performance analysis: convert monitored performance data directly into performance models, feed those predictions or trends back to the computer and let it tune itself (so we can all go home).

Forged in the USA

Little's law, Jackson's theorem, and JIT are concepts that we associate with computer performance analysis and capacity planning for IT systems. But the truth is, these ideas were forged while solving business and manufacturing problems. It's also true to say that both IT and manufacturing in the USA have suffered dramatically from the effects of rapid offshoring in the new global economy. In the past decade, 5 million manufacturing jobs have been eliminated in the USA.

Walmart is one of the few USA retailers that has managed to do reasonably well in this economic recession, because they (patriotically?) cut USA manufacturers loose in favor of setting up factories offshore; mostly in China. Indeed, if you look at the tags on the items in a Walmart store, you won't see Made in Anywhere USA. Instead, it's anywhere but USA. Or so it would seem.

This piece: Despite Job Loss, U.S. Manufacturing Still Leads, on NPR this morning, caught my attention when it was pointed out that the USA still ranks as the number one manufacturer, producing some 20% of all goods in the world, ahead of Japan and well ahead of China. The reason this appears to contradict simple observation (like looking at Walmart tags) is that manufacturing in the USA is two levels of indirection away. The USA manufactures things that manufacture things. In other words, USA companies make heavy equipment and machines that are used in factories and on assembly lines that make the goods we buy, and that's not stated on those Walmart tags. But that's where Little, Jackson and JIT came in.

Saturday, April 4, 2009

New Guerrilla Google Group

I've just launched a new Google Group called "Guerrilla Capacity Planning", a long overdue forum for questions, comments and ideas about computer system and data network capacity planning, system sizing and performance analysis for enterprise data centers and large-scale web sites; whether or not you apply guerrilla-style or regular CaP techniques. You do not need to be a GCaP alumnus or have purchased the book of the same name.

Come and feed the monkey!

PDQ 5.0 is on the Launch Pad

PDQ (Pretty Damn Quick) major release 5.0 is on the launch pad at Cape SourceForge. Because of a potential collision with the North Korean ICBM/satellite launch, we won't be filling the main liquid-hydrogen tank until next week (we don't want PDQ blamed for starting WW3). Of course, if you hijack the capsule ahead of time, we can't stop you, but be aware that you might not make it to full earth orbit. :-)

The mission for PDQ 5.0 includes exploration of:
  • Multiserver queues as defined in Ch. 6 of the Perl::PDQ book
  • Integration with an installed R package
PDQ-R examples have been blogged previously. Please stand by for the countdown ... Update: PDQ 5.0 was launched successfully on Thursday, April 9, 2009.

Thursday, April 2, 2009

Modern Microprocessor MIPS

The question of how modern microprocessors compare with mainframe processors of yore arises from time to time. The vernacular rate metric that has persisted for a long time (long in the history of computers, that is) is MIPS. Whether you approve of MIPS as a valid performance metric or not is a different (philosophical) question. Since the mainframe has not gone away---it's just another server on the network today---even mainframers still talk about MIPS ratings. Nonetheless, it is true that the meaning of "instructions" does vary significantly across architectures, so one does have to exercise caution when making inter-architectural comparisons and not endow any conclusions with more credibility than they deserve.

Browser Wars: IE and Opera in decline

Latest stats from ars technica:

  • IE ........... 66.82%
  • FF ........... 22.05%
  • Safari ........ 8.23%
  • Chrome ..... 1.23%
  • Opera ........ 0.07%

These numbers (quoted to 4 sigdigs) don't sum to 100%:
> sum(66.82,22.05,8.23,1.23,0.07)
[1] 98.4
according to R. Caveat lector!

Tuesday, March 31, 2009

Learning to Live with Virtualization

Comment at ArsTechnica:
"One of the big things that I've learned and that's been recently reinforced for me, is that you need a whole mindset change about how you build servers, how you evaluate what goes into your standard builds, how you monitor...

Think, for example, about a daemon that wakes up once an hour to check on something (it doesn't matter what...). Once an hour's not much, right? Except you have 100 guests, so that's once every 36 seconds, on average. Still sound like a lightweight process? Still need to be in your standard build? Likewise a daemon that eats 100 MB of RAM, or blindly installing packages because you "might need to use gcc some time". Yeah, maybe you might. Is it worth the storage you just ate?"

Couldn't agree more; especially the part about different monitoring requirements for VMMs.
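
The arithmetic in that comment is easy to check. A small Python sketch follows; the function names are hypothetical, and both the even spread of wakeups and the absence of any memory sharing between guests are simplifying assumptions. Applying the 100-guest count to the 100 MB daemon is also an extrapolation for illustration, not something the commenter states.

```python
def mean_wakeup_gap(period_s, guests):
    """Average seconds between wakeups on the host, assuming evenly spread timers."""
    return period_s / guests

def host_ram_mb(per_guest_mb, guests):
    """Total RAM used when every guest runs the same daemon (no page sharing assumed)."""
    return per_guest_mb * guests

print(mean_wakeup_gap(3600, 100))   # hourly daemon replicated across 100 guests
print(host_ram_mb(100, 100))        # 100 MB daemon replicated across 100 guests
```
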

Monday, March 30, 2009

Conficker Worm Defense for Enterprises

A blog-post over at ZDNet describes how enterprise data-ops can use network scanners to detect and disarm the Conficker Worm vulnerability on Microsoft Windows platforms, before it potentially wakes up on April 1st and begins to disable your anti-virus software.

One wonders whether the ZDNet writer was suffering from subliminal puritanism. The name of the worm is Conficker, not ConfLicker (sic). Presumably, the name is derived from the conjunction of Con, as in confidence trick (to enable it) and ficken, the German F-bomb (when it disables you).

Confession: Aye, 'twas I who corrected the ZDNet scribe. :-)

See the US-CERT Technical Cyber Security Alert TA09-088A, for more information.

Sunday, March 29, 2009

More Guerrilla Boot Camp Classes in 2009

Additional opportunities to attend the popular Guerrilla Boot Camp class have now been scheduled at the Larkspur Hotel. Book early, book often. All classes have a Certification Level 1, 2, 3. GBoot is a level-1 introductory class. There are no prerequisites at the moment but you are advised to take them in order; especially levels 2 and 3.
This year, more than ever before, is all about doing more with less. There are no shrink-wrap solutions, so your company needs you to do it for them. And for that, you need to attend this class. As Wall Street has so amply demonstrated, being penny wise but pound foolish, simply doesn't work.
Overseas attendees are welcome. Check out your exchange rate against the Yankee greenback. We hope to see you there!

Tuesday, March 24, 2009

Slacker DBs in the Cloud Base

In my view, another reason Larry Ellison diss'd Cloud Computing last year (even though he promoted "Thin Clients" a decade ago, but completely overlooked the necessary infrastructure to support it: aka the cloud), is that he's afraid of how it might negatively impact sales of the ORACLE RDBMS. Why? Most of the world's data is not in relational form, and never will be. More importantly, Google knows this. (Think MapReduce)

One of the first people to see this coming was relational database academic, Joe Hellerstein at UCB. In his 2001 talk entitled "We Lose" (PDF slides), slide 5 contains the gist of his prescient observations:

  • Grassroots use Filesystems, not DBs
  • Grassroots use App servers, not ORDBs
  • Grassroots write Java, PERL, Python, PHP, ... NOT SQL!

He defines Grassroots as: "Hackers. But also DBMS engineers, Berkeley grads, Physicists, etc."

Now, somewhere in between are Slacker Databases: "Amazon SimpleDB, Apache CouchDB, Google App Engine, and Persevere, offering far greater simplicity than SQL, may have a better way of storing data for Web apps." Hellerstein was more right than he could've known.

Nine-Day NetBooks

I claim the "NetBook" (whatever the hell that really is) will turn out to be a 9-day wonder: less utility than a laptop, too big to put in your pocket. End of story. :-)

Monday, March 23, 2009

Sprint Looks Beyond Cellphones

Very interesting piece in the WSJ about Sprint's strategy to offset its inability to sign up enough new cellphone subscribers. The main idea is to sell wi-fi capacity to manufacturers of gadgets, e.g., GPS devices, automobile dashboard computers, etc.

A largely unknown factoid, reported in this piece, is that Sprint handles wi-fi book downloads for the Kindle reader at Amazon.com.

From another perspective, it's interesting to note that the same capacity planning paradigms (e.g., queueing theory, scheduling algorithms, game theory) can be applied to both data networks and manufacturing systems. The umbrella term is operations research.

Streaming Hadoop Data Into R Scripts

Along the lines of Mongo Measurement Requires Mongo Management, the HadoopStreaming package on CRAN provides utilities for applying R scripts to Hadoop streaming.

Hadoop has been deployed on Amazon's EC2. See our more recent ACM article, "Hadoop Superlinear Scalability: The Perpetual Motion of Parallel Performance" for a more detailed discussion about scalability issues.

Higgs Slapping Starts Early

As I said in my A.A. Michelson Award acceptance speech, the search for the Higgs boson could turn out to be the 21st century null-experiment that supersedes the 19th century Michelson-Morley search for the aether. The big difference is in the amount of data that will be generated by the LHC, viz., 15 PB per year.

Since finding the Higgs in all those data will be like searching for the proverbial "needle," the pressure is on to justify the investment in the European machine (LHC-CMS for $10B) at CERN and the lack of investment by the U.S. Congress in the Texas Supercollider (SSC for $12B); much less than a bank bailout today. The proxy for the SSC is the aging machine at Fermilab. Because of the pressure to see something, I fully expect a lot of false positives to be reported and that will inevitably degenerate into arguments over confidence intervals for the data; just the kind of thing we discuss in the GDAT class next August.

However, I didn't expect things to really heat up until the LHC comes back online in the summer, after repairs to the collapsed superconducting magnets. In the meantime, however, the global economy has also collapsed and Fermilab is hurting for funds. So, while the LHC is down for the count, the Fermilab Dzero experiment is looking for the Higgs and getting in the news by setting some bounds on the energy ranges where the Higgs might live. Without getting into too much detail, the above diagram shows that the plausible range for the Higgs mass (mH) is 114 GeV < mH < 185 GeV (according to Fermilab). For reference, your analog TV set produces electrons that hit the screen with an energy of about 30 keV. Mass and energy are directly related by Einstein's famous equation E = mc², where c is the speed of light in vacuo.

This opportunistic move has set off a slapfest between some physicists at Fermilab and CERN. If it's this ugly now, I don't know where it's going to go when those gaps close down to zero; apart from the obvious escape route that it's much heavier than 250 GeV.

Sunday, March 22, 2009

Twitts of the World, Unite!

The great thing about email is, you can ignore it. One of the things I can't stand about Skype and IM is that, by design, they are very intrusive (or can be), to the point where I can't think straight. I'm slow, so I need a lot of uninterrupted time to think. Thus, I've held the same opinion, a fortiori, about Twitter. The very name has been its own aversion, for me.

Friday, March 20, 2009

Gmail 5 Second Retrieve Reprieve

I've been waiting for someone to come up with this (cache it first) implementation for email. I'm not sure 5 seconds is long enough.

Thursday, March 19, 2009

IBM Might Swallow the Sun

"Shares of Sun Microsystems, which makes the Java software that runs many Internet applications, were up 78.9 percent after reports that it was in talks to be acquired by I.B.M. Shares of Sun ended at $8.89. I.B.M. was down 1 percent, to $91.95."
I heard this rumor at the Portland CMG meeting yesterday. Apparently, Sun has been quietly "looking for a date" for some time. Presumably, IBM's main interest is in Java IP. Will Solaris replace AIX (under the covers)?

I had a long-standing theory that Sun Microsystems would be bought by Fujitsu Corp to simply milk Solaris service contracts for the next 10 years. It's not interesting innovation, but it is a business. Sun has always managed to have enough cash in the bank to be able to forestall such a move, but now, they're out of gas.

Update: Why an IBM purchase of Sun would make sense (cnet)

Wednesday, March 11, 2009

Treemap Visualization of Disk Volumes

GrandPerspective is a FOSS tool for Mac OS X that provides a treemap visualization of file layout on a disk. I created the treemap below from an 80 GB disk on my G4 towermac, which has both Mac OS X files (left) and WinXP files (right), the latter being a copy from the disk of my recently deceased Sony laptop. It certainly gives new meaning to the term disk blocks.

It's quite striking to see the greater number of larger aggregations of files on the Mac side vs. the many smaller files on the XP side. I guess that's why we don't need to do "defragging" on Macs. :-)

Monday, March 9, 2009

i-Screen, u-Screen, Vee All Screen for Which Screen?

When I first came to the USA, it quickly became apparent that there was no such thing as ice cream. You had to specify what flavor, what combination of flavors, what kind of cone, what you wanted on top of it, and so on. This is all enshrined in the song I scream, You scream, We all scream for Ice Cream. Coming from England, I was not used to dealing with such a wide spectrum of choices for such a simple thing as ice cream. And England had the worst ice cream I had ever tasted, made from hydrogenated vegetable oils; margarine, basically. But it only took a few "experiments" to catch on to the more complex American approach.

Monday, March 2, 2009

Michelson Comes Home to California

At the CMG conference in Las Vegas last December, I was presented with the A.A. Michelson Award. It actually consists of 2 pieces: a framed citation, which you can see (and hear President Cathy Nolan reading) in the video of the ceremony, and a wooden plaque with lots of brass bits on it; including a ruler for performance measurement. :-)

Boot Camp Training - Window is Closing

Since we're now inside the 1-month window for reservations at the Larkspur Hotel, they will start releasing the discounted guest rooms previously held for the Guerrilla Boot Camp class. Seats are still available, so if you're planning to come on March 26th, you need to get on it.

All classes have a Certification Level 1, 2, 3. GBoot is a level-1 introductory class. There are no prerequisites at the moment but you are advised to take them in order; especially levels 2 and 3.
This year, more than ever before, is all about doing more with less. There are no shrink-wrap solutions, so your company needs you to do it for them. And for that, you need to attend this class. As Wall Street has so amply demonstrated, being penny wise but pound foolish, simply doesn't work.
Overseas attendees are welcome. Check out your exchange rate against the Yankee greenback. We hope to see you there!

Friday, February 27, 2009

Plotting PDQ Output with R

One of the nice things about PDQ-R (coming in release 5.0) is the ability to plot PDQ output directly in R. Here's a PDQ-R script, together with the corresponding graphical output, that I knocked up to show the effect on the throughput curve of adding more queueing delay stages (K), with everything else held constant.
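
The script and its plot are not reproduced in this archive listing. As a stand-in, here is a rough Python sketch of the same kind of experiment. It is neither PDQ nor the original PDQ-R script: it uses textbook exact mean value analysis for a closed circuit of K identical queueing stages, and the function name, the service demand and the zero think time are all placeholders chosen for illustration.

```python
def closed_throughput(n_users, k_stages, demand=1.0, think=0.0):
    """Exact MVA throughput X(N) for K identical queueing centers in a closed circuit."""
    queue = [0.0] * k_stages              # mean queue length at each stage
    x = 0.0
    for n in range(1, n_users + 1):
        # Arrival theorem: an arriving request sees the queues left by n-1 users.
        resid = [demand * (1.0 + q) for q in queue]
        x = n / (think + sum(resid))      # system throughput with n users
        queue = [x * r for r in resid]    # Little's law applied per stage
    return x

# Throughput curves for a growing number of stages, everything else held constant.
for k in (1, 2, 4):
    curve = [round(closed_throughput(n, k), 3) for n in (1, 2, 5, 10, 20)]
    print(f"K={k}: {curve}")
```

Under these assumptions, adding stages lowers the throughput at any given load, and the curve approaches the same 1/demand ceiling more slowly.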

Sunday, February 22, 2009

PDQ-R Lives!

After some fiddling to get things linked correctly to the R binaries on my new Macbook, the first PDQ-R test model has run successfully! Here 'tiz ...

This is an important step for PDQ development and is due entirely to the efforts of Phil Feller. Naturally, this capability will be included in the next PDQ release from SourceForge, which we are currently working towards. Stay tuned!

Tuesday, February 17, 2009

Guerrilla Boot Camp Training

Registration for my local Guerrilla Boot Camp classes is still open. The first checkpoint is coming up on Feb 26th. That's when we need to notify the hotel whether or not we're going ahead. So if you're thinking of coming, be sure to enroll soon and don't leave it until the last minute.

Overseas attendees are welcome. Check out your exchange rate against the Yankee greenback.

In a related item: Survey says, "Online Instruction is Less Effective Than Classroom Learning."

Apdex Index Examined

This month's edition of the CMG MeasureIT open-access journal has 2 articles on the Apdex Performance Index:
  1. "The Apdex Index Revealed", by yours truly
  2. "The Apdex Index vs Traditional Management Information Decision Tools", by Jim Brady
Jim's article compares the Apdex Index with other well-established management decision techniques; especially those based on statistical methods. As someone with a background in Operations Research, Jim is well placed to make these assessments. It's worth noting that Jim's paper arose out of a PARS discussion he and I had at CMG'08 in Las Vegas. PARS stands for Performance Analysts' Relaxation Session (a play on mainframe LPARS) but most CMG-ers just think of it as a "free" food and booze session. ;-) I was relating to Jim that I had attended a couple of CMG presentations which tried to explain the deeper significance of the Apdex Index definitions and there were a few things that really bothered me. I thought he might be able to explain it to me in terms of mathematical statistics. We didn't resolve anything that night (what can you expect with all that booze around?), so I left it with him as homework. His paper is the outcome. You'll have to read my article to find out what was bothering me. :-)

Friday, February 6, 2009

Dr. Dobb's is Dying

My colleague Jim Holtman just informed me that, according to embedded.com, the illustrious software developer magazine Dr. Dobb's Journal will now be embedded (pun intended) in InformationWeek. Both are owned by United Business Media LLC.

Jim and I have been following the series of articles by Herb Sutter on multicore concurrency, starting with the one entitled "Break Amdahl", where he discovers Gustafson's law (see Section 4.3.5 of my GCaP book, Springer 2007). As expected, that title looks slightly optimistic in light of the recently observed lack of scalability on Sandia Labs supercomputers. Of course, the universal law of scalability accounts for all these effects and, unlike Sutter's articles, we are able to quantify them based on actual measurements.
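
For anyone who wants the two laws side by side, here is a minimal Python sketch using their standard textbook forms. The function names and the 5% serial fraction are illustrative assumptions, and note that the two serial fractions are not measured on the same basis: Amdahl's refers to a fixed-size workload, Gustafson's to the run time of a workload scaled up with the processor count.

```python
def amdahl_speedup(n, serial):
    """Fixed-size speedup: S(N) = N / (1 + serial*(N - 1)), bounded above by 1/serial."""
    return n / (1.0 + serial * (n - 1))

def gustafson_speedup(n, serial):
    """Scaled speedup: S(N) = N - serial*(N - 1), which grows linearly with N."""
    return n - serial * (n - 1)

serial = 0.05  # hypothetical serial fraction
for n in (1, 16, 256, 1024):
    print(f"N={n:5d}  Amdahl={amdahl_speedup(n, serial):7.2f}  "
          f"Gustafson={gustafson_speedup(n, serial):8.2f}")
```

Neither form includes a coherency term, so neither can produce the kind of retrograde throughput reported for the Sandia machines.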

Anyway, as with print newspapers, I suppose all this means that even though Dr. Dobb's is not exactly dead yet, it is getting buried alive.

Sunday, February 1, 2009

NorCal CMG Meeting Location

For those of you who haven't attended before, the Feb 3rd (Tues) meeting of the Northern California CMG will be held in Suite 100 of the Compuware building in Pleasanton, California. Here's the Google map. Three talks will be presented:
  • 9:30--10:30 Mongo Measurement Requires Mongo Capacity Management, Neil Gunther, Performance Dynamics Company
  • 10:45--11:45 Wasted MIPS, Wanton MIPS: a MIPS Recovery Initiative, Tom Halinski, Compuware Corporation
  • 1:00--2:00 The Apdex Index Revealed, Neil Gunther, Performance Dynamics Company

Breakfast starts at 8:30 am and registration is $25 at the door.

Poor Scalability on Multicore Supercomputers

Guerrilla grad Paul P. sent me another gem in which Sandia scientists discover that more core processors don't produce more parallelism on their supercomputer applications:
"16 multicores perform barely as well as 2 for complex applications."

Friday, January 30, 2009

PDQ From SourceForge

Thoroughly fed up doing mind-numbing company income taxes for 2008 (Yes, I have to do them earlier than most to get my 1099s out to sub-contractors), I decided to take a break and see if I could compile PDQ (Pretty Damn Quick) by downloading it from our SourceForge project onto my new Macbook.

Tuesday, January 27, 2009

When BRisk Goes Bust

In a nice counterpoint to my previous post on BRisk Management, the latest director of Europe's new atom-smasher (the LHC, which I'll be talking about next week), says he will be more cautious than his predecessor, following the very public failure last September.

An ultra-cold, superconducting magnet on the big ring collapsed only days after the $10 billion LHC was switched on and will require more than $26 million in repairs. With that price tag (even though it's only half what Wall St. paid itself in bonuses last year---for assessing BRisk rather than risk?), the pressure is on for these guys to produce. So, it's not too surprising that the new director said:
"The LHC will be double checked by outside experts before any attempt is made to switch the machine back on, probably in July. I want to be sure that everything works, so I'll also let an external group make additional checks on the accelerator."
Let's compare and contrast with the Bay Bridge situation, shall we? According to the SF Chron:
"...the inspection outfit that sounded the alarm has since been replaced."
In other words, the independent, external inspectors, for the Bay Bridge welds, were terminated for being too pernickety, which would inflate the CalTrans schedule.

Nothing like a good failure to reduce the risk in BRisk. I just hope I'm not on the Bay Bridge when it happens.

Update: To catch up to its 2010 milestones, it has since been decided that the LHC will remain running, continuously, for one year after it's rebooted this summer.

BRisk Management

In my upcoming Guerrilla Boot Camp class, I have a whole bit on Risk Management vs. Risk Perception. The point being that if you, as the performance analyst/capacity planner in your organization, don't appreciate the perspective of your manager, you are going to find yourself very frustrated when certain of your recommendations seem to fall on deaf ears.

Most managers are employed to look after one thing: schedules. If a manager perceives that your performance recommendation could inflate the schedule, it ain't gonna happen (no matter how sane or realistic it might be). I reinforce this perspective by saying:
Managers will let a project fail. As long as it fails on time!
This may sound a bit melodramatic but here is a statement of precisely that type:
"I can understand people being worked up about safety and quality with the welds," said Steve Heminger, executive director ... "But we're concerned about being on schedule because we are racing against the next earthquake."
This is a quote from an executive manager for the new Bay Bridge currently being constructed between Oakland and San Francisco. A section of the upper deck collapsed on this bridge during the Loma Prieta earthquake of 1989.

He's not an IT manager, but he is watching the clock and saying, let's increase the risk that the new bridge will fail (by being brisk about welding inspections), in order to beat the much lower risk that the old bridge might fail again in a quake. Substitute your favorite project, product or application, for the word "bridge" and you get my drift.

Update of May 2013

The original high-risk Caltrans decision has prompted Gov. Jerry Brown to threaten delaying the scheduled Labor Day opening of the new Bay Bridge span. Erm... so, how did this brisk management decision save time (and money)?

Update of August 16, 2013

The on again, off again, new Bay Bridge opening is on again. As you can probably tell from this KALW piece, there is some skepticism regarding the rationale. [emphasis mine]
"the cracked bolts in the new bridge are apparently better than the totally unsafe old bridge, which wouldn't survive a minor earthquake. ... Experts say the old bridge is extremely unsafe, and won't hold through even a moderate earthquake."
Rubbish! Nothing has really changed significantly on the old bridge structure. This is all about saving political face (and possibly the $20 million contractor bonus). Would I drive the new span? Possibly. But more likely, I'd take BART (via the trans-Bay tube). :)