Wednesday, July 4, 2012

Characterizing Performance Bottlenecks

If you do a Google search using keywords like performance, bottleneck, and analysis, you get a bewildering list of results, and none of them seems to define clearly what is meant by the term bottleneck.

The word bottleneck refers to a choke point or narrowing, literally like the neck of a bottle, that causes the flow to take longer than it would otherwise. The effect on performance is commonly seen on the freeway in an area undergoing roadwork. Multiple lanes of traffic are forced to converge into a single lane and proceed past the roadwork in single file. Going from parallel traffic flow to serial flow means the same number of cars will take longer to get through that same section of road. As we all know, the delay at a freeway bottleneck can be very significant.

The same is true on a single-lane country road. If you come to a section where roadwork slows down every car, it takes longer to traverse that section of the road. Bottlenecks are synonymous with slowdowns and delays, but they actually determine a lot more than delay.
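To make that concrete in queueing terms: the bottleneck is the resource with the largest service demand, and that resource alone caps the maximum possible system throughput, no matter how fast every other resource is. Here's a minimal sketch of that calculation, with made-up service demands:

```python
# Hypothetical service demands (seconds of work per request) at each resource.
service_demands = {"cpu": 0.010, "disk": 0.030, "network": 0.005}

# The bottleneck is the resource with the largest service demand.
bottleneck, d_max = max(service_demands.items(), key=lambda kv: kv[1])

# No matter how many requests are in flight, system throughput
# cannot exceed 1/D_max (requests per second).
x_max = 1.0 / d_max
print(f"Bottleneck: {bottleneck}, max throughput: {x_max:.1f} req/s")
```

Here, the disk caps throughput at about 33 requests per second, even though the CPU and network could individually sustain far more.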

Saturday, June 23, 2012

Is the Turing Test Tough Enough?

In the recent GDAT class, we covered machine learning (ML) applied to performance data analysis, and particularly the application of so-called support vector machines. In that section of the course I first have to explain what the word "machine" means in the context of ML. These days the term refers to software algorithms, but it has its roots in the development of AI and the history of trying to build machines that can think. That notion of intelligent machines goes back more than sixty years to Alan Turing, who was born a hundred years ago today.
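Before getting to Turing, for readers who haven't seen one, here's a toy sketch (not from the course materials) of what such a "machine" amounts to in practice: an SVM classifier fit to synthetic performance samples with scikit-learn. The data, labels, and thresholds are all made up for illustration.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# Synthetic performance samples: columns are (CPU utilization, response time).
normal = rng.normal(loc=[0.4, 0.2], scale=[0.10, 0.05], size=(100, 2))
overload = rng.normal(loc=[0.9, 1.5], scale=[0.05, 0.30], size=(100, 2))

X = np.vstack([normal, overload])
y = np.array([0] * 100 + [1] * 100)  # 0 = normal, 1 = overloaded

# The "machine" is just a software algorithm: an SVM classifier.
clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict([[0.85, 1.2]]))    # expect class 1 (overloaded)
```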

The Turing Test (TT) was introduced as "the imitation game" in "Computing Machinery and Intelligence," Mind, Vol. 59, No. 236, pp. 433–460 (1950):

The new form of the problem can be described in terms of a game which we call the "imitation game." It is played with three people, a man (A), a woman (B), and an interrogator (C) who may be of either sex. The interrogator stays in a room apart from the other two. The object of the game for the interrogator is to determine which of the other two is the man and which is the woman. He knows them by labels X and Y, and at the end of the game he says either "X is A and Y is B" or "X is B and Y is A." The interrogator is allowed to put questions to A and B.

Wednesday, May 23, 2012

Queues in the News

You might not have noticed, but queues have been in the news a lot lately. Not from the standpoint of computer performance, but people performance or, more accurately, crowd control. Most recently, queues popped up in the context of the long delays expected at Heathrow airport due to the large crowds arriving for the London Olympics.

In a queue, at least everyone is pointing in the same logical direction. Moreover, if you snake the queue, as they do at Disneyland, people always feel close to the destination and can see it getting even closer as customers ahead of them are processed. That helps to minimize their level of frustration: the control part. For some reason, the post office hasn't figured this out yet.
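There's also a well-known performance payoff, not just a psychological one: a single snaked queue feeding several servers produces shorter average waits than letting each customer pick a line blindly. Here's a toy simulation (with made-up arrival and service rates, nothing to do with Heathrow's actual numbers) that illustrates the effect:

```python
import random

random.seed(1)

def simulate(n_customers, n_servers, single_queue, lam=1.8, mu=1.0):
    """Mean wait for Poisson arrivals at total rate lam, served by
    n_servers of rate mu each, either sharing one snaked queue or
    splitting between separate lines at random."""
    t = 0.0
    free = [0.0] * n_servers          # time each server next becomes free
    waits = []
    for _ in range(n_customers):
        t += random.expovariate(lam)  # next arrival time
        if single_queue:
            # Snaked queue: head of line takes the first server to free up.
            k = min(range(n_servers), key=lambda i: free[i])
        else:
            # Separate lines: pick a line blindly, post-office style.
            k = random.randrange(n_servers)
        start = max(t, free[k])
        waits.append(start - t)
        free[k] = start + random.expovariate(mu)
    return sum(waits) / len(waits)

print("single snaked queue:", simulate(100_000, 2, True))
print("separate lines     :", simulate(100_000, 2, False))
```

At 90% utilization the separate lines roughly double the mean wait, because one line can sit idle while the other backs up.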

And, last but not least, queues remain a perennial topic in computer performance, most recently in connection with the Internet.

Monday, May 14, 2012

Load Testing with Uniform vs. Exponential Arrivals

In a couple of recent blog posts about generating exponential loads, and why that is important for load and performance testing, it was apparently not clear to some readers what motivated my remarks. In this post, I will try to provide a more visual elaboration of that point.

My fundamental point is this. When it comes to load testing*, presumably the idea is to exercise the system under test (SUT). Otherwise, why are you doing it? Part of exercising the SUT is to produce significant fluctuations in the number of requests residing in application buffers. Those fluctuations can be induced by the pattern of arriving requests issued by the client-side driver (DVR): usually implemented as a pile of PCs or blades.
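To see why the arrival pattern matters, here's a toy single-server simulation (with assumed rates, not actual load-test data) comparing the fluctuations in the number of resident requests under uniform versus exponential inter-arrival times having the same mean:

```python
import random
import statistics

random.seed(7)

MEAN_GAP = 1.0   # mean inter-arrival time (same for both patterns)
SERVICE = 0.9    # fixed service time => ~90% server utilization
N = 100_000

def sim(draw_gap):
    """Single-server FIFO queue; returns (std dev, max) of the number of
    requests in the system (buffer + server), sampled at each arrival."""
    t, free_at = 0.0, 0.0
    departures = []   # scheduled departure times of requests still present
    lengths = []
    for _ in range(N):
        t += draw_gap()
        departures = [d for d in departures if d > t]  # drop finished work
        lengths.append(len(departures))
        free_at = max(free_at, t) + SERVICE            # FIFO recursion
        departures.append(free_at)
    return statistics.pstdev(lengths), max(lengths)

def uniform_gap():
    return random.uniform(0.0, 2.0 * MEAN_GAP)     # mean = MEAN_GAP

def exponential_gap():
    return random.expovariate(1.0 / MEAN_GAP)      # mean = MEAN_GAP

print("uniform gaps     (std, max):", sim(uniform_gap))
print("exponential gaps (std, max):", sim(exponential_gap))
```

With uniform gaps the buffer occupancy barely wiggles; with exponential gaps, at the same average load, it swings over an order of magnitude wider. Those swings are exactly the stress you want the SUT to experience.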

Wednesday, April 11, 2012

PostgreSQL Scalability Analysis Deconstructed

In 2010, I presented my universal scalability law (USL) at the SURGE conference. I came away with the impression that nobody really understood what I was talking about (quantifying scalability) or, maybe DevOps types thought it was all too hard (math). Since then, however, I've come to find out that people like Baron Schwartz did get it and have since applied the USL to database scalability analysis. Apparently, things have continued to propagate to the point where others have heard about the USL from Baron and are now using it too.

Robert Haas is one of those people, and he has applied the USL to Postgres scalability analysis. This is all good news. However, there are plenty of traps for new players, and Robert has walked into several of them, to the point where, by his own admission, he became confused about what conclusions could be drawn from his USL results. In fact, he analyzed three cases:

  1. PostgreSQL 9.1
  2. PostgreSQL 9.2 with fast locking
  3. PostgreSQL 9.2 current release

I know nothing about Postgres internals but, thankfully, Robert tabulated on his blog the performance data he used, and that allows me to deconstruct what he did with the USL. Here, I am only going to review the first of these cases: PostgreSQL 9.1 scalability. I intend to return to the claimed superlinear effects in another blog post.
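For reference, the USL expresses relative capacity as $C(N) = N / (1 + \alpha (N - 1) + \beta N (N - 1))$, where $\alpha$ is the contention parameter and $\beta$ is the coherency parameter. Here's a minimal sketch of how such a fit can be done with scipy; the throughput numbers are made up for illustration and are not Robert's data:

```python
import numpy as np
from scipy.optimize import curve_fit

def usl(N, alpha, beta):
    """Universal Scalability Law: relative capacity at concurrency N."""
    return N / (1 + alpha * (N - 1) + beta * N * (N - 1))

# Made-up example data: (client count, throughput). NOT Robert's numbers.
N = np.array([1, 2, 4, 8, 16, 32])
X = np.array([955, 1878, 3548, 6531, 9897, 10654])

# The USL is fit to capacity relative to single-client throughput X(1).
C = X / X[0]

(alpha, beta), _ = curve_fit(usl, N, C, p0=[0.01, 0.0001],
                             bounds=(0, [1, 1]))
print(f"alpha (contention) = {alpha:.4f}")
print(f"beta  (coherency)  = {beta:.6f}")

# Concurrency at which throughput peaks: N* = sqrt((1 - alpha) / beta).
print("N* =", np.sqrt((1 - alpha) / beta))
```

One of the traps mentioned above lurks right here: fitting raw throughput instead of normalized capacity, or letting the parameters go negative, produces numbers that look plausible but have no physical interpretation.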

Sunday, April 1, 2012

Sex, Lies and Log Plots

From time to time, at the Hotsos conferences on Oracle performance, I've heard the phrase "battle against any guess" (BAAG) used in presentations. It captures a good idea: eliminate guesswork from your decision-making process. Although that's certainly a laudable goal, life is sometimes not so simple, particularly when it comes to performance analysis. Sometimes you really can't determine unequivocally what is going on. Inevitably, you are left with nothing but making a guess: preferably an educated guess, not a random guess (the type BAAG wants to eliminate). As I say in one of my Guerrilla mantras: even wrong expectations (or a guess) are better than no expectations. In more scientific terms, such an educated guess is called a hypothesis, and it's a major way of making scientific progress.

Of course, it doesn't stop there. The most important part of making an educated guess is testing its validity. That's called hypothesis testing in scientific circles. To paraphrase the well-known Russian proverb, in contradistinction to BAAG: Guess, but justify*. Hypothesis testing is a difficult process, and it can easily be subverted into reaching the wrong conclusion. Therefore, it is extremely important not to set booby traps inadvertently along the way. One of the most common visual booby traps arises from the inappropriate use of logarithmically scaled axes (hereafter, log axes) when plotting data.

Linear scale:
Each major interval has a common difference $(d)$, e.g., $200, 400, 600, 800, 1000$ if $d = 200$.

Log scale:
Each major interval has a common multiple or base $(b)$, e.g., $0.1, 1, 10, 100, 1000$ if $b = 10$.

The general property of a log axis is to stretch out the low end of the axis and compress the high end, which is why the minor interval spacings are unequal. Hence, using a log-scaled axis (either $x$ or $y$) is equivalent to applying a nonlinear transformation to the data. In other words, you should be aware that introducing a log axis distorts the visual representation of the data, which can lead to entirely wrong conclusions.
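To see the distortion for yourself, here's a small matplotlib sketch (with made-up response-time data) that plots exactly the same numbers on a linear and a log $y$-axis:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical response-time data growing exponentially with load.
load = np.arange(1, 51)
rt = 0.05 * np.exp(load / 12.0)   # made-up numbers for illustration

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 3.5))

ax1.plot(load, rt, "o-")
ax1.set_title("Linear y-axis: growth looks explosive")

ax2.plot(load, rt, "o-")
ax2.set_yscale("log")             # nonlinear transformation of the view
ax2.set_title("Log y-axis: same data looks tamely linear")

for ax in (ax1, ax2):
    ax.set_xlabel("Load (users)")
    ax.set_ylabel("Response time (s)")

plt.tight_layout()
plt.show()
```

The right-hand panel makes exponentially degrading response times look like a gentle straight line, which is precisely the kind of booby trap that leads to wrong conclusions.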

Wednesday, March 21, 2012

How to Generate Exponential Delays

This question arose while addressing comments on a previous blog post about exponentially distributed delays. One of my ongoing complaints is that many, if not most, popular load-test generation tools do not provide exponential variates as part of a library of time delays or think-time distributions. Not only is this situation bizarre, given that all load tests are actually performance models (and who doesn't love an exponential distribution in their performance models?), but without the exponential distribution you are less likely to observe such things as buffer overflow conditions due to queueing fluctuations larger than those produced by normal or uniform delays. Exponential delays are both simple and useful for that purpose, but we are often left to roll our own code and then debug it.
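The standard trick is inverse-transform sampling: if $U$ is uniform on $(0, 1]$, then $-Z \ln(U)$ is exponentially distributed with mean $Z$. Here's a minimal sketch of rolling your own in Python:

```python
import math
import random

def exp_delay(mean_delay):
    """Inverse-transform sampling: if U ~ Uniform(0, 1], then
    -mean_delay * ln(U) is exponentially distributed with the given mean."""
    u = 1.0 - random.random()   # in (0, 1], avoids log(0)
    return -mean_delay * math.log(u)

# Quick sanity check: the sample mean should converge to mean_delay.
random.seed(0)
delays = [exp_delay(0.5) for _ in range(100_000)]
print("sample mean:", sum(delays) / len(delays))   # approximately 0.5
```

If your tool embeds a real scripting language, you may not even need the hand-rolled version; Python's standard library, for instance, provides the same variate as random.expovariate(1.0 / mean_delay).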