The Pith of Performance

Wednesday, April 11, 2012

PostgreSQL Scalability Analysis Deconstructed

In 2010, I presented my universal scalability law (USL) at the SURGE conference. I came away with the impression that nobody really understood what I was talking about (quantifying scalability) or, maybe DevOps types thought it was all too hard (math). Since then, however, I've come to find out that people like Baron Schwartz did get it and have since applied the USL to database scalability analysis. Apparently, things have continued to propagate to the point where others have heard about the USL from Baron and are now using it too.

Robert Haas is one of those people and he has applied the USL to Postgres scalability analysis. This is all good news. However, there are plenty of traps for new players and Robert has walked in several of them to the point where, by his own admission, he became confused about what conclusions could be drawn from his USL results. In fact, he analyzed three cases:

PostgreSQL 9.1
PostgreSQL 9.2 with fast locking
PostgreSQL 9.2 current release

I know nothing about Postgres but thankfully, Robert tabulated on his blog the performance data he used and that allows me to deconstruct what he did with the USL. Here, I am only going to review the first of these cases: PostgreSQL 9.1 scalability. I intend to return to the claimed superlinear effects in another blog post.

Sex, Lies and Log Plots

From time to time, at the Hotsos conferences on Oracle performance, I've heard the phrase, "battle against any guess" (BAAG) used in presentations. It captures a good idea: eliminate guesswork from your decision making process. Although that's certainly a laudable goal, life is sometimes not so simple; particularly when it comes to performance analysis. Sometimes, you really can't seem to determine unequivocally what is going on. Inevitably, you are left with nothing but making a guess—preferably an educated guess, not a random guess (the type BAAG wants to eliminate). As I say in one of my Guerrilla mantras: even wrong expectations (or a guess) are better than no expectations. In more scientific terms, such an educated guess is called a hypothesis and it's a major way of making scientific progress.

Of course, it doesn't stop there. The most important part of making an educated guess is testing its validity. That's called hypothesis testing, in scientific circles. To paraphrase the well-known Russian proverb, in contradistinction to BAAG: Guess, but justify^*. Because all hypothesis testing is a difficult process, it can easily get subverted into reaching the wrong conclusion. Therefore, it is extremely important not to set booby traps inadvertently along the way. One of the most common visual booby trap arises from the inappropriate use of logarithmically-scaled axes (hereafter, log axes) when plotting data.

Linear scale:: Each major interval has a common difference $(d)$, e.g., $200, 400, 600, 800, 1000$ if $d=200$:
Log scale:: Each major interval has a common multiple or base $(b)$, e.g., $0.1, 1, 10, 100, 1000$ if $b=10$:

The general property of a log axis is to stretch out the low end of the axis and compress the high end. Notice the unequal minor interval spacings. Hence, using a log scaled axis (either $x$ or $y$) is equivalent to applying a nonlinear transformation to the data. In other words, you should be aware that introducing a log axis will distort the visual representation of the data^†, which can lead to entirely wrong conclusions.

How to Generate Exponential Delays

This question arose while addressing Comments on a previous blog post about exponentially distributed delays. One of my ongoing complaints is that many, if not most, popular load-test generation tools do not provide exponential variates as part of a library of time delays or think-time distributions. Not only is this situation bizarre, given that all load tests are actually performance models (and who doesn't love an exponential distribution in their performance models?), but without the exponential distribution you are less likely to observe such things as buffer overflow conditons due to larger than normal (or uniform) queueing fluctuations. Exponential delays are both simple and useful for that purpose, but we are often left to roll our own code and then debug it.

Plan for Capacity Planning Training in May

Bookings are open for both Guerrilla Boot Camp (GBoot) and Guerrilla Capacity Planning (GCaP) classes in May 2012 at the Early Bird rate.

Entrance Larkspur Landing hotel Pleasanton California

As usual, classes will be held at our lovely Larkspur Landing location. Click on the image for booking information.

Before registering, take a look at some highlights students contributed from previous Guerrilla classes:

You too can be part of that educational experience.

Attendees should bring their laptops, as course materials are provided on CD or flash drive. The venue also offers free wi-fi to the internet.

Wednesday, March 7, 2012

The SSD World Will End in 2024

So says the Non-Volatile Systems Lab at UC San Diego. The claim is, in order to achieve higher densities, flash manufacturers must sacrifice both read and write latencies. I haven't had time to explore this claim in any detail, but I thought it might be useful for you to know about it. Some highlights include:

They tested 45 different NAND flash chips from six vendors that ranged in size from 72 nm circuitry to the current 25nm technology.
They then took their test results and extrapolated them to the year 2024, when NAND flash development road maps show flash circuitry is expected to be only 6.5 nm in size. At that point, read/write latency is expected to increase by a factor of two or more.
They did not use specialized NAND flash controllers such as those used by Intel, OCZ or Fusion-io. Their results can be viewed as "optimistic" because they didn't include latency added through error correction or garbage collection algorithms.
Considering the diminishing returns on performance versus capacity, Grupp said, "it's not going to be viable to go past 6.5 nm ... 2024 is the end."

The technical paper entitled, The Bleak Future of NAND Flash Memory (PDF), was presented and published at the FAST'12 conference held in San Jose, CA on February 14—17, 2012.

Related post: Green Disk Sizing

Friday, February 24, 2012

On the Accuracy of Exponentials and Expositions

The following is a slightly edited version of my response to a Discussion on the Linkedin CPPE group, which is accessible to Members Only. It's written in the style of a journal reviewer. The original Discussion topic was based on a link to a blog-post. I've been asked to make my Linkedin review more widely available so, here tiz...

The blog-post Capacity Planning on a Cocktail Napkin is a really good example of a really bad explanation. There are so many things that are misleading, at best, and flat-out wrong, at worst, it's hard to know where to begin (or where to stop). Nevertheless, I'll try to keep it brief [I failed in that endeavor. — njg].
The author applies the equation:
\begin{equation} E = λ \end{equation}
Why? What is that equation? We don't know because the author does not say yet what all the symbols mean. It's all very well to impress people by slinging equations around, but it's more important to say what the equations actually mean. After all, the author might have chosen the wrong one.

Hotsos Symposium 2012

Time Bandits: How to Analyze Fractal Query Times

Tues, March 6, 2012 @ 2:15 pm

That's the title of my presentation at this year's Hotsos Symposium and no, I won't be trying to make any obscure connections between Terry Gilliam's famous movie and Oracle database products (as interesting as that exercise might be).

Instead, I'll be talking about fractals in time and how they can impact performance—especially Oracle database performance. The responsiveness of your Oracle application can be lost for longer than expected periods of time, ostensibly stolen by time bandits.

Preview Slides (2012). A more detailed explanation of the fractal technique used is now provided in the Guerrilla Data Analytics (GDAT) class: How to Get Beyond Monitoring from Linear Regression to Machine Learning.