The Pith of Performance: PostgreSQL

Tuesday, November 6, 2012

Hotsos 2013: Superlinear Scalability

As readers of this blog know, the Universal Scalability Law (USL) is a framework for quantifying performance measurements and extrapolating load-test data. Applied as a statistical regression model, the two USL contention (α) and coherency (β) parameters numerically indicate the degree of sublinear scalability in the data, i.e., how much linear scaling you're losing due to sharing and consistency overheads. Some examples of USL scalability analysis applied to databases, include:

VoltDB
Postgres
My USL analysis of the Postgres data

More recently, it was brought to my attention that the USL fails when it comes to modeling superlinear performance (e.g., see this Comments section). Superlinear scalability means you get more throughput than the available capacity would be expected to support. It's even discussed on the Wikipedia (so it must be true, right?). Nice stuff, if you can get it. But it also smacks of an effect like perpetual motion.

Every so often, you see a news report about someone discovering (again) how to beat the law of conservation of energy. They will swear up and down that it works and it will be accompanied by a contraption that proves it works. Seeing is believing, after all. The hard part is not whether to believe their claim, it's debugging their contraption to find the mistake that has led them to the wrong conclusion.

Similarly with superlinearity. Some data are just plain spurious. In other cases, however, certain superlinear measurements do appear to be correct, in that they are repeatable and not easily explained away. In that case, it was assumed that the USL needed to be corrected to accommodate superlinearity by introducing a third modeling parameter. This is bad news for many reasons, but primarily because it would weaken the universality of the universal scalability law.

To my great surprise, however, I eventually discovered that the USL can accommodate superlinear data without any modification to the equation. As an unexpected benefit, the USL also warns you that you're modeling an unphysical effect: like a perpetual-motion detector. A corollary of this new analysis is the existence of a payback penalty for incurring superlinear scalability. You can think of this as a mathematical statement of the old adage: If it looks too good to be true, it probably is.

I'll demonstrate this remarkable result with examples in my Hotsos presentation.

Wednesday, April 11, 2012

PostgreSQL Scalability Analysis Deconstructed

In 2010, I presented my universal scalability law (USL) at the SURGE conference. I came away with the impression that nobody really understood what I was talking about (quantifying scalability) or, maybe DevOps types thought it was all too hard (math). Since then, however, I've come to find out that people like Baron Schwartz did get it and have since applied the USL to database scalability analysis. Apparently, things have continued to propagate to the point where others have heard about the USL from Baron and are now using it too.

Robert Haas is one of those people and he has applied the USL to Postgres scalability analysis. This is all good news. However, there are plenty of traps for new players and Robert has walked in several of them to the point where, by his own admission, he became confused about what conclusions could be drawn from his USL results. In fact, he analyzed three cases:

PostgreSQL 9.1
PostgreSQL 9.2 with fast locking
PostgreSQL 9.2 current release

I know nothing about Postgres but thankfully, Robert tabulated on his blog the performance data he used and that allows me to deconstruct what he did with the USL. Here, I am only going to review the first of these cases: PostgreSQL 9.1 scalability. I intend to return to the claimed superlinear effects in another blog post.