Possibly pithy insights into computer performance analysis and capacity planning based on the Guerrilla series of books and training classes provided by Performance Dynamics Company.
Operationally, PDQ, in any of the supported languages, should appear cosmetically the same as Release 5.0; no additional programming required. Since the PDQ-R source can be compiled separately, this release will be of special interest to Microsoft Windows users.
If you're new to PDQ, here's a simple PDQ-R model you can paste directly into the R-console:
library(pdq)
# input parameters
arrivalRate <- 0.75
serviceRate <- 1.0
## Build and solve the PDQ model
Init("Single queue model") # initialize PDQ
CreateOpen("Work", arrivalRate) # open workflow
CreateNode("Server", CEN, FCFS) # single server
SetDemand("Server", "Work", 1/serviceRate) # service time
Solve(CANON) # solve the model
Report() # tabulated output
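For these inputs, the exact M/M/1 results are easy to hand-check against the PDQ report using the textbook formulas (computed here outside PDQ):

rho <- arrivalRate / serviceRate # server utilization = 0.75
R <- (1 / serviceRate) / (1 - rho) # mean residence time = 4.0
N <- arrivalRate * R # Little's law: N = X * R = 3.0

The Report() output should agree with these values.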
Please see the online release notes for the download links and more detailed information, as well as the top-level README file in the distribution. Beyond that, check out the relevant books and training classes.
Merry Xmas from the PDQ Dev Team!
This is a guest post by Paul Puglia, who contributed significant development effort to the PDQ 6.0 release, especially as it relates to interfacing with R. Here, Paul provides more details about the motivation described in my earlier announcement.
PDQ was designed and implemented around a couple of basic assumptions. First, the library would be a C-language API running on some variant of the Unix operating system, where we could reasonably assume we'd be able to link it against a standard C library. Second, programs built using this API would be "stand-alone" executables in the sense that they'd run in their own dedicated memory address spaces, route their I/O through the standard streams (stdout or stderr), and have complete control over how error conditions would be handled.
Not surprisingly, the above assumptions drove a set of implementation decisions for the library, namely:
- all output, including the PDQ report, is written directly to the standard streams using C library calls such as printf();
- error conditions are handled inside the library itself, typically by printing a diagnostic and then calling exit() to terminate the running program.
With the arrival of PDQ 2.0, we introduced foreign interfaces to other programming environments (Perl, Python and R) that allowed PDQ to be called from those environments. All of these foreign interfaces were built and released using the SWIG interface-building tool, which allowed us to build them with absolutely no modification to the underlying PDQ code—a major benefit when you've got a mature, debugged API that you really want to remain that way. For the most part this arrangement worked pretty well, at least for those environments where it was natural to write and execute PDQ models like standalone C programs (read: Perl and Python).
When it came to R, however, our early implementation decisions weren't such a great fit for how R is commonly used, which is as an interactive environment, similar to programs like Mathematica, Maple, and Matlab. Like those environments, R users do most of their interaction through a REPL (Read-Eval-Print Loop), usually wrapped in either a full-fledged GUI or a terminal-like interface called the console.
It turns out that most of PDQ's implementation decisions could (and do) interfere with using R interactively. In particular:
- calling exit() on an error doesn't just abandon the model; it terminates the entire R session, taking the user's workspace down with it;
- writing directly to the C standard streams bypasses R's console and connection mechanisms, so PDQ output can't be captured or redirected with the usual R tools.
Not only do these behaviors severely degrade the interactive experience for R users, they also get flagged by R's extension-building mechanism when it runs its consistency checks. And not passing those checks would be a major impediment to getting PDQ's R interface accepted on CRAN (the Comprehensive R Archive Network).
Luckily, none of the fixes for these issues are particularly hard to implement. Most are either fairly simple substitutions of R API calls for C library routines and/or localized changes to the PDQ library. And, while all of this does create some risk of introducing bugs into the PDQ library, the reward for taking that risk is a stable R interface that can eventually be submitted to CRAN. A version of the PDQ library can also be easily built under Windows™ using the Rtools utilities.
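One practical consequence of those substitutions, assuming error conditions are now signaled through R's condition system instead of via exit(), is that a modeling mistake no longer kills the whole R session. Here's a hypothetical sketch (the deliberate mistake is mine, not from the PDQ test suite):

library(pdq)
result <- tryCatch({
    Init("Error demo")
    SetDemand("NoSuchNode", "Work", 0.5) # error: this node was never created
    Solve(CANON)
}, error = function(e) {
    message("PDQ reported: ", conditionMessage(e)) # the session survives
    NULL
})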
R version 2.15.2 (2012-10-26) -- "Trick or Treat"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: i386-apple-darwin9.8.0/i386 (32-bit)
> library(pdq)
> source("/Users/njg/PDQ/Test Suites/R-Test/mm1.r")
***************************************
****** Pretty Damn Quick REPORT *******
***************************************
*** of : Thu Nov 8 17:42:48 2012 ***
*** for: M/M/1 Test ***
*** Ver: PDQ Analyzer 6.0b 041112 ***
***************************************
***************************************
...
The main trick is that the Perl and Python versions of PDQ remain entirely unchanged, while at the same time invisibly incorporating significant changes to accommodate R.
More recently, it was brought to my attention that the USL fails when it comes to modeling superlinear performance (e.g., see this Comments section). Superlinear scalability means you get more throughput than the available capacity would be expected to support. It's even discussed on the Wikipedia (so it must be true, right?). Nice stuff, if you can get it. But it also smacks of an effect like perpetual motion.
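For reference, since the equation itself hasn't appeared yet: the USL expresses relative capacity at load N in terms of a contention parameter σ and a coherency parameter κ,

\begin{equation} C(N) = \frac{N}{1 + \sigma (N - 1) + \kappa N (N - 1)} \end{equation}

When superlinear data are fitted, they show up as a negative contention coefficient (σ < 0), and that sign is what makes the perpetual-motion warning discussed below possible.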
Every so often, you see a news report about someone discovering (again) how to beat the law of conservation of energy. They will swear up and down that it works and it will be accompanied by a contraption that proves it works. Seeing is believing, after all. The hard part is not whether to believe their claim, it's debugging their contraption to find the mistake that has led them to the wrong conclusion.
Similarly with superlinearity. Some data are just plain spurious. In other cases, however, certain superlinear measurements do appear to be correct, in that they are repeatable and not easily explained away. Faced with that, it was assumed that the USL needed to be corrected to accommodate superlinearity by introducing a third modeling parameter. This is bad news for many reasons, but primarily because it would weaken the universality of the universal scalability law.
To my great surprise, however, I eventually discovered that the USL can accommodate superlinear data without any modification to the equation. As an unexpected benefit, the USL also warns you that you're modeling an unphysical effect: like a perpetual-motion detector. A corollary of this new analysis is the existence of a payback penalty for incurring superlinear scalability. You can think of this as a mathematical statement of the old adage: If it looks too good to be true, it probably is.
I'll demonstrate this remarkable result with examples in my Hotsos presentation.
The small irony here is that I refer to the London Tube map in my GCaP classes as a paradigm for performance models:
2.4 More Like The Map Than The Metro

When I was writing the GCaP book, I asked the London tube authority for permission to use their classically dense map. In an incredible bout of British bureaucratic officiousness, they offered the tube map at £100 per impression—an offer my publisher was only too keen to refuse. Hence, the map you see on p. 9 of the GCaP book is that of BART, an authority who were pleased to see it further advertised by a taxpayer, as long as it was sourced—an offer my publisher was only too happy to abide by.
Here's some of what went down:
As a performance analyst or capacity planner, you already know all about Little's law—it's elementary. Right? Therefore, you completely understand:
If you're feeling slightly bewildered about all this, you really should come along to my talk (assuming you're in the area). Otherwise, you can read the slide deck embedded below.
I'll show you how I discovered the resolution to the apparent contradiction between items 7 and 8 (above) by representing Little's law in 3-dimensions. It's very cool! Even John Little doesn't know about this.
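As background for that resolution (without giving away the 3-dimensional punchline), recall that Little's law actually comes in more than one guise, relating different pairs of metrics,

\begin{equation} Q = X R, \qquad U = X S \end{equation}

where Q is the mean queue length, X the throughput, R the residence time, U the utilization and S the service time. Losing track of which version is in play is exactly how apparent contradictions tend to creep in.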
Oh yeah, and I'll also explain how Little's law reveals why it's possible to make your application IOs go 10x to 100x faster. IOPS bandwidth has become irrelevant.
Some of these conclusions are based on recent work I've been doing for Fusion-io. You might've heard of their billion IOPS benchmark and, more recently, of their association with the SSDAlloc software from Princeton University.
The word bottleneck refers to a choke point or narrowing, literally like the neck of a bottle, that causes the flow to take longer than it would otherwise. The effect on performance is commonly seen on the freeway in an area undergoing roadwork. Multiple lanes of traffic are forced to converge into a single lane and proceed past the roadwork in single file. Going from parallel traffic flow to serial flow means the same number of cars will take longer to get through that same section of road. As we all know, the delay at a freeway bottleneck can be very significant.
The same is true on a single-lane country road. If you come to a section where roadwork slows down every car, it takes longer to traverse that section of the road. Bottlenecks are synonymous with slowdowns and delays, but they really determine a lot more than delay.
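To make "more than delay" concrete: in queueing terms, the bottleneck also caps the throughput of the entire system. Here's a minimal sketch in R, using the standard bottleneck bound X ≤ 1/Dmax with made-up service demands:

# Service demands (seconds per car) at three sections of road;
# the roadwork section has the largest demand, so it is the bottleneck.
D <- c(open_road = 0.5, roadwork = 2.0, exit_ramp = 0.25)
Dmax <- max(D)
Xmax <- 1 / Dmax # no load level can push throughput past 0.5 cars/sec
Xmax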
The Turing Test (TT) was introduced as "the imitation game" in Computing Machinery and Intelligence, Mind, Vol. 59, No. 236, pp. 433-460 (1950):
The new form of the problem can be described in terms of a game which we call the "imitation game." It is played with three people, a man (A), a woman (B), and an interrogator (C) who may be of either sex. The interrogator stays in a room apart from the other two. The object of the game for the interrogator is to determine which of the other two is the man and which is the woman. He knows them by labels X and Y, and at the end of the game he says either "X is A and Y is B" or "X is B and Y is A." The interrogator is allowed to put questions to A and B.
And, last but not least, queues and computer performance remain an inevitable perennial, most recently having to do with the Internet.
My fundamental point is this. When it comes to load testing*, presumably the idea is to exercise the system under test (SUT). Otherwise, why are you doing it? Part of exercising the SUT is to produce significant fluctuations in the number of requests residing in application buffers. Those fluctuations can be induced by the pattern of arriving requests issued by the client-side driver (DVR), usually implemented as a pile of PCs or blades.
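To see why the injection pattern matters, compare uniformly paced requests with Poisson-like requests: their variability differs dramatically, and it's that variability which drives buffer fluctuations in the SUT. A minimal sketch in R (my own illustrative parameters, not tied to any particular load-test harness):

set.seed(42)
n <- 10000
constant <- rep(1.0, n) # uniformly paced inter-request gaps: zero variance
poisson <- rexp(n, rate = 1.0) # exponential gaps: coefficient of variation = 1
c(cov_constant = sd(constant) / mean(constant),
  cov_poisson = sd(poisson) / mean(poisson))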
Robert Haas is one of those people and he has applied the USL to Postgres scalability analysis. This is all good news. However, there are plenty of traps for new players and Robert has walked into several of them, to the point where, by his own admission, he became confused about what conclusions could be drawn from his USL results. In fact, he analyzed three cases:
Of course, it doesn't stop there. The most important part of making an educated guess is testing its validity. That's called hypothesis testing, in scientific circles. To paraphrase the well-known Russian proverb, in contradistinction to BAAG: Guess, but justify*. Because hypothesis testing is a difficult process, it can easily get subverted into reaching the wrong conclusion. Therefore, it is extremely important not to set booby traps inadvertently along the way. One of the most common visual booby traps arises from the inappropriate use of logarithmically scaled axes (hereafter, log axes) when plotting data.
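Here's a quick way to convince yourself of the danger, using nothing but invented data: a saturating throughput curve that is obviously flattening on linear axes can look deceptively well behaved on log-log axes.

N <- 1:64 # load levels (invented for illustration)
X <- N / (1 + 0.05 * (N - 1)) # a saturating, Amdahl-like throughput curve
par(mfrow = c(1, 2))
plot(N, X, type = "b", main = "Linear axes") # the saturation is obvious
plot(N, X, type = "b", log = "xy", main = "Log-log axes") # looks deceptively benign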
Before registering, take a look at some highlights students contributed from previous Guerrilla classes:
You too can be part of that educational experience. Attendees should bring their laptops, as course materials are provided on CD or flash drive. The venue also offers free wi-fi internet access.
The technical paper, The Bleak Future of NAND Flash Memory (PDF), was presented and published at the FAST'12 conference held in San Jose, CA on February 14–17, 2012.
Related post: Green Disk Sizing
The blog post Capacity Planning on a Cocktail Napkin is a really good example of a really bad explanation. There are so many things that are misleading, at best, and flat-out wrong, at worst, that it's hard to know where to begin (or where to stop). Nevertheless, I'll try to keep it brief [I failed in that endeavor. — njg]. The author applies the equation:
\begin{equation} E = \lambda \end{equation}
Why? What is that equation? We don't know because the author does not say yet what all the symbols mean. It's all very well to impress people by slinging equations around, but it's more important to say what the equations actually mean. After all, the author might have chosen the wrong one.
Time Bandits: How to Analyze Fractal Query Times
Tues, March 6, 2012 @ 2:15 pm
That's the title of my presentation at this year's Hotsos Symposium and no, I won't be trying to make any obscure connections between Terry Gilliam's famous movie and Oracle database products (as interesting as that exercise might be).
Instead, I'll be talking about fractals in time and how they can impact performance—especially Oracle database performance. The responsiveness of your Oracle application can be lost for longer than expected periods of time, ostensibly stolen by time bandits.
Preview Slides (2012). A more detailed explanation of the fractal technique used is now provided in the Guerrilla Data Analytics (GDAT) class: How to Get Beyond Monitoring from Linear Regression to Machine Learning.
\begin{equation} P \propto N_{\rm platters} \, \Omega^{2.8} \, D^{4.6} \tag{1} \end{equation}

where N_platters is the number of platters on the spindle, Ω is the rotational speed in revolutions per minute (RPM) and D is the platter diameter in inches. The power consumed, P, is then measured in Watts.
In principle, this makes (1) valuable for doing green HDD storage capacity planning. The bad news is, it is not in the form of an equation but a statement of proportionality, so it can't be used to calculate anything as it stands. More on that shortly. The good news is that all of the quantities in (1) can be read off from the data sheet of the respective disk vendor†. Note that the disk capacity, e.g., GB (the usual capacity planning metric) does not appear in (1).
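One standard way to use a proportionality is to take the ratio between two configurations, so that the unknown constant of proportionality cancels. A sketch in R, assuming the exponents shown in (1) and using invented spec-sheet numbers:

# Relative spindle power from (1): P ~ Nplatters * Omega^2.8 * D^4.6
power_index <- function(platters, rpm, diam_in) {
    platters * rpm^2.8 * diam_in^4.6
}
# e.g., a 4-platter 15K RPM 3.5-inch drive vs a 2-platter 5400 RPM 2.5-inch drive
power_index(4, 15000, 3.5) / power_index(2, 5400, 2.5)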
The outstanding question is: where do those funny non-integral exponents come from?
Not only is the answer yes (it's a throughput-delay plot, or XR plot in my notation), but that particular plot comes from my GCaP course notes. There, I use it to analyze the comparative performance of a functional multiprocessor (NS6000) and a symmetric multiprocessor (SC2000). Note how the two curves cross at around 1500 OPS. You can ask yourself why, and if you can't come up with an explanation, you should be registering for a Guerrilla class. :)
The above XR plot also serves as a useful reminder that the throughput and response-time metrics are not only dependent on one another, but they are generally dependent in a nonlinear way—despite what some experts may claim:
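For a concrete instance of that nonlinearity, take the M/M/1 queue from the PDQ example earlier: residence time grows hyperbolically, not linearly, as throughput approaches saturation. A minimal sketch in R:

S <- 1.0 # mean service time
X <- seq(0.05, 0.95, by = 0.05) # throughputs below saturation (X < 1/S)
R <- S / (1 - X * S) # M/M/1 residence time: nonlinear in X
plot(X, R, type = "b", xlab = "Throughput (X)", ylab = "Residence time (R)")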