Showing posts with label response time.

Thursday, July 28, 2016

Erlang Redux Resolved! (This time for real)

As I show in my Perl::PDQ book, the residence time at an M/M/1 queue is trivial to derive and, unlike the treatment in most queueing theory texts, does not require any probability theory arguments. Great for Guerrillas! However, by simply adding another server (i.e., M/M/2), that same Guerrilla approach falls apart. This situation has always bothered me profoundly, and on several occasions I thought I saw how to get to the exact formula (the Erlang C formula) Guerrilla style. But, on later review, I always found something wrong.
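For reference, here are the two results in question. With service time $S$ and per-server utilization $\rho$, the M/M/1 residence time is simply \begin{equation} R_1 = \dfrac{S}{1 - \rho}, \end{equation} whereas the exact M/M/m residence time drags in the Erlang C function $C(m,a)$, with offered load $a = m\rho$: \begin{equation} R_m = S + \dfrac{C(m,a)\, S}{m\,(1 - \rho)}, \qquad C(m,a) = \dfrac{\dfrac{a^m}{m!\,(1-\rho)}}{\displaystyle\sum_{k=0}^{m-1} \dfrac{a^k}{k!} + \dfrac{a^m}{m!\,(1-\rho)}}. \end{equation} It is the derivation of $C(m,a)$, with all its probability machinery, that resists the simple Guerrilla treatment.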

Although I've certainly had correct pieces of the puzzle, at various times, I could never get everything to fit in a completely consistent way. No matter how creative I got, I always found a fly in the ointment. The best I had been able to come up with is what I call the "morphing model" approximation where you start out with $m$ parallel queues at low loads and it morphs into a single $m$-times faster M/M/1 queue at high loads.
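In compact form, that morphing approximation can be written (in the same notation as above) as \begin{equation} R_m \simeq \dfrac{S}{1 - \rho^m}, \end{equation} which tends to $S$ at low loads, i.e., $m$ independent servers with no waiting, and to $S/[\,m(1-\rho)\,]$ as $\rho \to 1$, i.e., a single $m$-times faster M/M/1 server.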

That model is also exact for $m = 2$ servers—which is some kind of progress, but not much. Consequently, despite a few misplaced enthusiastic announcements in the past, I've never been able to publish the fully corrected morphing model.

Wednesday, July 29, 2015

Hockey Elbow and Other Response Time Injuries

You've heard of tennis elbow. Well, there's a non-sports performance injury that I like to call hockey elbow. An example of such an "injury" is shown in Figure 1, which appeared in a recent computer performance analysis presentation. It's a reminder of how easy it is to become complacent when doing performance analysis and end up reaching the wrong conclusion.


Figure 1. Injured response time performance

Figure 1 is seriously flawed for two reasons:

  1. It incorrectly shows the response time curve with a vertical asymptote.
  2. It compounds the first error by employing a logarithmic x-axis.
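To see how much damage the logarithmic axis alone can do, here is a minimal R sketch (using the M/M/1 curve and a hypothetical unit service time, purely for illustration) that plots the very same smooth hyperbolic curve on a linear and on a logarithmic utilization axis. On the log axis the curve looks flat over most of the range and then appears to snap upward into an "elbow", even though nothing abrupt happens in the underlying function.

    # Same M/M/1 response-time curve, two x-axis treatments
    S   <- 1                           # hypothetical service time (1 time unit)
    rho <- seq(0.01, 0.99, by = 0.01)  # server utilization
    R   <- S / (1 - rho)               # M/M/1 residence time

    par(mfrow = c(1, 2))
    plot(rho, R, type = "l", xlab = "Utilization", ylab = "Response time R",
         main = "Linear axis: smooth hyperbola")
    plot(rho, R, type = "l", log = "x", xlab = "Utilization (log scale)",
         ylab = "Response time R", main = "Log axis: illusory elbow")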

Wednesday, December 25, 2013

Response Time Percentiles for Multi-server Applications

In a previous post, I applied my rules-of-thumb for response time (RT) percentiles (or, more accurately, residence time in queueing theory parlance), viz., the 80th percentile $R_{80}$, 90th percentile $R_{90}$, and 95th percentile $R_{95}$, to a cellphone application and found that the performance measurements were not completely consistent. Since the relevant data only appeared in a journal blog, I didn't have enough information to resolve the discrepancy, which is OK. The first job of the performance analyst is to flag performance anomalies; resolving them is usually best left to others. After all, I didn't build the system or collect the measurements.

More importantly, that analysis was for a single server application (viz., time-to-first-fix latency). At the end of my post, I hinted at adding percentiles to PDQ for multi-server applications. Here, I present the corresponding rules-of-thumb for the more ubiquitous multi-server or multi-core case.

Single-server Percentiles

First, let's summarize the Guerrilla rules-of-thumb for single-server percentiles (M/M/1 in queueing parlance): \begin{align} R_{1,80} &\simeq \dfrac{5}{3} \, R_{1} \label{eqn:mm1r80}\\ R_{1,90} &\simeq \dfrac{7}{3} \, R_{1}\\ R_{1,95} &\simeq \dfrac{9}{3} \, R_{1} \label{eqn:mm1r95} \end{align} where $R_{1}$ is the statistical mean of the measured or calculated RT and $\simeq$ denotes approximately equal. A useful mnemonic device is to notice the numerical pattern for the fractions. All denominators are 3 and the numerators are successive odd numbers starting with 5.
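Those fractions are not arbitrary. For M/M/1 the residence time is exponentially distributed about its mean $R_1$, so the exact $p$-th percentile is \begin{equation} R_{1,p} = R_1 \ln\!\left(\dfrac{100}{100 - p}\right), \end{equation} which gives $R_{1,80} = R_1 \ln 5 \approx 1.61\, R_1$, $R_{1,90} = R_1 \ln 10 \approx 2.30\, R_1$ and $R_{1,95} = R_1 \ln 20 \approx 3.00\, R_1$. The fractions $5/3$, $7/3$ and $9/3$ are just convenient, slightly conservative, round numbers for those exact values.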

Monday, April 22, 2013

Adding Percentiles to PDQ

Pretty Damn Quick (PDQ) performs a mean value analysis of queueing network models: mean values in; mean values out. By mean, I mean statistical mean or average. Mean input values include such queueing metrics as service times and arrival rates. These could be sample means. Mean output values include such queueing metrics as waiting time and queue length. These are computed means based on a known distribution; I'll say more shortly about exactly which distribution. Sometimes you might also want to report measures of dispersion about those mean values, e.g., the 90th or 95th percentiles.

Percentile Rules of Thumb

In The Practical Performance Analyst (1998, 2000) and Analyzing Computer System Performance with Perl::PDQ (2011), I offer the following Guerrilla rules of thumb for percentiles, based on a mean residence time R:
  • 80th percentile: p80 ≃ 5R/3
  • 90th percentile: p90 ≃ 7R/3
  • 95th percentile: p95 ≃ 9R/3

I could also add the 50th percentile or median: p50 ≃ 2R/3, which I hadn't thought of until I was putting this blog post together.
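As a quick sanity check on those fractions, here is a small R sketch of the whole "mean values in, percentiles out" idea using plain M/M/1 arithmetic rather than PDQ itself. The arrival rate and service time are hypothetical values chosen purely for illustration; the exact percentiles come from the exponential residence-time distribution.

    # Mean values in: hypothetical arrival rate and service time
    lambda <- 0.8            # arrivals per second
    S      <- 1.0            # mean service time in seconds
    rho    <- lambda * S     # server utilization
    R      <- S / (1 - rho)  # mean residence time out (M/M/1)

    # Rules of thumb vs. exact percentiles of the exponential residence time
    rot   <- c(p50 = 2, p80 = 5, p90 = 7, p95 = 9) * R / 3
    exact <- qexp(c(0.50, 0.80, 0.90, 0.95), rate = 1 / R)
    round(rbind(rule_of_thumb = rot, exact = exact), 2)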

Wednesday, August 3, 2011

Q-Q Plots and Power Laws in Database Performance Data

I'm in the process of putting together some slides on how to apply Quantile-Quantile plots to performance data. Q-Q plots are a handy tool for visually inspecting how well your data matches a known probability distribution (prob dsn). If the match is good, the data should line up more or less diagonally in the Q-Q plot. A common usage is to verify normality, i.e. how well the data matches a Normal or Gaussian dsn. In fact, this usage is so common that R even has a separate function called qqnorm() for doing just that.
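As a quick illustration of the mechanics, with synthetic data standing in for real measurements, the following R snippet makes a normal Q-Q plot of data that is actually lognormal (i.e., heavy-tailed). The upper-tail points bend away from the reference line, which is exactly the kind of departure to look for in response-time data; plotting against the correct distribution straightens them out.

    set.seed(42)
    x <- rlnorm(500, meanlog = 0, sdlog = 1)  # synthetic heavy-tailed "response times"

    # Normal Q-Q plot: the heavy upper tail bends away from the reference line
    qqnorm(x, main = "Normal Q-Q plot of lognormal data")
    qqline(x)

    # Q-Q plot against the correct (lognormal) dsn: points hug the diagonal
    qqplot(qlnorm(ppoints(length(x))), x,
           xlab = "Theoretical lognormal quantiles", ylab = "Sample quantiles")
    abline(0, 1)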

Tuesday, August 18, 2009

Response Time Knees and Queues

How do you determine where the response-time "knee" occurs? This is a question one commonly hears with reference to characterizing the performance of an application. Calculating where the response time suddenly begins to climb dramatically is considered, by many, to be an important determinant for such things as load testing, scalability analysis, and setting application service targets.

In a previous blog post, I pointed out that such a "knee" is actually an optical illusion. Nonetheless, this same question arose in last month's CMG MeasureIT, as a kind of survey entitled "Does the Knee in a Queuing Curve Exist or is it just a Myth?" Although that author concludes (correctly) that the existence of a "knee" (as it is usually meant) is bogus, the panoply of responses was quite astounding—especially coming from professionals who ought to know better. In this month's MeasureIT, I examine the same question in a rigorous but unconventional way under the title "Mind Your Knees and Queues: Responding to Hyperbole with Hyperbolæ."
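The mathematical point can be stated in one line. The M/M/1 curve, for instance, is a branch of a rectangular hyperbola, \begin{equation} R(\rho) = \dfrac{S}{1-\rho}, \qquad \dfrac{dR}{d\rho} = \dfrac{S}{(1-\rho)^2}, \end{equation} and its slope increases smoothly and monotonically all the way to saturation. There is no distinguished utilization at which the curve "suddenly" turns upward; any apparent knee is an artifact of the plotting scales, not a property of the function.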

Saturday, March 8, 2008

Watch Your Knees and Queues

Beware of optical illusions!

[Plot: normalized response time R/S versus server utilization for an M/M/m queue, with m = 1, 2, 3, 9, and 16]

The above plot, showing the normalized response times (R/S) for an M/M/m queue (i.e., a single waiting line with m servers), popped up several times at Hotsos 2008. The M/M/m queue can be employed to model the performance of multiple Oracle processes. Here, the curves correspond to m = 1 (black), 2, 3, 9, 16 (blue) plotted against average server utilization.
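For anyone who wants to reproduce curves of this kind, here is a short R sketch that computes the normalized residence time R/S for an M/M/m queue from the Erlang C function and plots it against per-server utilization for the same server counts listed above.

    # Erlang C: probability that an arrival has to wait in an M/M/m queue
    erlang_c <- function(m, rho) {
      a    <- m * rho                             # offered load in Erlangs
      k    <- 0:(m - 1)
      wait <- a^m / (factorial(m) * (1 - rho))
      wait / (sum(a^k / factorial(k)) + wait)
    }

    # Normalized residence time R/S for M/M/m
    rs_mmm <- function(m, rho) 1 + erlang_c(m, rho) / (m * (1 - rho))

    rho <- seq(0.01, 0.95, by = 0.005)
    plot(rho, sapply(rho, function(u) rs_mmm(1, u)), type = "l",
         xlab = "Server utilization", ylab = "Normalized response time R/S")
    for (m in c(2, 3, 9, 16)) {
      lines(rho, sapply(rho, function(u) rs_mmm(m, u)))
    }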

Wednesday, July 11, 2007

Leistungsdiagnostik - Load Averages and Stretch Factors

My latest article for the German Linux-Magazin has just appeared in the August edition under the title "Leistungsdiagnostik". The abstract reads:

Shell commands such as "uptime" always report three numbers as the load average. However, few people know how these numbers come about and what exactly they mean. This article clears that up and also introduces an extension in the form of the stretch factor.

The main theme is how to extend absolute load averages to relative stretch factor values.

Wednesday, April 18, 2007

How Long Should My Queue Be?

A simple question; there should be a simple answer, right? Guerrilla alumnus Sudarsan Kannan asked me whether a rule-of-thumb could be constructed for quantitatively assessing the load average on both dual-core and multicore platforms. He had seen various remarks, from time to time, alluding to optimal load averages.
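Whatever the final rule turns out to be, a common first-order heuristic (offered here purely for orientation, not as this post's conclusion) is to normalize the load average by the number of cores, since the load average counts processes that are running or waiting to run (and, on Linux, also those in uninterruptible sleep) across the whole machine. A trivial R sketch with hypothetical numbers:

    # Hypothetical reading, purely for illustration
    load_avg <- 6.0   # e.g., the 1-minute load average reported by uptime
    cores    <- 4     # number of cores on the box

    per_core <- load_avg / cores
    per_core          # ~1 means about one runnable task per core;
                      # persistently > 1 means tasks are queueing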