Tuesday, August 18, 2009

Response Time Knees and Queues

How do you determine where the response-time "knee" occurs? This is a question one commonly hears with reference to characterizing the performance of an application. Calculating where the response time suddenly begins to climb dramatically is considered, by many, to be an important determinant for such things as load testing, scalability analysis, and setting application service targets.

In a previous blog post, I pointed out that such a "knee" is actually an optical illusion. Nonetheless, this same question arose in last month's CMG MeasureIT, as a kind of survey entitled "Does the Knee in a Queuing Curve Exist or is it just a Myth?" Although that author concludes (correctly) that the existence of a "knee" (as it is usually meant) is bogus, the panoply of responses was quite astounding, especially coming from professionals who ought to know better. In this month's MeasureIT, I examine the same question in a rigorous but unconventional way under the title "Mind Your Knees and Queues: Responding to Hyperbole with Hyperbolæ."


metasoft said...

thanks for the article with the full detail.

i wanted to confirm the points you are making, since you didn't summarize them at the end of the 15-page article:

1. knee is not a precise term; it is usually used to refer to the existence of a point of diminishing returns. use "optimum" instead.

2. for an open queue, there is no fixed optimum

3. for a closed queue, there is a fixed optimum

4. determine which type of queue you are dealing with to determine the applicability of questions, answers, or plots.

Neil Gunther said...

Good points. I probably should've provided a better summary. Let's try here:

1. The "knee" is more precisely a discontinuity in the gradient of the function. In the theoretical performance curves, it is associated with the asymptotes. However, and I deliberately didn't mention this, you can also see gradient discontinuities in measured data (as opposed to the asymptotes). There, it implies the dynamics of the system have changed dramatically. I didn't want to confuse the original CMG discussion by getting into that. I actually plan a separate blog post on that topic.

2. For an open queue, there is no *definable* optimum on the curve. There is an associated knee belonging to the D/D/1 limit, but you can never reach it because the queue goes unstable before you get there.

3. For a closed queue, there is a *definable* optimum on the curve, but it's complicated mathematically, so we settle for the close approximation of the knee at Nopt. Not only can that point be reached (along the x-axis), it can be exceeded w/o the system becoming unstable. The response time still goes to infinity, but more slowly, along the hockey-stick handle.

4. I suppose this statement is about setting expectations, which is always a good idea (Guerrilla mantra 1.11). Then, decide if the actual system (SUT) is open (e.g., Internet) or closed (e.g., load test) in the queueing theory sense. That tells you immediately what to expect from the measured data for both throughput (which I didn't discuss in the CMG article because response time was the topic) and response times. Example: If someone showed me load-test or benchmark data (closed) for response times, but it looked like an M/M/m curve (open), they would be wishing they were somewhere else.
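
To make points 2 and 3 concrete, here is a minimal Python sketch rather than anything taken from the CMG article. It assumes the simplest cases: a single M/M/1 queue for the open system, and a single queue with N users and think time Z (solved by exact Mean Value Analysis) for the closed system. The function names and the numbers in the usage note are hypothetical, and Nopt = (S + Z) / S is the usual single-queue form, where the floor R = S meets the hockey-stick handle R = N*S - Z.

```python
def open_mm1_response(service_time, utilization):
    """Open M/M/1 residence time R = S / (1 - rho).

    The curve is a hyperbola with a vertical asymptote at rho = 1,
    so it has no definable knee and the queue is unstable at rho >= 1.
    """
    if not 0 <= utilization < 1:
        raise ValueError("open queue is only stable for 0 <= utilization < 1")
    return service_time / (1.0 - utilization)


def closed_response(service_time, think_time, n_users):
    """Closed single-queue response time via exact Mean Value Analysis (MVA)."""
    if n_users < 1:
        raise ValueError("n_users must be at least 1")
    queue_len = 0.0
    resp = service_time
    for n in range(1, n_users + 1):
        # An arriving request sees the queue left by the other n - 1 users.
        resp = service_time * (1.0 + queue_len)
        throughput = n / (resp + think_time)
        queue_len = throughput * resp  # Little's law
    return resp


def n_opt(service_time, think_time):
    """Load where the floor R = S crosses the handle R = N*S - Z."""
    return (service_time + think_time) / service_time


def closed_lower_bound(service_time, think_time, n_users):
    """The two asymptotes that form the hockey stick for the closed queue."""
    return max(service_time, n_users * service_time - think_time)
```

With the hypothetical values S = 1 and Z = 9, `n_opt(1.0, 9.0)` gives 10 users. Calling `closed_response(1.0, 9.0, n)` for n well beyond 10 stays finite and tracks the handle `n * 1.0 - 9.0`, whereas `open_mm1_response` refuses any utilization of 1 or more. That is the open versus closed distinction in point 4.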