Sunday, January 22, 2012

Throughput-Delay Curves

A colleague of mine at Yahoo.com asked me if I'd ever seen curves like this:

Not only is the answer, yes (it's a throughput-delay plot or XR plot in my notation), but that particular plot comes from my GCaP course notes. There, I use it to analyze the comparative performance of a functional multiprocessor (NS6000) and a symmetric multiprocessor (SC2000). Note how the two curves cross at around 1500 OPS. You can ask yourself why and if you can't come up with an explanation, you should be registering for a Guerrilla class. :)

The above XR plot also serves as a useful reminder that the throughput and response-time metrics are not only dependent on one another, but they are generally dependent in a nonlinear way—despite what some experts may claim:

From xxxx@cisco.com
Mon, 16 Sep 2002 08:04:04 +1000
...in reality, Latency & Throughput are two completely different metrics associated with performance. Latency isn't related to Throughput and vice-versa. ...

What is truly alarming, however, is that I am unable to find this example in any of my books; after a quick and non-exhaustive search. I would have expected it was discussed in The Practical Performance Analyst (since the example is of that vintage), but I don't see it. If you happen to spot it or know which book chapter it is in, please let me know.

Since the throughput-delay product is merely a statement of Little's law, N = X*R, you can use it to trivially calculate the number of thingies (N) at any point on the XR plot. Then, you can plot say, throughput as a function of N, i.e., X = X(N), which you weren't given in the original visual representation. For the above example, N is the number of client generators impressing load on the test rig:

Now you can see clearly that the SC2000 throughput has reached a saturation value of about 2500 OPS by 100 generators. So, there would be no virtue in driving the NFS system with more load generators (even if there is NIC bandwidth available) because the response time will be begin to climb embarrassingly.

Although the above example is dated in terms of the technology, it is timeless in terms of its instruction. With that in mind, you can review more modern XR plots on the SPEC.org website where the current NFS benchmark is called SFS2008. Here are some SPEC throughput-delay plots from 2011:

Packet pushers also regard the bandwidth-delay product as very important because it:

  • "determines the amount of data that can be in transit in the network"
  • [is a] "very important concept in a window-based protocol such as TCP, as throughput is bound by" [it]
  • can be conveniently calculated for TCP packets
Cautionary note: Be careful to distinguish between the (band)width of the network pipe, i.e., Xmax, vs. the number of packets per second, i.e., X, being pushed through that pipe.

14 comments:

Adrian Cockcroft said...

They are my favorite plots - I put a histogram on each axis and call it a Cockcroft Headroom Plot. R code is on perfcap.blogspot.com - search for chp

I also got AppDynamics to add throughput vs response time plots to their product. More tools should have them...

Baron said...

I've been meaning to ask Neil what math is involved in the relationship between this and Erlang C. It's trivial to flip the USL on its head with Little's Law and come up with a plot of throughput versus response time, which looks a lot like the hockey stick curve you get from the Erlang C function. I've got some scripts that cobble all this together with Awk and gnuplot; I need to use R instead.

But I assume (haven't even tried to verify) that the inverted USL and Erlang C are not the same thing. So if we took Erlang C and did a somersault, then what would we get for a scalability model? Anything useful and interesting?

metasoft said...

In you book, Analyzing Computer Systerm Performance with Perl::PDQ, 1st ed., there is an XR plot on page 94. Page 103, 104 and 105 you have the rho-R/S plots.

Neil Gunther said...

Thanks, Meta.

From the 1st edn: "Some authors e.g., p. 241 {Splaine} combine the throughput and response time data together in the same X versus R plot
(Fig. 2.22). These throughput-delay plots also have a convex shape. This type of representation may be compact but it also can be misleading because it resembles the response time in Fig. 2.17 for an unlimited request queue. This can act as a visual miscue. Moreover, it is not very useful from the standpoint of assessing performance bounds and optimal loads---topics we discuss in Chap. 5."

So, I do cover the topic, but I was hoping that I had included the SPEC example as well. Seems not in either PPDQ1 or PPDQ2.

The plots on pp. 103 ff. are not quite the same thing since the x-axis is normalized to 1 by virtue of Little's law. In other words, the x-axis there is X*S, which is not the same thing as X alone.

Neil Gunther said...

Baron, You are completely correct: the USL model (inverted or otherwise) has nothing to do with the Erlang-C function (in the sense of M/M/m queue). This should be pretty obvious when you consider the software version of the USL. That involves a finite number N of processes or threads or whatever, whereas an M/M/m queue can have an unlimited (but not infinite) number of processes.

Similarly, as you've observed, the "inverted" USL (don't forget Z) exhibits a "hockey stick" response-time curve. Here, "hockey stick" literally means the "handle" appears (asymptotically) to be linear with a finite positive slope.

The corresponding limit for an M/M/m response-time curve has infinite slope.

Mohan Radhakrishnan said...

I have a plan to read your book because we want to generate workloads for netty NIO servers. I've searched the net too.

Are there recommendations or reading material to plan and write workload generators ? Is it in your book ?
An example is https://github.com/brianfrankcooper/YCSB/wiki

Mohan

Mohan Radhakrishnan said...

I have a plan to read your book because we want to generate workloads for netty NIO servers. I've searched the net too.

Are there recommendations or reading material to plan and write workload generators ? Is it in your book ?
An example is https://github.com/brianfrankcooper/YCSB/wiki

Mohan

Mohan Radhakrishnan said...

I have a plan to read your book because we want to generate workloads for netty NIO servers. I've searched the net too.

Are there recommendations or reading material to plan and write workload generators ? Is it in your book ?
An example is https://github.com/brianfrankcooper/YCSB/wiki

Mohan

Neil Gunther said...

This question is really about creating a *model* workload or workloads. The answer is as conceptually difficult as specifying a TPC benchmark (viz., 100 page doc)

It depends on:
* the app
* the test rig
* how all that compares with prod
* what are the performance test objectives
just to name a few.

Some/many of these points are certainly covered in the Yahoo slides you pointed me at.

I would suggest that my books are relevant on either end of the workload modeling process:
* what are the perf objectives going in
* what info is the data trying to tell you coming out

I wouldn't agree with everything that is in those Y! slides either. For example, "performance" vs "scalability" is a misnomer, in my view.

If we consider things from the USL perspective http://www.perfdynamics.com/Manifesto/USLscalability.html
* "performance" == scalability as a function of N processES (e.g., LR)
* "scalability" == scalability as a function of p processORS

In other words, they are 2 sides of the same coin.

Mohan Radhakrishnan said...

>I would suggest that my books are >relevant on either end of the >workload modeling process:
>* what are the perf objectives >going in
>* what info is the data trying to >tell you coming out

Ok. Let's say I want to generate a uniform random distribution of values between two bounds and pipe it to a socket program. I think it involves some statistics ? This kind of workload generation is what is described in the PDQ book ? Initially I thought I should get a statistics book ? What are your thoughts ?

I am just gathering a few ideas before buying some books.

Mohan Radhakrishnan said...

>I would suggest that my books are >relevant on either end of the >workload modeling process:
>* what are the perf objectives >going in
>* what info is the data trying to >tell you coming out

Ok. Let's say I want to generate a uniform random distribution of values between two bounds and pipe it to a socket program. I think it involves some statistics ? This kind of workload generation is what is described in the PDQ book ? Initially I thought I should get a statistics book ? What are your thoughts ?

I am just gathering a few ideas before buying some books.

Mohan Radhakrishnan said...

This sentence should read

This kind of workload generation is what is *also* described in the PDQ book based on modeling and measurement ?

Neil Gunther said...

Mohan, If I understand your question, you don't need a statistics book to use an RNG (or pseudo-RNG, more accurately). That's really more about computational arithmetic than statistics.

You can either use the inbuilt RNG that belongs to the computer language you are using, e.g., rand() in C and Perl, or you can roll your own.

An example of the latter is given in the listing on p. 35 of my Perl::PDQ book. It's the subroutine called rand_num, which returns values in the range 0.0 to 1.0, but you could alter the range to be something more convenient.

Mohan Radhakrishnan said...

I am using RNG in Java. It is just that after looking at http://perfdynamics.blogspot.in/2010/05/emulating-internet-traffic-in-load.html and http://perfdynamics.blogspot.in/2010/05/load-testing-think-time-distributions.html I realized I am missing the concepts. So I thought a book like Lyman Ott's introductory book might help.