Friday, February 27, 2009

Plotting PDQ Output with R

One of the nice things about PDQ-R (coming in release 5.0) is the ability to plot PDQ output directly in R. Here's a PDQ-R script, together with the corresponding graphical output, that I knocked up to show the effect on the throughput curve of adding more queueing delay stages (K), with everything else held constant.


With just a single queue (K = 1) the system saturates very quickly. The throughput curve shoots up the y-axis until it hits the ceiling at X = 2.0 requests per unit time. Consequently, the linearly rising early part of the throughput curve is almost indistinguishable from the optimal load-line at N* = 1.016 clients. This rapid saturation effect is less pronounced in a system with more queues because there are more service stages, so each request takes longer to complete. But it requires a considerable number of additional queueing centers to get a noticeable difference, e.g., K = 20, 50. Observe also that the optimal load-line moves to the right and is positioned on the x-axis at a value very close to K. I'll let you ponder why that must be true.
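For readers who want to reproduce the shape of those curves without PDQ-R, here is a minimal sketch in plain Python. It is not the PDQ-R script referred to above and makes no PDQ calls; it uses the standard exact Mean Value Analysis (MVA) recursion for a closed network of K identical queues. The service demand D = 0.5 is inferred from the X = 2.0 ceiling, and the think time Z = 0.008 is purely an assumption chosen so that K = 1 gives N* = 1.016. The function names are hypothetical.

```python
def mva_throughput(K, N_max, D=0.5, Z=0.008):
    """Exact MVA for a closed network of K identical queues.

    D (service demand per queue) and Z (think time) are assumed values,
    not taken from the original PDQ-R script.
    Returns the throughput X(N) for N = 1..N_max.
    """
    q = [0.0] * K                 # mean queue lengths with N - 1 clients
    xs = []
    for n in range(1, N_max + 1):
        # Arrival theorem: an arriving request sees the queue left by n - 1 clients
        r = [D * (1.0 + qk) for qk in q]
        R = sum(r)                # total residence time across all K stages
        X = n / (R + Z)           # Little's law applied to the whole system
        q = [X * rk for rk in r]  # Little's law applied to each queue
        xs.append(X)
    return xs


def optimal_load(K, D=0.5, Z=0.008):
    """N* = (sum of service demands + Z) / largest service demand."""
    return (K * D + Z) / D


for K in (1, 20, 50):
    xs = mva_throughput(K, 200)
    print(K, round(optimal_load(K), 3), round(xs[0], 3), round(xs[-1], 3))
```

Under these assumptions every curve is bounded above by 1/D = 2.0, while the knee at N* moves out with K because the total demand in the numerator grows with each added stage.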

The plot also explains the rationale for the approach I took in Chap. 10 of the Perl PDQ book where I modeled the scalability measurements of a multi-tier web application. In addition to the measured tiers, I ended up introducing 12 "dummy" queues in order to produce the correct round-trip latency, whilst retaining Z = 0 think time in accord with the original web application test scripts. The stunningly powerful conclusion was that there must've been additional latencies that were not included in the original measurements on the test rig. Otherwise, the data that were measured could not be reconciled with each other. Although I couldn't determine what the sources of those hidden latencies were, I could state quite categorically that they were real. You cannot possibly reach this kind of penetrating conclusion without a performance model. Data comes from the Devil, models come from God.
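The dummy-queue idea can be illustrated with a small sketch, again in plain Python rather than PDQ. Every number here is hypothetical: the three "measured" service demands and the dummy demand are invented for illustration and are not the values from the book. The point is only that padding the model with low-demand queues raises the round-trip time at light load while leaving the saturation throughput, set by the largest demand, untouched, all with Z = 0.

```python
def mva(demands, N_max, Z=0.0):
    """Exact MVA for a closed network with the given per-queue service demands.

    Returns a list of (throughput X, residence time R) for N = 1..N_max.
    """
    q = [0.0] * len(demands)
    out = []
    for n in range(1, N_max + 1):
        r = [d * (1.0 + qk) for d, qk in zip(demands, q)]
        R = sum(r)
        X = n / (R + Z)
        q = [X * rk for rk in r]
        out.append((X, R))
    return out


measured = [0.010, 0.020, 0.015]   # hypothetical tier demands (seconds)
dummies = [0.005] * 12             # hypothetical demand for each dummy queue

base = mva(measured, 500)
padded = mva(measured + dummies, 500)

print("R(1) measured tiers only:", round(base[0][1], 3))
print("R(1) with 12 dummies:    ", round(padded[0][1], 3))
print("X(500) in both cases:    ", round(base[-1][0], 2), round(padded[-1][0], 2))
```

With these made-up numbers the single-client round-trip time rises from the sum of the measured demands to that sum plus the twelve dummy demands, yet both models saturate at the same 1/0.020 = 50 requests per second.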

I didn't include the corresponding plots showing the effect of the dummy queues (similar to the above) in my Perl PDQ book because it was so tedious to write the data out to a file and then import it into Excel (which is what I was using back then). With PDQ-R, it's a snap to do it in about 50 lines.

2 comments:

Michael Morse said...

Need help getting PDQ-R to work on Windows.
"Get NULL message on function calls"
Everything is installed correctly.
After loading the PDQ package by executing library(pdq),
when I execute Init("Test") it results in a NULL message response.
Why is it reporting NULL? It seems like it is not working, because most of the commands get a NULL response, including Solve(CANON) and Report().

Neil Gunther said...

The appropriate place to get help with PDQ functionality problems and bugs is over here ... http://sourceforge.net/projects/pdq-qnm-pkg/forums/forum/737917