The details concerning how you can do this kind of cost-benefit analysis for your cloud applications will be discussed in the upcoming GCAP class and the PDQW workshop. Check the corresponding class registration pages for dates and pricing.
Exposing the Cost of Performance Hidden in the Cloud
Neil Gunther
Performance Dynamics, Castro Valley, California
Mohit Chawla
Independent Systems Engineer, Hamburg, Germany
10am Pacific Time on June 19, 2018
Whilst offering lift-and-shift migration and versatile elastic capacity, the cloud also reintroduces an old mainframe concept—chargeback—which rejuvenates the need for performance analysis and capacity planning. Combining production JMX data with an appropriate performance model, we show how to assess fee-based EC2 configurations for a mobile-user application running on a Linux-hosted Tomcat cluster. The performance model also facilitates ongoing cost-benefit analysis of various EC2 Auto Scaling policies.
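To give a flavor of the kind of cost-benefit arithmetic involved (this is my own rough sketch, not the model presented in the class), the following R snippet sizes a cluster from a measured arrival rate and mean service time, such as might be extracted from JMX counters, and then ranks hypothetical instance types by hourly cost. All instance names and prices are placeholders, not actual EC2 pricing.

# Rough sizing sketch; instance names and prices are hypothetical placeholders.
lambda  <- 350      # measured arrival rate (requests/sec), e.g. from JMX data
S       <- 0.010    # measured mean service time per request (sec)
rho.max <- 0.75     # per-instance utilization ceiling

catalog <- data.frame(
  type  = c("small", "large", "xlarge"),
  speed = c(1, 2, 4),            # relative service capacity (assumed)
  price = c(0.05, 0.10, 0.20)    # $/hour (placeholders, not real prices)
)

# Instances needed so per-instance utilization stays below rho.max
catalog$n    <- ceiling(lambda * (S / catalog$speed) / rho.max)
catalog$cost <- catalog$n * catalog$price      # fleet cost in $/hour
print(catalog[order(catalog$cost), ])

A queueing model then refines this utilization-only view by exposing the response-time cost of running closer to the ceiling, which is where the trade-offs between Auto Scaling policies show up.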
This year is the centenary of A. K. Erlang's paper [1] on the determination of waiting times in an M/D/m queue with $m$ telephone lines.* Today, M/M/m queues are used to model such systems as call centers [3], multicore computers [4,5] and the Internet [6,7]. Unfortunately, those who should be using M/M/m models often do not have sufficient background in applied probability theory. Our remedy defines a morphing approximation† to the exact M/M/m queue [3] that is accurate to within 10% for typical applications‡. The morphing formula for the residence time, $R(m,\rho)$, is both simpler and more intuitive than the exact solution involving the Erlang-C function. We have also developed an animation of this morphing process. An outstanding challenge, however, has been to elucidate the nature of the corrections that transform the approximate morphing solutions into the exact Erlang solutions. In this presentation, we show:
- The morphing solutions correspond to the $m$-roots of unity in the complex $z$-plane.
- The exact solutions can be expressed as a rational function, $R(m,z)$.
- The poles of $R(m,z)$ lie inside the unit disk, $|z| < 1$, and converge around the Szegő curve [8] as $m$ is increased.
- The correction factor for the morphing model is defined by the deflated polynomial belonging to $R(m,z)$.
- The pattern of poles in the $z$-plane provides a convenient visualization of how the morphing solutions differ from the exact solutions.
* Originally, Erlang assumed the call holding time, or mean service time $S$, was deterministic with unit period, $S=1$ [1,2]. The generalization to exponentially distributed service periods came later. Ironically, the exponential case is easier to solve than the apparently simpler deterministic case. That's why the M/D/1 queue is never the first example discussed in queueing theory textbooks.
† The derivation of the morphing model is presented in Section 2.6.6 of the 2005 edition of [4], although the word "morphing" is not used there. The reason is, I didn't know how to produce the exact result from it, and emphasizing it would likely have drawn unwarranted attention from Springer-Verlag editors.
By the time I was writing the 2011 edition of [4], I was certain the approximate formula did reflect the morphing concept in its own right, even though I still didn't know how to connect it to the exact result. Hence, the verb "morphs" timidly makes its first and only appearance in the boxed text following equation 4.61.
‡ The relative error peaks at 10% for $m \sim 20$ and $\rho \sim 90\%$, then peaks again at 20% for $m \sim 2000$ and the servers running 99.9% busy. However, the rate of increase in peak error attenuates such that the maximum error is less than 25%, even as $m \rightarrow \infty$ and $\rho \rightarrow 100\%$. A plot of the corresponding curves gives a clearer picture. This behavior is not at all obvious. Prior to this result, it could have been that the relative error climbed to 100% with increasing $m$ and $\rho$.
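For readers who want to see the gap for themselves, here is a short R sketch (mine, not part of the presentation) comparing the morphing approximation $R(m,\rho) = S/(1-\rho^m)$ with the exact M/M/m residence time computed via the Erlang-C function. The function names are ad hoc; the formulas are the standard ones. At $m \sim 20$ and $\rho \sim 90\%$ the relative error lands close to the 10% peak quoted in footnote ‡.

# Compare the morphing approximation with the exact M/M/m residence time.
erlangC <- function(m, rho) {
  a <- m * rho                       # offered load in Erlangs
  B <- 1                             # Erlang-B via the standard recurrence
  for (k in 1:m) B <- a * B / (k + a * B)
  B / (1 - rho + rho * B)            # convert Erlang B to Erlang C
}

R.exact <- function(m, rho, S = 1) S + erlangC(m, rho) * S / (m * (1 - rho))
R.morph <- function(m, rho, S = 1) S / (1 - rho^m)

m <- 20; rho <- 0.90
c(exact   = R.exact(m, rho),
  morph   = R.morph(m, rho),
  rel.err = 1 - R.morph(m, rho) / R.exact(m, rho))   # roughly 10% at this m, rho

Note that for $m = 2$ the two expressions agree identically, which is the sense in which the morphing model is exact for two servers.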
Presenter: James Brady (co-author: Neil Gunther)
Session Number: 436
Subject Area: APM
Session Date: Wed, November 9, 2016
Session Time: 1:00 PM - 2:00 PM
Session Room: Portofino B
This is a guest post by performance modeling enthusiast, Mohit Chawla, who brought the following very interesting example to my attention. In contrast to many of the examples in my Perl::PDQ book, these performance data come from a production web server, not a test rig. —NJG
Performance analysts usually build performance models based on their understanding of the software application's behavior. This post, however, describes how a PDQ model acted like a periscope, and turned out to be pedagogical as well, by uncovering otherwise hidden details about the application's inner workings and performance characteristics.
Although I've certainly had correct pieces of the puzzle, at various times, I could never get everything to fit in a completely consistent way. No matter how creative I got, I always found a fly in the ointment. The best I had been able to come up with is what I call the "morphing model" approximation where you start out with $m$ parallel queues at low loads and it morphs into a single $m$-times faster M/M/1 queue at high loads.
That model is also exact for $m = 2$ servers—which is some kind of progress, but not much. Consequently, despite a few misplaced enthusiastic announcements in the past, I've never been able to publish the fully corrected morphing model.
The circles in Fig. 1 represent hosts and the rectangles represent incoming requests buffered at the load balancer. The blue area in a circle signifies the available capacity of that host, whereas white signifies unavailable capacity. When one of the hosts fails, its load must be redistributed across the remaining three hosts. What Fig. 1 doesn't show is the performance impact of this capacity redistribution.

The idea is that none of the hosts should use more than 75% of its available capacity: the blue areas on the left side of Fig. 1. The total consumed capacity is then assumed to be $4 \times 3/4 = 3$, or 300% of the total host configuration (rather than all 4 hosts, or 400% capacity). When any single host fails, its lost capacity is compensated by redistributing that same load across the three remaining hosts (each running 100% busy after failover). As we shall show in the next section, this is a misconception: Figure 1 is seriously flawed for two reasons.
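To make the performance impact concrete, here is a back-of-the-envelope sketch in R (my own, not the model developed in the post): treat each host as an M/M/1 queue, apply the 75% rule, and compare response times before and after a single-host failover.

# Back-of-the-envelope failover check; the service time value is assumed.
S        <- 0.10     # mean service time per request (sec), illustrative
rho.host <- 0.75     # per-host utilization under the "75% rule"
hosts    <- 4

R.before  <- S / (1 - rho.host)                 # each host at 75% busy: 4*S
rho.after <- hosts * rho.host / (hosts - 1)     # same load on 3 hosts: 100% busy
R.after   <- if (rho.after < 1) S / (1 - rho.after) else Inf

c(R.before = R.before, rho.after = rho.after, R.after = R.after)
# The surviving hosts can absorb the load in raw capacity terms, but at 100%
# utilization the waiting line, and hence the response time, grows without bound.

In other words, having enough residual capacity and delivering acceptable response times are two different criteria, which is the misconception the next section takes apart.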
R version 2.15.2 (2012-10-26) -- "Trick or Treat"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: i386-apple-darwin9.8.0/i386 (32-bit)
> library(pdq)
> source("/Users/njg/PDQ/Test Suites/R-Test/mm1.r")
***************************************
****** Pretty Damn Quick REPORT *******
***************************************
*** of : Thu Nov 8 17:42:48 2012 ***
*** for: M/M/1 Test ***
*** Ver: PDQ Analyzer 6.0b 041112 ***
***************************************
***************************************
...
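The report above is truncated. For orientation, here is a minimal sketch of the kind of script that produces it, using the standard PDQ calls (Init, CreateOpen, CreateNode, SetDemand, Solve, Report); the parameter values are illustrative and the actual mm1.r in the PDQ test suite may differ.

# Minimal M/M/1 model in PDQ-R; parameter values are illustrative only.
library(pdq)

arrivRate <- 0.75     # requests per second (assumed)
servTime  <- 1.00     # seconds of service per request (assumed)

Init("M/M/1 Test")                      # name the model
CreateOpen("work", arrivRate)           # open workload with Poisson arrivals
CreateNode("server", CEN, FCFS)         # single queueing center, FIFO service
SetDemand("server", "work", servTime)   # service demand per request
Solve(CANON)                            # solve the open network analytically
Report()                                # emit the Pretty Damn Quick report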
The main trick is that the Perl and Python versions of PDQ remain outwardly unchanged for existing users while, at the same time, invisibly incorporating the significant internal changes needed to accommodate R.
And, last but not least, queues and computer performance remain a perennial topic, most recently in connection with the Internet.
My fundamental point is this. When it comes to load testing*, presumably the idea is to exercise the system under test (SUT). Otherwise, why are you doing it? Part of exercising the SUT is to produce significant fluctuations in the number of requests residing in application buffers. Those fluctuations can be induced by the pattern of arriving requests issued by the client-side driver (DVR): usually implemented as a pile of PCs or blades.
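As an illustration of that point (my own sketch, not from the original post), the following R snippet runs a single-server queue under two driver patterns with the same mean arrival rate: exponentially distributed (bursty) inter-arrival times versus constant pacing. Only the bursty driver induces the kind of fluctuations the paragraph above is talking about.

# Same mean load, very different fluctuations, depending on the arrival pattern.
set.seed(42)
n   <- 20000
S   <- 0.8      # deterministic service time (sec); utilization is 80%
lam <- 1.0      # mean arrival rate (requests/sec)

# Lindley recursion for FIFO waiting time: W[k] = max(0, W[k-1] + S - A[k])
waits <- function(interarrivals) {
  W <- numeric(length(interarrivals))
  for (k in 2:length(W)) W[k] <- max(0, W[k - 1] + S - interarrivals[k])
  W
}

W.bursty <- waits(rexp(n, rate = lam))   # exponential (Poisson) driver
W.paced  <- waits(rep(1 / lam, n))       # constant-paced driver, same mean rate

c(mean.bursty = mean(W.bursty), sd.bursty = sd(W.bursty),
  mean.paced  = mean(W.paced),  sd.paced  = sd(W.paced))
# Constant pacing at 80% busy never queues at all; the exponential driver
# produces substantial waiting and large swings in buffered requests.

Whether that burstiness is something your load test should be inducing depends, of course, on what the production arrival pattern actually looks like.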
The problem is that eqn. \eqref{eqn:badest} grossly underestimates $R_m$, which is precisely the wrong direction for capacity planning scenarios. For that purpose, it's generally better to overestimate performance metrics. That's too bad, because it would be a handy Guerrilla-style formula if it did work. You would be able to do the calculation in your head and impress everyone on your team (not to mention performing it as a party trick).
Given that eqn. \eqref{eqn:badest} is a poor estimator, you might wonder if there's a better one, and if you'd been working for Thomas Edison he would have told you: "There's a better way. Find it!" Easy for him to say. But if you did decide to take up Edison's challenge, how would you even begin to search for such a thing?