Possibly pithy insights into computer performance analysis and capacity planning based on the Guerrilla series of books and training classes provided by Performance Dynamics Company.
Thursday, May 8, 2025
How to Quantify Scalability
Tuesday, May 6, 2025
Performance Ponderings Updated
Amazingly, performance analysis (done right) is often not technology-dependent, per se. The right abstraction can remain invariant into perpetuity. For example, my 2025 analysis of the GPT Efficient Computer Frontier (ref. 8 in paper 1) is based partly on my 1989 analysis of virtual memory thrashing in paper 53. Same paradigm; vastly different context.
The connection between the two papers is by no means obvious but, quite striking (not to mention satisfying) when you realize it.
Tuesday, April 20, 2021
PDQ Online Workshop, May 17-21, 2021
All modern computer systems, no matter how complex, can be thought of as a directed graph of individual buffers that hold requests until to be serviced at a shared computational resource, e.g., a CPU or disk. Since a buffer is just a queue, any computer infrastructure, from your laptop up to Facebook.com, can be represented as a directed graph of queues.
The directed arcs or arrows in such a graph correspond to workflows between different queues. In the parlance of queueing theory, a directed graph of queues is called a queueing network model. PDQ is a tool for predicting performance metrics such as, waiting time, throughput, optimal user-load.Two major benefits of using PDQ are:
- confirmation that monitored performance metrics have their expected values
- predict performance for circumstances that lie beyond current measurements
Thursday, November 26, 2020
PDQ 7.0 is Not a Turkey
New Featues
- The introduction of the STREAMING solution method for OPEN queueing networks. (cf. CANON, which can still be used).
- The CreateMultiNode() function is now defined for CLOSED queueing networks and distinguished via the MSC device type (cf. MSO for OPEN networks).
- The format of Report() has been modified to make the various types of queueing network parameters clearer.
- See the R Help pages in RStudio for details.
- Run the demo(package="pdq") command in the R console to review a variety of PDQ 7 models.
Maintenance Changes
The migration of Python from 2 to 3 has introduced maintenance complications for PDQ. Python 3 may eventually be accommodated in future PDQ releases. Perl maintenance has ended with PDQ release 6.2, which to remain compatible with the Perl::PDQ book (2011).Wednesday, January 2, 2019
DSConf 2019 Featured Talk
DSConf'19 - Distributed Systems Conference (scroll down)
Pune, India
11am IST
February 16
I'm very much looking forward to this event and I thank @ShrivedAgashe for the invitation.
Monday, June 25, 2018
Guerrilla 2018 Classes Now Open
- GCAP: Guerrilla Capacity and Performance — From Counters to Containers and Clouds
- GDAT: Guerrilla Data Analytics — Everything from Linear Regression to Machine Learning
- PDQW: Pretty Damn Quick Workshop — Personal tuition for performance and capacity mgmt
The following highlights indicate the kind of thing you'll learn. Most especially, how to make better use of all that monitoring and load-testing data you keep collecting.
- How to save millions of dollars with a one-line performance model (video)
- How to minimize chargeback after you lift and shift to the cloud (video)
- How to correctly emulate web traffic on a load-testing rig (PDF)
See what Guerrilla grads are saying about these classes. And how many instructors do you know that are available for you from 9am to 9pm (or later) each day of your class?
Who should attend?
- IT architects
- Application developers
- Performance engineers
- Sysadmins (Linux, Unix, Windows)
- System engineers
- Test engineers
- Mainframe sysops (IBM. Hitachi, Fujitsu, Unisys)
- Database admins
- Devops practitioners
- SRE engineers
- Anyone interested in getting beyond performance monitoring
As usual, Sheraton Four Points has bedrooms available at the Performance Dynamics discounted rate. The room-booking link is on the registration page.
Tell a colleague and see you in September!
Wednesday, June 20, 2018
Chargeback in the Cloud - The Movie
The details concerning how you can do this kind of cost-benefit analysis for your cloud applications will be discussed in the upcoming GCAP class and the PDQW workshop. Check the corresponding class registration pages for dates and pricing.
Sunday, May 20, 2018
USL Scalability Modeling with Three Parameters
Update of Oct 2018: Wow! MathJax performance is back. Clearly, whinging is the most powerful performance optimizer. :)
The 2-parameter USL model
The original USL model, presented in my GCAP book and updated in the blog post How to Quantify Scalability, is defined in terms of fitting two parameters $\alpha$ (contention) and $\beta$ (coherency). \begin{equation} X(N) = \frac{N \, X(1)}{1 + \alpha \, (N - 1) + \beta \, N (N - 1)} \label{eqn: usl2} \end{equation}Fitting this nonlinear USL equational model to data requires several steps:
- normalizing the throughput data, $X$, to determine relative capacity, $C(N)$.
- equation (\ref{eqn: usl2}) is equivalent to $X(N) = C(N) \, X(1)$.
- if the $X(1)$ measurement is missing or simply not available—as is often the case with data collected from production systems—the GCAP book describes an elaborate technique for interpolating the value.
- providing each term of the USL with a proper physical meaning, i.e., not treat the USL like a conventional multivariate statistical model (statistics is not math)
- satisfying the von Neumann criterion: minimal number of modeling parameters
Sunday, April 22, 2018
The Geometry of Latency
Here's a mnemonic tabulation based on dishes and bowls:
Hopefully this makes amends for the more complicated explanation I wrote for CMG back in 2009 entitled: "Mind Your Knees and Queues: Responding to Hyperbole with Hyperbolæ", which I'm pretty sure almost nobody understood.
Saturday, April 21, 2018
Virtual cloudXchange 2018 Conference
Exposing the Cost of Performance
Hidden in the Cloud
Neil Gunther
Performance Dynamics, Castro Valley, California
Mohit Chawla
Independent Systems Engineer, Hamburg, Germany10am Pacific Time on June 19, 2018
Whilst offering lift-and-shift migration and versatile elastic capacity, the cloud also reintroduces an old mainframe concept—chargeback—which rejuvenates the need for performance analysis and capacity planning. Combining production JMX data with an appropriate performance model, we show how to assess fee-based EC2 configurations for a mobile-user application running on a Linux-hosted Tomcat cluster. The performance model also facilitates ongoing cost-benefit analysis of various EC2 Auto Scaling policies.
Wednesday, March 14, 2018
WTF is Modeling, Anyway?
The strength of the model turns out to be its explanatory power, rather than prediction, per se. However, with the correct explanation of the performance problem in hand (which also proved that all other guesses were wrong), this model correctly predicted a 300% reduction in application response time for essentially no cost. Modeling doesn't get much better than this.
Footnotes
- According to Computer World in 1999, a 32-node IBM SP2 cost $2 million to lease over 3 years. This SP2 cluster was about 6 times bigger.
- Because of my vain attempt to suppress details (in the interests of video length), Boris gets confused about the kind of files that are causing the performance problem (near 26:30 minutes). They're not regular data files and they're not executable files. The executable is already running but sometimes waits—for a long time. The question is, waits for what? They are, in fact, special font files that are requested by the X-windows application (the client, in X parlance). These remote files may also get cached, so it's complicated. In my GCAP class, I have more time to go into this level of detail. Despite all these potential complications, my 'log model' accurately predicts the mean application launch time.
- Log_2 assumes a binary tree organization of font files whereas, Log_10 assumes a denary tree.
- Question for the astute viewer. Since these geophysics applications were all developed in-house, how come the developers never saw the performance problems before they ever got into production? Here's a hint.
- Some ppl have asked why there's no video of me. This was the first time Boris had recorded video of a Skype session and he pushed the wrong button (or something). It's prolly better this way. :P
Wednesday, March 15, 2017
Morphing M/M/m: A New View of an Old Queue
- Update July 31, 2017: Here are my IFORS slides
- Update June 08, 2018: In response to an audience question at my IFORS 2017 session, I have now demonstrated that there is an upper bound for the error in the morphing approximation. See footnotes below.
- Update August 18, 2020: This new paper has the complete exact solution method that I had been seeking all along.
This year is the centenary of A. K. Erlang's paper [1] on the determination of waiting times in an M/D/m queue with $m$ telephone lines.* Today, M/M/m queues are used to model such systems as, call centers [3], multicore computers [4,5] and the Internet [6,7]. Unfortunately, those who should be using M/M/m models often do not have sufficient background in applied probability theory. Our remedy defines a morphing approximation† to the exact M/M/m queue [3] that is accurate to within 10% for typical applications‡. The morphing formula for the residence-time, $R(m,\rho)$, is both simpler and more intuitive than the exact solution involving the Erlang-C function. We have also developed an animation of this morphing process. An outstanding challenge, however, has been to elucidate the nature of the corrections that transform the approximate morphing solutions into the exact Erlang solutions. In this presentation, we show:
- The morphing solutions correspond to the $m$-roots of unity in the complex $z$-plane.
- The exact solutions can be expressed as a rational function, $R(m,z)$.
- The poles of $R(m,z)$ lie inside the unit disk, $|z| < 1$, and converge around the Szego; curve [8] as $m$ is increased.
- The correction factor for the morphing model is defined by the deflated polynomial belonging to $R(m,z)$.
- The pattern of poles in the $z$-plane provides a convenient visualization of how the morphing solutions differ from the exact solutions.
* Originally, Erlang assumed the call holding time, or mean service time $S$, was deterministic with unit period, $S=1$ [1,2]. The generalization to exponentially distributed service periods came later. Ironically, the exponential case is easier to solve than the apparently simpler deterministic case. That's why the M/D/1 queue is never the first example discussed in queueing theory textbooks.
† The derivation of the morphing model is presented in Section 2.6.6 of the 2005 edition of [4], although the word "morphing" is not used there. The reason is, I didn't know how to produce the exact result from it, and emphasizing it would likely have drawn unwarranted attention from Springer-Verlag editors.
By the time I was writing the 2011 edition of [4], I was certain the approximate formula did reflect the morphing concept in its own right, even though I still didn't know how to connect it to the exact result. Hence, the verb "morphs" timidly makes its first and only appearance in the boxed text following equation 4.61.
‡ The relative error peaks at 10% for $m \sim 20$ and $\rho \sim 90\%$, then peaks again at 20% for $m \sim 2000$ and the servers running 99.9% busy. However, the rate of increase in peak error attenuates such that the maximum error is less than 25%, even as $m \rightarrow \infty$ and $\rho \rightarrow 100\%$. A plot of the corresponding curves gives a clearer picture. This behavior is not at all obvious. Prior to this result, it could have been that the relative error climbed to 100% with increasing $m$ and $\rho$.
References
- A. K. Erlang, "Solution of Some Problems in the Theory of Probabilities of Significance in Automatic Telephone Exchanges," Electroteknikeren, v. 13, p. 5, 1917.
- A. K. Erlang, "The Theory of Probabilities and Telephone Conversations," Nyt Tidsskrift for Matematik B, vol 20, 1909.
- E. Chromy, T. Misuth, M. Kavacky, "Erlang C Formula and Its Use In The Call Centers," Advances in Electrical and Electronic Engineering, Vol. 9, No. 1, March 2011.
- N. J. Gunther, Analyzing Computer System Performance with Perl::PDQ, Springer-Verlag, 2005 and 2011.
- N. J. Gunther, S. Subramanyam, and S. Parvu, "A Methodology for Optimizing Multithreaded System Performance on Multicore Platforms," in Programming Multicore and Many-core Computing Systems, eds. S. Pllana and F. Xhafa, Wiley Series on Parallel and Distributed Computing, February 2017.
- N. J. Gunther, "Numerical Investigations of Physical Power-law Models of Internet Traffic Using the Renormalization Group," IFORS 2005, Honolulu, Hawaii, July 11—15.
- T. Bonald, J. W. Roberts, "Internet and the Erlang formula," ACM SIGCOMM Computer Communication Review, Volume 42, Number 1, January 2012.
- C. Diaz Mendoza and R. Orive, "The Szegő curve and Laguerre polynomials with large negative parameters," Journal of Mathematical Analysis and Applications, Volume 379, Issue 1, Pages 305—315, 1 July 2011.
Tuesday, January 17, 2017
GitHub Growth Appears Scale Free
"As GitHub approaches 700 employees, with more than $200M in ARR, accelerating growth, and more than 20 million registered users, I'm confident that this is the moment to find a new CEO to lead us into the next stage of growth. ....."
The Original Analysis
In 2013, a Redmonk blogger claimed that the growth of GitHub (GH) users follows a certain type of diffusion model called Bass diffusion. Here, growth refers to the number of unique user IDs as a function of time, not the number project repositories, which can have a high degree of multiplicity.
In a response, I tweeted a plot that suggested GH growth might be following a power law, aka scale free growth. The tell-tale sign is the asymptotic linearity of the growth data on double-log axes, which the original blog post did not discuss. The periods on the x-axis correspond to years, with the first period representing calendar year 2008 and the fifth period being the year 2012.
Saturday, October 1, 2016
A Clue for Remembering Little's Laws
The right-hand side of each equation representing a version of Little's law is written vertically in the order $R, W, S$, which matches the expression $R=W+S$ for the mean residence time, viz., the sum of the mean waiting time ($W$) and the mean service time ($S$).
The letters on the left-hand side: $Q, L, U$ (reading vertically) respectively correspond to the queueing metrics: queue-length, waiting-line length, and utilization, which can be read as the word clue.
Incidentally, the middle formula is the version that appears in the title of John Little's original paper:
J. D. C. Little, ``A Proof for the Queuing Formula: L = λ W,''
Operations Research, 9 (3): 383—387 (1961)
Wednesday, August 3, 2016
PDQ as a Performance Periscope
This is a guest post by performance modeling enthusiast, Mohit Chawla, who brought the following very interesting example to my attention. In contrast to many of the examples in my Perl::PDQ book, these performance data come from a production web server, not a test rig. —NJG
Performance analysts usually build performance models based on their understanding of the software application's behavior. However, this post describes how a PDQ model acted like a periscope and also turned out to be pedagogical by uncovering otherwise hidden details about the application's inner workings and performance characteristics.
Some Background
Thursday, July 28, 2016
Erlang Redux Resolved! (This time for real)
Although I've certainly had correct pieces of the puzzle, at various times, I could never get everything to fit in a completely consistent way. No matter how creative I got, I always found a fly in the ointment. The best I had been able to come up with is what I call the "morphing model" approximation where you start out with $m$ parallel queues at low loads and it morphs into a single $m$-times faster M/M/1 queue at high loads.
That model is also exact for $m = 2$ servers—which is some kind of progress, but not much. Consequently, despite a few misplaced enthusiastic announcements in the past, I've never been able to publish the fully corrected morphing model.
Friday, May 13, 2016
How to Emulate Web Traffic Using Standard Load Testing Tools
How to Emulate Web Traffic Using Standard Load Testing ToolsJames Brady (State of Nevada) and Neil Gunther (Performance Dynamics)
Conventional load-testing tools are based on a fifty year old time-share computer paradigm where a finite number of users submit requests and respond in a synchronized fashion. Conversely, modern web traffic is essentially asynchronous and driven by an unknown number of users. This difference presents a conundrum for testing the performance of modern web applications. Even when the difference is recognized, performance engineers often introduce virtual-user script modifications based on hearsay; much of which leads to wrong results. We present a coherent methodology for emulating web traffic that can be applied to existing test tools.
Keywords: load testing, workload simulation, web applications, software performance engineering, performance modeling
Related blog posts:
Monday, August 24, 2015
PDQ Version 6.2.0 Released
This minor release is now available for download.
Sunday, July 20, 2014
Continuous Integration Gets Performance Testing Radar
In this webinar, Neil Gunther (Performance Dynamics Company) and Sai Subramanian (Cognizant Technology Solutions) will present a new type of model that is not a simulation, but instead acts like continuous radar that warns developers of potential performance and scalability issues during the CI process. This radar model corresponds to a virtual testing framework that precludes the need for developing performance test scripts or setting up a separate load testing environment. Far from being a mere idea, radar methodology is based on a strong analytic foundation that will be demonstrated by examining a successful case study.
Broadcast Date and Time: Tuesday, July 22, 2014, at 11 am Pacific
Thursday, July 17, 2014
Restaurant Performance Sunk by Selfies
Using the data supplied in the story, I wanted to see how the restaurant performance would look when expressed as a PDQ model. First, I created a summary data frame in R, based on the observed times:
> df
obs.2004 obs.2014
wifi.data 0 5
menu.data 8 8
menu.pix 0 13
order.data 6 6
eat.mins 46 43
eat.pix 0 20
paymt.data 5 5
paymt.pix 0 15
total.mins 65 115
The 2004 situation can be represented schematically by the following queueing network
Referring to Figure 1: