In the meantime, yesterday, I attended Gerwin Hendriksen's talk "GAPP Improvements: a Method to Diagnose and Predict Performance in Complex Architectures," where he updated us on how he has now applied his approach to both benchmark and production data. This was a good presentation because Gerwin has done a lot of work to demonstrate that his approach is valid. When I first heard his original Hotsos presentation in 2008, I understood the concepts but I was skeptical because it sounded too good to be true. It wasn't clear to me that he could really deliver on his promise—maybe there were gaps in the GAPP?
BTW: GAPP is Gerwin's acronym for “General Approach Performance Profiling” and should not be confused with GCaP: Guerrilla Capacity Planning. GAPP reminds me of traveling on the London underground railway, where you find yourself assailed by an announcement repeatedly warning you to "Mind the gap" (the gap between the platform and the train doors).
It's difficult for us mortals to grok what Gerwin is doing with GAPP because it's a kind of intergalactic synthesis of several disparate modeling techniques, which he combines in a unique way to produce both bottleneck analysis and response time projections ... seemingly out of nowhere! I say nowhere, because GAPP does not require special data collection agents, intrusive tracing hooks in the application or special profiling tools. Moreover, GAPP doesn't care whether you want to do capacity planning for a simple system, like your laptop (which Gerwin demo-ed in his talk), or an extremely complex, multi-tiered environment, like Facebook.com—the same methodology applies. Sounds a bit incredible, doesn't it?
Here's my nutshell version of why it works:
- The performance data sources are non-specific
- This apparent arbitrariness is offset by the application of robust statistical techniques
- Multivariate linear regression (that's count 1) is used to estimate the response times of selected applications
$$\hat{R}_A = \alpha_1 R_{1A} + \alpha_2 R_{2A} + \cdots + \alpha_n R_{nA} + \beta \qquad (1)$$
- On the left-hand side of eqn.(1), $\hat{R}_A$ is the response-time estimator for application-A (or B, C, etc.)
- On the right, the regressors ($R_{1A}, R_{2A}, \ldots$) in eqn.(1) represent, for example, residence times for application-A on different tiers
- GAPP determines which regressors are most significant by applying factor analysis (that's count 2)
- These residence times are further modeled using the Erlang-C formula (that's count 3)
- Capacity planning projections are facilitated because GAPP also determines how these regressors vary with utilization (load)
- Multilinear regression (See Section 8.6.1 "Multivariate Regression of Daily Data" in my GCaP book for a simpler example)
- Queueing theory (See my Perl::PDQ book)
- Factor analysis (covered in the Guerrilla Data Techniques class)
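To make two of those three counts concrete, here is a minimal sketch, and emphatically not Gerwin's implementation. It assumes each tier behaves like an M/M/m queue, computes per-tier residence times with the Erlang-C formula, and then recovers the coefficients of eqn.(1) by ordinary least squares. The tier sizes, service times, utilizations and the "measured" response times are all synthetic values invented for illustration, and the factor-analysis step is omitted entirely.

```python
from math import factorial

def erlang_c(m, rho):
    """Erlang-C: probability an arrival has to queue in an M/M/m system."""
    if not 0 <= rho < 1:
        raise ValueError("per-server utilization must be in [0, 1)")
    a = m * rho                                   # offered load in Erlangs
    tail = a**m / factorial(m) / (1 - rho)
    head = sum(a**k / factorial(k) for k in range(m))
    return tail / (head + tail)

def residence_time(m, service_time, rho):
    """Mean residence time (waiting + service) at an M/M/m tier."""
    return service_time + erlang_c(m, rho) * service_time / (m * (1 - rho))

def fit_linear(X, y):
    """Least squares via the normal equations; returns [alpha_1..alpha_n, beta]."""
    rows = [list(x) + [1.0] for x in X]           # append intercept column
    n = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(M[k][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for k in range(n):
            if k != col:
                f = M[k][col] / M[col][col]
                M[k] = [u - f * v for u, v in zip(M[k], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Hypothetical two-tier system (times in milliseconds): a 4-server tier with
# 20 ms service time and a 2-server tier with 50 ms service time.
X = [(residence_time(4, 20.0, u1), residence_time(2, 50.0, u2))
     for u1 in (0.2, 0.5, 0.8) for u2 in (0.3, 0.6, 0.9)]

# Synthetic "measured" response times: both tiers weighted 1.0 plus a
# constant 10 ms overhead (the beta term in eqn.(1)).
y = [1.0 * r1 + 1.0 * r2 + 10.0 for r1, r2 in X]

coeffs = fit_linear(X, y)                         # [alpha_1, alpha_2, beta]
print(coeffs)
```

Because the synthetic data are noise-free, the fit should return the weights and overhead that were baked in. With real measurements you would expect scatter, and that is where the statistical care in GAPP earns its keep. The same `residence_time` function can then be evaluated at a higher utilization to sketch a capacity projection.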
This is how I grill people (in the Bar & Grill)
It was for that reason that I grilled Gerwin after his 2008 Hotsos presentation. I think it's fair to say that discussion provided him with a good deal of impetus to perform the additional work he presented in yesterday's talk. But even that was not enough. It took another six hours of grilling him yesterday afternoon (post presentation), and plying him with another $60 worth of wine, to see that Gerwin really does understand how it all works in detail. That, after all, is what makes Hotsos such a valuable symposium.
So, if you didn't attend his talk and would like to see for yourself, you can download his more thorough whitepaper. Perhaps this blog post will act as a brief guide for mining Gerwin's GAPP. While taking the tour, however, do mind the intellectual leaps ... or gaps.