Sunday, May 20, 2018

USL Scalability Modeling with Three Parameters

NOTE: Annoyingly, the remote MathJax server often takes its sweet time rendering LaTeX equations (like, maybe a minute!!!). I don't know if this is deliberate on the part of Google or a bug. It used to be faster. If anyone knows, I'd be interested to hear, especially if there is a way to speed it up. And no, I'm not planning to move to WordPress.

The 2-parameter USL model

The original USL model, presented in my GCAP book and updated in the blog post How to Quantify Scalability, is defined in terms of two fitting parameters, $\alpha$ (contention) and $\beta$ (coherency):
\begin{equation}
X(N) = \frac{N \, X(1)}{1 + \alpha \, (N - 1) + \beta \, N (N - 1)} \label{eqn: usl2}
\end{equation}
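
To make the 2-parameter equation concrete, here is a minimal Python sketch that evaluates it at a few loads. The function name `usl_throughput` and the values of `X1`, `ALPHA`, and `BETA` are hypothetical, chosen only for illustration; they are not measurements from any real system.

```python
def usl_throughput(n, x1, alpha, beta):
    """Throughput X(N) predicted by the 2-parameter USL.

    n     -- load N (e.g., number of users or processors)
    x1    -- measured throughput at N = 1, i.e., X(1)
    alpha -- contention parameter
    beta  -- coherency parameter
    """
    return n * x1 / (1.0 + alpha * (n - 1) + beta * n * (n - 1))


# Hypothetical values, for illustration only.
X1, ALPHA, BETA = 100.0, 0.05, 0.001

for n in (1, 2, 4, 8, 16, 32):
    print(n, round(usl_throughput(n, X1, ALPHA, BETA), 1))
```

With $\alpha = \beta = 0$ the denominator is 1 and the function reduces to linear scaling, $X(N) = N \, X(1)$, which is a quick sanity check on any implementation.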

Fitting this nonlinear USL model to data requires several steps:

  1. normalizing the throughput data, $X(N)$, to determine the relative capacity, $C(N) = X(N) / X(1)$.
  2. noting that equation (\ref{eqn: usl2}) is then equivalent to $X(N) = C(N) \, X(1)$.
  3. if the $X(1)$ measurement is missing or simply not available (as is often the case with data collected from production systems), interpolating its value using the elaborate technique described in the GCAP book.
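
The normalization and fitting steps above can be sketched in plain Python. Note the assumption: instead of a nonlinear regression, this sketch uses the fact that the 2-parameter USL can be rearranged as $N / C(N) - 1 = \alpha \, (N - 1) + \beta \, N (N - 1)$, which is linear in $\alpha$ and $\beta$ and can be solved by ordinary least squares. That rearrangement weights the measurement errors differently from a direct nonlinear fit, so treat it as an illustration rather than a replacement. The function names and the synthetic data are hypothetical, and the sketch assumes an $X(1)$ measurement is present.

```python
def normalize(loads, throughputs):
    """Step 1: relative capacity C(N) = X(N) / X(1).

    Assumes the data include a measurement at N = 1.
    """
    x1 = dict(zip(loads, throughputs))[1]
    return [x / x1 for x in throughputs]


def fit_usl2(loads, capacities):
    """Least-squares estimate of (alpha, beta) from the linearized USL:

        N / C(N) - 1 = alpha * (N - 1) + beta * N * (N - 1)
    """
    suu = suv = svv = suy = svy = 0.0
    for n, c in zip(loads, capacities):
        u = n - 1            # regressor multiplying alpha
        v = n * (n - 1)      # regressor multiplying beta
        y = n / c - 1        # deviation from linear scaling
        suu += u * u
        suv += u * v
        svv += v * v
        suy += u * y
        svy += v * y
    # Solve the 2x2 normal equations by Cramer's rule.
    det = suu * svv - suv * suv
    alpha = (suy * svv - suv * svy) / det
    beta = (suu * svy - suv * suy) / det
    return alpha, beta


# Synthetic, noise-free data generated from known (hypothetical) parameters.
TRUE_ALPHA, TRUE_BETA, X1 = 0.05, 0.001, 100.0
loads = [1, 2, 4, 8, 16, 32]
throughputs = [
    n * X1 / (1 + TRUE_ALPHA * (n - 1) + TRUE_BETA * n * (n - 1))
    for n in loads
]

capacities = normalize(loads, throughputs)
alpha_hat, beta_hat = fit_usl2(loads, capacities)
print(alpha_hat, beta_hat)
```

Because the synthetic data contain no noise, the fit should recover the generating parameters to within floating-point error. With real measurements the estimates will differ from those of a nonlinear fit, and a missing $X(1)$ would have to be handled first.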

The motivation for a 2-parameter model arose out of a desire to meet the twin goals of:

  1. providing each term of the USL with a proper physical meaning, i.e., not treating the USL like a conventional multivariate statistical model (statistics is not math)
  2. satisfying the von Neumann criterion: a minimal number of modeling parameters

Last year, I realized the 2-parameter constraint is actually overly severe. Introducing a third parameter would make the statistical fitting process even more universal, as well as simplify the overall procedure. For the USL particularly, the von Neumann criterion should not be taken too literally. It's really more of a guideline: fewer is generally better.