In a previous post, I applied my rules-of-thumb for response time (RT) percentiles (or, more accurately, residence time in queueing theory parlance), viz., the 80th percentile $R_{80}$, the 90th percentile $R_{90}$, and the 95th percentile $R_{95}$, to a cellphone application and found that the performance measurements were not completely consistent. Since the relevant data only appeared in a journal blog, I didn't have enough information to resolve the discrepancy, which is fine. The first job of the performance analyst is to flag performance anomalies and, most probably, let others resolve them. After all, I didn't build the system or collect the measurements.
More importantly, that analysis was for a single-server application (viz., time-to-first-fix latency). At the end of that post, I hinted at adding percentiles to PDQ for multi-server applications. Here, I present the corresponding rules-of-thumb for the more ubiquitous multi-server or multi-core case.
Single-server Percentiles
First, let's summarize the Guerrilla rules-of-thumb for single-server percentiles (M/M/1 in queueing parlance):
\begin{align}
R_{1,80} &\simeq \dfrac{5}{3} \, R_{1} \label{eqn:mm1r80}\\
R_{1,90} &\simeq \dfrac{7}{3} \, R_{1}\\
R_{1,95} &\simeq \dfrac{9}{3} \, R_{1} \label{eqn:mm1r95}
\end{align}
where $R_{1}$ is the statistical mean of the measured or calculated RT and $\simeq$ denotes approximately equal. A useful mnemonic device is to notice the numerical pattern for the fractions. All denominators are 3 and the numerators are successive odd numbers starting with 5.
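
As a quick sanity check on these fractions, here is a minimal Python sketch. It assumes the standard M/M/1 result that the residence time is exponentially distributed with mean $R_{1}$, so the exact $p$-th percentile is $R_{1} \ln\left(1/(1-p)\right)$. The function and variable names are illustrative only and are not part of PDQ.

```python
import math

# Rules-of-thumb multipliers from the equations above: percentile -> fraction of R_1
RULES_OF_THUMB = {0.80: 5 / 3, 0.90: 7 / 3, 0.95: 9 / 3}


def exact_percentile_factor(p):
    """Multiplier k such that the p-th percentile equals k * mean.

    Assumes an exponentially distributed residence time (M/M/1, FCFS).
    """
    return -math.log(1.0 - p)


def compare(mean_rt=1.0):
    """Return (p, rule-of-thumb value, exact value, relative error) per percentile."""
    rows = []
    for p, rule in RULES_OF_THUMB.items():
        approx = rule * mean_rt
        exact = exact_percentile_factor(p) * mean_rt
        rows.append((p, approx, exact, (approx - exact) / exact))
    return rows


if __name__ == "__main__":
    for p, approx, exact, err in compare():
        print(f"R_{round(p * 100)}: rule={approx:.3f}  exact={exact:.3f}  error={err:+.2%}")
```

Under that exponential assumption, the exact multipliers are $\ln 5$, $\ln 10$ and $\ln 20$, and each rule-of-thumb fraction overestimates its exact value by less than 4%, with the agreement tightening at the higher percentiles.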