Thursday, March 24, 2011

CMG Atlanta: Southbound for the Deep South

I will be at the CMG Greater Atlanta Spring conference on April 27, 2011. I was asked to cover something for both veterans and newcomers to capacity planning—along the lines of my Guerrilla Boot Camp classes. So, here's what I came up with.

Guerrilla CaP for Noobs and Nerds

Whether you're a newbie (noob) or a veteran (nerd) when it comes to capacity planning (CaP) and performance analysis, it never hurts to revisit the fundamentals. However, some CaP concepts that are touted as fundamental are actually myths. Here are some myths I hear all too often.

What's NOT:

  1. We don't need no stinkin' CaP, just more cheap servers.
  2. CPU utilization should never exceed 70% busy.
  3. A well-consolidated server should have no idle cycles.
  4. Throughput and latency are independent metrics that must be measured separately.
  5. Optimal response time is achieved at the knee of the curve.
  6. If you can measure it, you can manage it.

During my twin session, I will take these myths apart to expose the facts in terms of

What's HOT†:

  1. If the app is single-threaded, a boat-load of cheap servers from China won't help.
  2. A 64-way server running 70% busy is 25% underutilized.
  3. A consolidated server may need to be under 10% busy to meet app SLAs.
  4. Throughput and latency are inversely related ... always! (See the sketch after this list.)
  5. Response time knees are an optical illusion.
  6. All performance measurements are wrong by definition.
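
Item 4, for instance, follows from Little's law applied to a closed interactive system: X = N/(R + Z). Fix the user population N and the think time Z, and throughput X and response time R determine each other. A minimal sketch, with made-up numbers:

```python
# Little's law for a closed interactive system: X = N / (R + Z).
# All numbers below are made up for illustration.

def throughput(n_users, resp_time, think_time):
    """Throughput implied by N users, response time R, and think time Z."""
    return n_users / (resp_time + think_time)

def response_time(n_users, tput, think_time):
    """The same law rearranged: R = N/X - Z."""
    return n_users / tput - think_time

N, Z = 100, 5.0            # users and think time (s)
R = 0.5                    # measured response time (s)
X = throughput(N, R, Z)    # ~18.18 requests/s
print(f"X = {X:.2f} req/s; R recovered = {response_time(N, X, Z):.2f} s")
```
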
Along the way, I'll offer some Guerrilla mantras, as seen in my Guerrilla book and generated automatically on Twitter. You can use them as weapons of mass instruction to bust any other myths held by your colleagues and managers, whether you're a noob or nerd.

† With apologies to Paris Hilton.

Wednesday, March 16, 2011

PDQ Models: Show Me the Code

The road to hell is paved with good intentions

During a recent exchange with a potential PDQ user, which took place as a sequence of comments on my earlier post about getting started with PDQ, I thought we were very close to converging on the appropriate queueing model for his problem, and that it would make a nice addition to the models discussed in my Perl::PDQ book. In fact, I thought the model looked like this:

[Figure: the proposed queueing model]

which involves a feedback flow of requests to the authentication server.
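
As a down payment on the code, here's a minimal sketch of how such a model might look in PDQ's Python binding. All the parameter values are hypothetical placeholders, and the feedback loop is folded into the authentication server's visit count in the usual way: with feedback probability p, the effective demand becomes D/(1 − p).

```python
# Hypothetical sketch: open queueing network with a feedback flow of
# requests to an authentication server. All parameter values are made up.
import pdq

LAMBDA = 10.0   # arrival rate (requests/s)
D_WEB  = 0.02   # web server demand per visit (s)
D_AUTH = 0.03   # authentication server demand per visit (s)
P_FB   = 0.30   # probability a request feeds back to the auth server

pdq.Init("Feedback to Authentication Server")
pdq.CreateOpen("Requests", LAMBDA)

pdq.CreateNode("WebServer",  pdq.CEN, pdq.FCFS)
pdq.CreateNode("AuthServer", pdq.CEN, pdq.FCFS)

pdq.SetDemand("WebServer", "Requests", D_WEB)
# Fold the feedback into the visit count: V = 1/(1 - p), so the
# effective demand at the authentication server is D / (1 - p).
pdq.SetDemand("AuthServer", "Requests", D_AUTH / (1.0 - P_FB))

pdq.Solve(pdq.CANON)
pdq.Report()
```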

Wednesday, March 9, 2011

Hotsos 2011: Brooks, Cooks, Delay and This Just In ...

Thanks to all those who attended my presentation and offered me their compliments afterwards. It was a bit rushed and went slightly wobbly when it came to describing the repairman queueing model (the Apple Genius Bar), but I knew that might happen going in. Despite my best efforts to muddle it at times, it seems people were able to take away a coherent (pun!) message. That was also evident from the excellent audience questions, as well as some of the tweets I've seen. Thank you.

Tuesday, March 8, 2011

Hotsos 2011: Mine the GAPP

It's that time of year again, so here I am in Dallas to present "Brooks, Cooks, and Response Time Scalability," where I will be showing how my universal scalability law (USL) can be applied to quantifying response-time scaling, as opposed to the more typical throughput scaling.
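
Under the usual closed-system assumptions, one way to make the connection concrete is via Little's law: R(N) = N/X(N) − Z, where X(N) = C(N) · X(1) and C(N) is the USL capacity function. A minimal sketch, with made-up parameter values:

```python
# Sketch: turn a USL throughput model into a response-time model via
# Little's law. The alpha, beta, X(1), and think-time values are made up.
def usl_capacity(n, alpha, beta):
    """Universal scalability law: relative capacity C(N)."""
    return n / (1 + alpha * (n - 1) + beta * n * (n - 1))

def response_time(n, x1, alpha, beta, think=0.0):
    """Closed-system response time: R(N) = N / X(N) - Z, X(N) = C(N) * X(1)."""
    x_n = usl_capacity(n, alpha, beta) * x1
    return n / x_n - think

for n in (1, 16, 64, 256):
    r = response_time(n, x1=100.0, alpha=0.05, beta=0.0001)
    print(f"N = {n:3d}  R(N) = {r:.4f} s")
```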

Friday, February 4, 2011

USL Fine Point: Sub-Amdahl Scalability

As discussed in Chapter 4 of my GCaP book, Amdahl's law is defined by a single parameter called the serial fraction, denoted by the symbol α and signifying the proportion of the total workload (W) that is serialized during execution. From the standpoint of parallel processing (where reference to Amdahl's law is most frequent), serialization means that portion of the workload can only execute on a single processor out of N parallel processors. The parallel speedup, or relative capacity, metric $C_A(N)$ is given by: \begin{equation} C_A(N) = \frac{N}{1 + \alpha \, (N-1)} \end{equation}

If there is no serialization in the workload, i.e., α = 0, then $C_A(N) = N$, which signifies that the workload scales linearly with the number of physical processors. The important observation made by Gene Amdahl (more than 40 years ago) is that even if α is relatively small, viz., a few percent of the execution time, scalability cannot continue to increase linearly. For example, if α = 5%, then $C_A(N)$ will eventually reach a scalability ceiling given by 1/α = 20 effective processors, even if there are hundreds of physical processors available in the system.
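
A few lines of Python make that ceiling concrete, using the same 5% serial fraction:

```python
# Amdahl's law: relative capacity C_A(N) = N / (1 + alpha * (N - 1)).
def amdahl(n, alpha):
    return n / (1 + alpha * (n - 1))

ALPHA = 0.05  # the 5% serial fraction from the example above
for n in (1, 8, 32, 128, 1024):
    print(f"N = {n:4d}  C_A(N) = {amdahl(n, ALPHA):6.2f}")
print(f"ceiling = 1/alpha = {1 / ALPHA:.0f} effective processors")
```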

Thursday, January 27, 2011

Idleness Is Not Waste

A common fallacy is to view all idle CPU cycles as wasted server capacity. It's not unusual for management and various bean-counters to display a reluctance to procure new hardware if unused cycles are clearly observable on existing hardware. This puts the pressure on sys admins to reduce idleness. Such is often the case during consolidation efforts: cram as many apps as possible onto a server to soak up every remaining CPU cycle.

All performance analysis and capacity planning is essentially about optimizing resource usage under a particular set of constraints. The fallacy is treating maximization as optimization. This mistake is further exacerbated if only one performance metric, i.e., CPU utilization, is taken into account: a common situation promoted by the superficiality of performance dashboards. Maximization doesn't necessarily mean 100% utilization, either: even if some CPU capacity is retained as headroom for workload growth, the tendency to "redline" what remains can still prevail.

You can't optimize a single number. Server utilization has to be optimized with respect to other measures, e.g., application response-time targets. We know from simple queueing theory that response time increases nonlinearly (the proverbial "hockey stick") with increasing server utilization. If the response-time goals are being met at 10% CPU busy, pre-consolidation, then they will almost certainly be violated at the higher utilizations seen post-consolidation. The response-time metric is an example of a cost that has to be taken into account to satisfy all the constraints of the optimized capacity plan.
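
The hockey stick is easy to reproduce. For a single queueing center under M/M/1 assumptions, R = S/(1 − ρ), where S is the mean service time and ρ is the server utilization. A minimal sketch, with a made-up 100 ms service time:

```python
# M/M/1 response time: R = S / (1 - rho). The 100 ms service time is a
# made-up value; rho is the server (CPU) utilization.
S = 0.100  # mean service time (s)
for rho in (0.10, 0.50, 0.70, 0.90, 0.95):
    print(f"rho = {rho:.2f}  R = {S / (1 - rho) * 1000:6.1f} ms")
```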

Maximizing server utilization is as foolhardy as maximizing revenue. Both goals look attractive on their face, but if you don't keep track of outgoing CapEx and OpEx costs incurred to generate revenue, you could lose the company!
