| Nines | Percent | Downtime/Year | σ Level |
|-------|---------|---------------|---------|
| 4 | 99.99% | 52.596 minutes | ≈4σ |
| 5 | 99.999% | 5.2596 minutes | - |
| 6 | 99.9999% | 31.5576 seconds | ≈5σ |
| 7 | 99.99999% | 3.15576 seconds | - |
| 8 | 99.999999% | 315.576 milliseconds | ≈6σ |
In this way, people like to talk about achieving "5 nines" availability or a "six sigma" quality level. These phrases are often bandied about without appreciating:
- that nines and sigmas refer to similar criteria.
- that high nines and high sigmas are very difficult to achieve consistently.
To arrive at the 3rd column of numbers in the table, you can use the following R function to see how much shorter the allowable downtime per year becomes with each additional 9. Hence, the term tyranny.
downt <- function(nines, tunit = c('s','m','h')) {
  tunit <- match.arg(tunit)  # collapse the default vector to a single unit ('s' if unspecified)
  ds <- 10^(-nines) * 365.25*24*60*60  # downtime in seconds per Julian year
  if(tunit == 's') { ts <- 1;    tu <- "seconds" }
  if(tunit == 'm') { ts <- 60;   tu <- "minutes" }
  if(tunit == 'h') { ts <- 3600; tu <- "hours" }
  return(sprintf("Downtime per year at %d nines: %g %s", nines, ds/ts, tu))
}
> downt(5,'m')
[1] "Downtime per year at 5 nines: 5.2596 minutes"
> downt(8,'s')
[1] "Downtime per year at 8 nines: 0.315576 seconds"
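For readers without R at hand, the same arithmetic can be sketched in Python. The `downtime_per_year` helper is hypothetical, written only for this illustration:

```python
# Downtime per year implied by N nines of availability.
# Uses a Julian year (365.25 days), matching the R function above.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # 31,557,600 seconds

def downtime_per_year(nines, unit='s'):
    """Return the downtime per year, in the given unit, at `nines` nines."""
    seconds = 10 ** (-nines) * SECONDS_PER_YEAR
    scale = {'s': 1, 'm': 60, 'h': 3600}[unit]
    return seconds / scale

print(downtime_per_year(5, 'm'))  # roughly 5.26 minutes
print(downtime_per_year(8, 's'))  # roughly 0.316 seconds
```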
The associated σ levels correspond to the area under the Normal (Gaussian) or "bell shaped" curve lying within k standard deviations of the mean (μ) for a kσ level. The σ refers to the standard deviation in the usual way. The corresponding area under the Normal curve can be calculated using the following R function:
library(NORMT3)  # provides erf()
sigp <- function(sigma) {
  sigma <- as.integer(sigma)
  apc <- Re(erf(sigma/sqrt(2)))  # NORMT3's erf returns a complex value; keep the real part
  return(sprintf("%d-sigma bell area: %10.8f%%; Prob(chance): %e", sigma, apc*100, 1-apc))
}
> sigp(2)
[1] "2-sigma bell area: 95.44997361%; Prob(chance): 4.550026e-02"
> sigp(5)
[1] "5-sigma bell area: 99.99994267%; Prob(chance): 5.733031e-07"
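The NORMT3 dependency isn't essential; Python's standard library exposes the error function directly as `math.erf`, so the same areas can be checked with a short sketch (the `sigma_area` name is mine, not from the post):

```python
import math

def sigma_area(sigma):
    """Fraction of the area under the Normal curve within +/- sigma of the mean."""
    return math.erf(sigma / math.sqrt(2))

for s in (2, 5, 6):
    area = sigma_area(s)
    print(f"{s}-sigma bell area: {area * 100:.8f}%; Prob(chance): {1 - area:e}")
```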
So, 5σ corresponds to slightly more than 99.9999% of the area under the bell curve (the total area being 100%). It also corresponds closely to six 9s availability. The 2nd number computed by sigp is the probability that the achieved availability was a fluke. A reasonable mnemonic for some of these values:
- 3σ corresponds roughly to a 3 in 1,000 probability that the observed availability occurred by chance.
- 5σ is roughly a 1 in a million chance, which is like flipping a fair coin and getting 20 heads in a row.
- 6σ is roughly a 1 in a billion chance that it was a fluke.
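The coin-flip comparison is easy to check, assuming a fair coin and independent flips:

```python
# Probability of 20 heads in a row from a fair coin: (1/2)^20.
p_20_heads = 0.5 ** 20
print(p_20_heads)             # 9.5367431640625e-07, i.e. roughly 1 in a million
print(round(1 / p_20_heads))  # 1048576

# Compare with the 5-sigma tail probability of about 5.7e-07 computed above.
```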
How often are distributions - even of sample means - normal out to 6-sigmas?
How often is it the case that actual data distributions - even of sample averages - are sufficiently close to normal out to 6 sigma that those percentages are at all meaningful?
That, of course, is part of the point of this post: big talk, little measurement.
Major web sites often do have enough data to support claims about 3 to 5 nines availability.
As an aside, particle physics measurements require 5σ levels to be considered valid. Current measurements at the LHC, regarding the existence of the Higgs boson, are more like 2σ.
My apologies about posting the question twice. The first time, the message I got seemed to suggest it hadn't posted.
No worries.
I could delete one, but I would have to get really worked up about it. :)
Here's another interesting tidbit: how does the expected deviation of a series, conditional upon being >= 6 sigma, differ from that at any other sigma level?
I'll see that and raise you 5 ...sigma? :)
There's something of a technical contradiction in attempting to apply "six sigma" improvement methods (e.g., SPC) to computer performance data. The assumption is that the samples are discrete point events (defects), whereas almost all computer performance data are time series in which the events have already been averaged over some predefined sampling interval.
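That averaging effect is easy to demonstrate with simulated data: means of n points have a standard deviation of roughly σ/√n, so pre-averaged performance data look far less variable than the underlying point events. A hypothetical simulation, not from the comment:

```python
import random
import statistics

random.seed(42)

# 1000 "minutes", each containing 60 one-second measurements ("point events")
# drawn from a Normal with mean 100 and standard deviation 10.
raw = [[random.gauss(100, 10) for _ in range(60)] for _ in range(1000)]

points = [x for minute in raw for x in minute]    # 60,000 raw events
minute_means = [statistics.mean(m) for m in raw]  # 1,000 pre-averaged values

sd_points = statistics.stdev(points)       # close to the true sigma of 10
sd_means = statistics.stdev(minute_means)  # close to 10/sqrt(60), about 1.3

print(sd_points, sd_means, sd_points / sd_means)
```

Sigma limits computed from the pre-averaged series are therefore several times tighter than the variation the raw events actually exhibit.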
This just in from Twitter...
@OReillyMedia O'Reilly Media
100% uptime. It's not just about technology. CIO @Reichental discusses the value of good change management: oreil.ly/mTlYDw
Salesforce.com is holding a big conference in downtown San Francisco at the moment (30,000 attendees) and I happened across this comment in their Wikipedia entry:
"The service has suffered some downtime; during an outage in January 2009 services were unavailable for at least 40 minutes, affecting thousands of businesses."
But that's about four 9s availability!
Of course, they couldn't afford another outage in 2009 in order to maintain that availability level. And that level is the statistical mean, which says nothing about variance. :)
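That four-9s figure is a quick calculation, assuming the 40-minute outage was the year's only downtime:

```python
# Availability implied by a single 40-minute outage in a Julian year.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes

availability = 1 - 40 / MINUTES_PER_YEAR
print(f"{availability:.6%}")  # roughly 99.9924%, i.e. about four 9s
```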
Hey,
Very nice site. I came across this on Google, and I am stoked that I did.
I will definitely be coming back here more often.
Wish I could add to the conversation and bring a bit more to the table, but am just taking in as much info as I can at the moment.
Hi Carl,
Thank you and welcome!
BTW, this sigmas business is going to become a hot topic on the web as of tomorrow when CERN makes their public statement about the status of the latest Higgs data from the LHC.
For example, and quoting:
"... 5 sigma, meaning that it has just a 0.00006% chance of being wrong. The ATLAS and CMS experiments are each seeing signals between 4.5 and 5 sigma, just a whisker away from a solid discovery claim."
5 sigma is the minimum bar for particle physics data. It's also just as well to keep in mind that the confidence level doesn't tell the whole story.
Last year, a different CERN-related experiment was seeing superluminal neutrinos (i.e., Einstein busters) at the 6-sigma level. After eventually finding the loose connector in their detector, those revolutionary neutrinos suddenly disappeared: http://is.gd/WWaj7F
Forearmed by that fiasco, the Higgs boys are very unlikely to have that problem, but they are still going to have to demonstrate that their 4+ sigma bumps are really the Higgs.