The Pith of Performance: Excel Errors and Other Numerical Nightmares

Wednesday, August 25, 2010

Excel Errors and Other Numerical Nightmares

Although I use Excel all the time, and I strongly encourage my students to use it for performance analysis and CaP, I was forced to include a warranty disclaimer in my GCaP book because I discovered a serious numerical error while writing Appendix B. There, my intention was just to show that Excel gives essentially the same results as Mathematica when using the USL scalability model. It didn't!

This topic came up in the recent Guerrilla Data Analysis Techniques class. Here, I'm repeating the exercise but comparing USLcalc.xls with R instead of Mathematica—the latter two tools being in sufficient agreement. Since the input data set is small, I'll use R's textConnection function, which fools read.table into thinking it's digesting an external file.


input <- read.table(textConnection("p X_p
1   20
4   78
8   130
12  170
16  190
20  200
24  210
28  230
32  260
48  280
64  310"), header=TRUE)
closeAllConnections()

The processor configuration is denoted by p and the corresponding throughput (as described in Chapter 5) is denoted by X_p. Fig. 1 shows the regression results in R. The R script for applying the USL model is available here.

Fig. 1 USL regression against 64-way processors

The excitement really begins when we use the results of the regression fit to make projections about larger configurations. Fig. 2 shows the results of projecting out to p = 500 processors. In particular, it shows the predictions for the location of the maximum in each scalability curve. Excel predicts p_max = 435, whereas R predicts p_max = 288 (we round down): a difference of 50% more hardware according to Excel.

Fig. 2 Projection out to 512-way processors

In this hardware series (SGI Origin NUMA) the backplane did not support more than 64 processors (or boards at that time), and it's clear that scalability is all over, bar the shouting, by p = 100 or 128 processors; all other things being equal. But it doesn't take much imagination to see what happens with potential expansion to p = 256 or 512 processors or cores or GPUs, in future generations. And that's what CaP is about. Similarly, the x-axis could be a representation of N software users rather than hardware processors.

The precision problem arises in the computation of p_max, which is proportional to κ^-1/2. Excel produces a value of κ ~ 10^-6, while R and Mathematica yield κ ~ 10^-5; different by an order of magnitude. Both Mathematica and R use infinite precision in software, so they are less prone to this kind of error. But see the Comments below for more discussion on this point [--njg: Sun, Aug 29, 2010].

Microsoft does acknowledge these issues on their web site, but this is not just an Excel problem. See my previous blog post about wobbly numbers in Perl PDQ, which was not a problem with PDQ, BTW.

Prof. Walter Gander of ETH in Switzerland reveals how similar precision problems can arise in Matlab, although it's not a Matlab problem but an IEEE precision problem arising from the bit-width format used to represent FP numbers in hardware. What's a person to do? I have to agree with the Prof. who says:

In any case I think it is important to teach our students that computing with real numbers deserves special care and attention. It is obviously not enough to write mathematically correct programs - they can also exhibit weird behavior.

In other words, just like any performance analysis, you can never be too careful.

5 comments:

Mike Lawrence said...: It appears that Apple's Numbers.app provides yet a third solution:
y = -9.2150E-05x^2 + 0.0575x + 0.1098; Thursday, August 26, 2010 at 7:29:00 AM PDT
Neil Gunther said...: You must've missed the part in Chap. 5 of my GCaP book where I explain that the y-intercept must be zero, otherwise the scalability problem is under-constrained.; Thursday, August 26, 2010 at 7:50:00 AM PDT
Anonymous said...: I am using Excel 2003 and didn't find nearly as large a discrepancy when solving in similar ways on both platforms.

In Excel I used solver to carry out a least-squares fit of the model (similar to how things were solved in R) to get the parameters alpha=0.0498, beta=1.11e-5. I then used Solver to maximize the function with these paramters and got pmax=292.0.

These results were in reasonable agreement with what I got in R using nls (like you did) to find the parameters alpha=0.0498, beta=1.14e-5 and then using optimize to get pmax = 288.3.

I suspect that the big discrepancy is from using different solution approaches and not from the different accuracy of the tools.; Thursday, August 26, 2010 at 8:47:00 PM PDT
Anonymous said...: BTW, I reran the fit in R using your linear regression approach . The regression coefs from running lm() were 5.003e-02 (linear term) and 4.708e-06 (quadratic term). These gave sigma = 0.050023 and kappa = 4.708e-06. For these parameter values I found pmax = 449.2.

The only difference from your Excel results is that your linear regression coefs were rounded in the spreadsheet. (From reading them off the graph rather than calculating them on the sheet?)

In other words, the only real difference in results is due to the estimation approach (linear regression of transformed equation versus non-linear least squares fit on original equation) and not the tool.

(PS - What I called alpha in the prior post should have been called sigma and what I called beta should have been called kappa.); Thursday, August 26, 2010 at 10:13:00 PM PDT
Neil Gunther said...: Nice little study there. Well done!

So, in this case, it's not so much a question of numerical precision as the method used by Excel under the camouflage of the Add Trend Line fit (linear model) even though we already selected a nonlinear quadratic as the model to be fitted. Scary. At least in R and MMA you see what you're getting.

Looks like caution is still the watchword.; Sunday, August 29, 2010 at 9:13:00 AM PDT