Comments on The Pith of Performance: Assessing USL Scalability with Mixed Business Functions
Neil Gunther (http://www.blogger.com/profile/11441377418482735926)
Feed updated 2018-01-09T07:18:41.069-08:00

Comment posted 2009-04-15T12:59:00.000-07:00

This seemingly simple question is, in fact, quite deep and worth a blog post in its own right. I will endeavor to get around to that. In the meantime, let me give you the nickel version.

Nominally, I'm saying you should have half a dozen load pts. Sometimes, in GCaP classes, I've been pressed for a lower number and responded that 4 is the rock-bottom limit. In the GCaP book I show the difference in USL prediction with 4 data points.

The general argument goes something like this. If we were curve fitting in the sense of splines (which, I stress, we're not), the simplest fit (model) would be the one with the least number of extrema and pts of inflection:

1 pt fit for nothing
2 pts fit with straight line
3 pts fit with quadratic (degree-2 polynomial)
4 pts fit with degree-3 polynomial

and so on. We can always assume we have the trivial pt at the origin: zero throughput @ zero load (N = 0). The 2 main cases where statistical regression is used are:

1. Determine a model from a set of equations: linear, quadratic, log, logistic, etc.

2. Have a model (e.g., USL), determine its coefficients (parameters).

The USL model is a *rational function*, not a polynomial. Note, therefore, that it does not appear among the Excel model choices, because fitting a rational function is tricky. R and Mathematica can do it. However, 2-3 pts could appear very linear for USL (see previous blog post). 4 pts could also fit too well!
Even R^2 and the residuals might look good.

6 pts is less likely to fit like a spline.

We can't apply confidence intervals w/o multiple sets of measurements (runs). Typically, this is not done on a test rig with LR ("no time" is the usual excuse). Strangely, multiple data sets may be more available on a prod system by looking at the same windows on different days. Let me know if you try that approach.

To answer your question about error estimation, that is essentially a question about *sensitivity analysis*. I am not an expert in that area, but everything I've seen in the published literature suggests you need many repeated trials or runs to use the conventional sensitivity analysis techniques.

In lieu of that, I would suggest that the difference between 4 and 6 data points is best determined by fitting both cases and comparing the results (assuming you have 6 data pts).

Hope that gives you enough to go on with until I find time to post something more extensive. Thanks for asking such a great question.

Neil Gunther (https://www.blogger.com/profile/11441377418482735926)

Comment posted 2009-04-14T12:58:00.000-07:00

Thank you for sharing this post to confirm the use of production steady-state data with USL. I had planned on (and assumed) using steady-state data, regardless of environment type, as steady state is steady state. One question, though: in the book you mentioned that at least 4 data points are needed, but in the blog you mentioned 6. Can you elaborate on this? I assumed there would be a larger error with 4 data points, but are there ways we can estimate the error? I am thinking of confidence intervals, but would like your insight on the error estimation. Thanks.

metasoft (https://www.blogger.com/profile/17149213781391733478)
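
The suggestion in the reply, fitting the 4-point and 6-point cases and comparing them, can be sketched roughly as follows. This is only an illustration under stated assumptions, not necessarily how the fit is done in the book or in R. It assumes the usual USL form for relative capacity, C(N) = N / (1 + alpha (N - 1) + beta N (N - 1)), with throughput already normalized so that C(1) = 1. The symbols `alpha` and `beta`, the function names `usl`, `fit_usl` and `peak_load`, and all the data values are hypothetical. To stay dependency-free, the fit uses the transform N/C - 1 = alpha (N - 1) + beta N (N - 1), which is linear in the two coefficients; this weights errors differently from a direct nonlinear fit of the rational function.

```python
import math


def usl(n, alpha, beta):
    """Relative capacity C(N) = N / (1 + alpha*(N-1) + beta*N*(N-1))."""
    return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1))


def fit_usl(loads, capacities):
    """Least-squares estimate of (alpha, beta) on transformed data.

    Uses y = N/C - 1 = alpha*(N-1) + beta*N*(N-1), which is linear in
    alpha and beta, and solves the 2x2 normal equations directly.
    Needs at least two distinct loads with N > 1.
    """
    u = [n - 1 for n in loads]
    v = [n * (n - 1) for n in loads]
    y = [n / c - 1.0 for n, c in zip(loads, capacities)]
    suu = sum(a * a for a in u)
    suv = sum(a * b for a, b in zip(u, v))
    svv = sum(b * b for b in v)
    suy = sum(a * t for a, t in zip(u, y))
    svy = sum(b * t for b, t in zip(v, y))
    det = suu * svv - suv * suv
    if det == 0:
        raise ValueError("need at least two distinct loads with N > 1")
    alpha = (suy * svv - svy * suv) / det
    beta = (svy * suu - suy * suv) / det
    return alpha, beta


def peak_load(alpha, beta):
    """Load at which C(N) peaks, sqrt((1 - alpha) / beta), or None if no peak."""
    if beta <= 0 or alpha >= 1:
        return None
    return math.sqrt((1.0 - alpha) / beta)


# Hypothetical data: a known USL curve with a few percent of made-up wobble.
TRUE_ALPHA, TRUE_BETA = 0.05, 0.001
loads = [1, 4, 8, 16, 32, 64]
wobble = [1.00, 0.98, 1.03, 0.97, 1.02, 0.99]
measured = [usl(n, TRUE_ALPHA, TRUE_BETA) * w for n, w in zip(loads, wobble)]

# Fit the first 4 load points, then all 6, and compare the coefficients.
for k in (4, 6):
    a, b = fit_usl(loads[:k], measured[:k])
    print("%d pts: alpha=%.4f beta=%.5f peak N=%s" % (k, a, b, peak_load(a, b)))
```

With real measurements, the gap between the two sets of coefficients (and the two predicted peak loads) gives a rough feel for how sensitive the prediction is to the extra two load points. It is not a substitute for confidence intervals from repeated runs.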