Friday, September 28, 2007

SOA Scalability and Steady-State

Guerrilla alumnus Peter Lauterbach just brought to my attention an article in SOA World entitled "Load Testing Web Services". I have to commend the authors for performing their SOA load tests in steady state. Elsewhere, I've discussed how badly things can go when you don't adhere to this procedure. In their online article, the authors show the response time (R) as a time-series plot, more or less as it would appear in a measurement tool like, say, LoadRunner. Although they don't show it, the throughput measurements would look similar when plotted as a function of time (t).


In my book (Chaps. 4--6) and my classes, I discuss the importance of performing load tests in steady state. Steady state means that the SUT (system under test) is in equilibrium, such that all transactions being issued are ultimately serviced correctly. That, in turn, means no bits are getting dropped on the floor (and how do you know they're not?) and there is no build-up of queues to cause buffer overflows (which would probably be more obvious).
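As a sanity check, both conditions can be verified from the raw test log: every transaction that was issued eventually completed, and the completion rate holds roughly constant over successive measurement windows. Here is a minimal sketch in Python, assuming each transaction is recorded as a (submit_time, complete_time) pair with None marking a request that never completed; the data layout and tolerance are illustrative, not taken from any particular tool.

```python
# Crude steady-state check on a load-test log (illustrative sketch).
# transactions: list of (submit_time, complete_time) tuples in seconds;
# complete_time is None if the transaction was dropped.

def steady_state_check(transactions, window=60.0, tolerance=0.05):
    """True if nothing was dropped and windowed throughput is stable."""
    # 1. Conservation check: every issued transaction completed.
    if any(t[1] is None for t in transactions):
        return False

    # 2. Stability check: throughput per window should not drift by more
    #    than `tolerance` around its mean over the whole run.
    completes = sorted(t[1] for t in transactions)
    start, end = completes[0], completes[-1]
    n_windows = max(int((end - start) // window), 1)
    counts = [0] * n_windows
    for c in completes:
        idx = min(int((c - start) // window), n_windows - 1)
        counts[idx] += 1
    rates = [c / window for c in counts]
    mean_rate = sum(rates) / len(rates)
    return all(abs(r - mean_rate) <= tolerance * mean_rate for r in rates)
```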

To assess application scalability correctly, you need to measure the steady-state throughput (transactions per hour, connections per second, or whatever rate metric applies). Ultimately, you want to be able to plot each measured average throughput value (X) at each user load-point (N), i.e., X(N) vs. N. Similarly for response times R(N). As a rule of thumb, reaching and holding steady state can easily take on the order of 10-15 minutes (depending on the size of databases, caches, and so on). Some performance engineers try to cut corners by running each load level for only a few seconds or a minute. This is a bad idea if you later want to use the same data for quantitative scalability analysis.
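For example, once the ramp-up and warm-up intervals have been discarded, the remaining per-interval throughput readings at each load level can be collapsed into a single steady-state average, ready for an X(N) vs. N plot. A small Python sketch, with made-up numbers and illustrative names:

```python
# samples maps a user load N to the throughput readings collected during
# the steady-state portion of that run (ramp-up already discarded).
# All names and values here are hypothetical.

def scalability_points(samples):
    """Return [(N, X(N)), ...] sorted by load level."""
    return sorted((n, sum(xs) / len(xs)) for n, xs in samples.items())

samples = {
    25: [181.2, 179.8, 180.5],   # throughput readings at N = 25 vusers
    50: [243.1, 240.7, 242.0],   # throughput readings at N = 50 vusers
}

for n, x in scalability_points(samples):
    print(f"N = {n:3d}  X(N) = {x:.1f} tps")
```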

Although the authors have done a good job from the standpoint of steady-state measurements, Peter correctly notes two limitations that are also common mistakes:
  1. The authors measure (or at least report in the article) only two load levels: N = 25 and N = 50, which is hardly enough to generate a scalability plot.
  2. They commit the grievous sin of changing more than one thing at once across tests, viz., the user load (N) and the payload size (in KB).

They should also have exhibited the throughput data, since the discussion is about application scalability.

1 comment:

harry van der horst said...

Hello,
my name is Harry van der Horst, from the Netherlands.
I would like to add a point: whenever I have been stress testing, we measure the 95th percentile of the response times as a function of the throughput. Both in theory and in practice, the 95th percentile is very sensitive to the early signs of saturation.
I have a rule of thumb that as soon as the 95th percentile of the response time reaches 200% of the average, disaster is lurking around the corner.
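A minimal sketch of that check in Python, assuming the raw response times are available as a plain list (the function name and threshold are just illustrative):

```python
# Flag early saturation when the 95th percentile of response time
# exceeds 200% of the mean (the rule of thumb described above).

def p95_warning(response_times):
    """Return True if R95 > 2 * mean(R)."""
    rt = sorted(response_times)
    mean = sum(rt) / len(rt)
    p95 = rt[int(0.95 * (len(rt) - 1))]
    return p95 > 2.0 * mean
```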

In tools like LoadRunner, monitoring both the average and the 95th percentile is easy.
Also, in the Web world I have used the tool MONIFORCE for observing both real production behaviour and test behaviour.

Is this helpful?