In my book (Chaps. 4--6) and my classes, I discuss the importance of performing load tests in steady state. Steady state means that the SUT (system under test) is in equilibrium, such that all transactions being issued are ultimately serviced correctly, which, in turn, means that no bits are being dropped on the floor (and how do you know they're not?) and there is no build-up of queues causing buffer overflows (which would probably be more obvious).
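As a minimal sketch of how such a check might look (not from the book; the function name, field names, and 5% tolerance are illustrative assumptions):

```python
from statistics import mean

# Illustrative steady-state check; all names and thresholds are assumptions.
def in_steady_state(interval_throughputs, issued, completed, tol=0.05):
    """interval_throughputs: completion rates per sampling interval so far.
    issued / completed: cumulative transaction counts from the load tool."""
    # Nothing dropped on the floor: completions keep pace with submissions.
    if completed < issued * (1.0 - tol):
        return False
    # Need enough samples before declaring anything about equilibrium.
    if len(interval_throughputs) < 10:
        return False
    # Throughput has flattened: the recent window agrees with the overall mean.
    recent = mean(interval_throughputs[-5:])
    overall = mean(interval_throughputs)
    return abs(recent - overall) / overall <= tol
```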
To assess application scalability correctly, you need to measure the steady-state throughput (transactions per hour, connections per second, or whatever rate metric applies). Ultimately, you want to be able to plot each measured average throughput value (X) at each user load point (N), i.e., X(N) vs. N. Similarly for response times R(N). As a rule of thumb, reaching and holding steady state can easily take on the order of 10-15 minutes (depending on the size of databases, caches, and so on). Some performance engineers try to cut corners by running each load level for only a few seconds or a minute. That is a bad idea if you later want to use the same data for quantitative scalability analysis.
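For the plotting step, a minimal Python sketch might look like the following, assuming the steady-state averages have already been collected at each load point (the function and argument names are illustrative, not part of any tool mentioned here):

```python
import matplotlib.pyplot as plt

# Illustrative plotting helper; argument names are assumptions.
def plot_scalability(loads, throughputs, response_times):
    """loads: user load points N; throughputs: steady-state X(N);
    response_times: steady-state R(N), measured at the same N values."""
    fig, (ax_x, ax_r) = plt.subplots(1, 2, figsize=(10, 4))
    ax_x.plot(loads, throughputs, "o-")
    ax_x.set(xlabel="User load N", ylabel="X(N)", title="Throughput vs. load")
    ax_r.plot(loads, response_times, "o-")
    ax_r.set(xlabel="User load N", ylabel="R(N)", title="Response time vs. load")
    fig.tight_layout()
    plt.show()
```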
Available load testing tools include:
- OpenSTA, based on CORBA.
- The Grinder, a Java load testing framework.
- Microsoft's Web Application Stress tool.
- WebLOAD, Open Source edition.
- On the commercial side, Mercury (HP) LoadRunner owns something like 40% of that market.
- The authors only measure (or report in the article) two load levels: N = 25 and N = 50, which is hardly useful for generating a scalability plot.
- They commit the grievous sin of changing more than one thing at once across tests, viz., the user load (N) and the payload size (KB).
They should also have presented the throughput data, since the discussion is about application scalability.
Hello, my name is Harry van der Horst, from the Netherlands.
I would like to add a point: whenever I have been stress testing, we measure the 95th percentile of the response times as a function of the throughput. Both in theory and in practice, the 95th percentile is very sensitive to the early signs of saturation.
I have a rule of thumb that as soon as the 95th percentile of the response time is 200% of the average, disaster is lurking around the corner.
In tools like LoadRunner, monitoring both the average and the 95th percentile is easy.
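As a minimal sketch of that rule of thumb (the function name and sample handling are assumptions, not taken from any particular tool):

```python
# Illustrative check of the 95th-percentile rule of thumb above.
def saturation_warning(response_times):
    """Return True when the 95th percentile of response time reaches
    200% of the average, an early sign of saturation."""
    samples = sorted(response_times)
    p95 = samples[int(round(0.95 * (len(samples) - 1)))]
    avg = sum(samples) / len(samples)
    return p95 >= 2.0 * avg
```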
Also, in the Web world I have used the tool MONIFORCE for observing both real production behaviour and test behaviour.
Is this helpful?