In my book (Chaps. 4–6) and my classes, I discuss the importance of performing load tests in steady state. Steady state means that the SUT (system under test) is in equilibrium: every transaction being issued is ultimately getting serviced correctly. That, in turn, means that no bits are getting dropped on the floor (and how do you know they're not?) and that no queues are building up to cause buffer overflows (which would probably be more obvious).
To assess application scalability correctly, you need to measure the steady-state throughput (transactions per hour, connections per second, or whatever rate metric applies). Ultimately, you want to be able to plot the measured average throughput (X) at each user load point (N), i.e., X(N) vs. N. The same goes for response times, R(N). As a rule of thumb, reaching and holding steady state can easily take on the order of 10-15 minutes (depending on the size of databases, caches, and so on). Some performance engineers try to cut corners by running each load level for only a few seconds or a minute. This is a bad idea if you later want to use the same data for quantitative scalability analysis.
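As a rough illustration of the measurement described above, here is a minimal Python sketch that averages throughput over the steady-state portion of a run only, and then collects the X(N) points for plotting. It assumes you can export transaction completion timestamps (in seconds from the start of each run) from your load tool. The function names, the input layout, and the idea of discarding a fixed warm-up period are my assumptions for the example, not features of any particular tool.

```python
def steady_state_throughput(completion_times, warmup_s, duration_s):
    """Mean throughput (transactions per second) over [warmup_s, duration_s).

    completion_times: seconds since the start of the run at which each
    transaction completed (hypothetical export from a load tool).
    """
    window = duration_s - warmup_s
    if window <= 0:
        raise ValueError("warm-up must be shorter than the run")
    # Count only completions inside the steady-state window.
    completed = sum(1 for t in completion_times if warmup_s <= t < duration_s)
    return completed / window


def scalability_points(runs, warmup_s, duration_s):
    """Return (N, X(N)) pairs, sorted by N, from a dict of N -> timestamps."""
    return [
        (n, steady_state_throughput(times, warmup_s, duration_s))
        for n, times in sorted(runs.items())
    ]


if __name__ == "__main__":
    # Synthetic data: one completion every 0.5 s for a 20-minute run.
    times = [i * 0.5 for i in range(2400)]
    # Discard the first 5 minutes as warm-up, average over the remaining 15.
    print(steady_state_throughput(times, warmup_s=300, duration_s=1200))
```

A fixed warm-up cutoff is the simplest choice; you should still eyeball the time series for each run to confirm that the throughput really has flattened out before the window begins.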
Available load testing tools include:
- OpenSTA, based on CORBA.
- The Grinder, a Java load testing framework.
- Microsoft's Web Application Stress tool.
- WebLOAD, open source edition.
- On the commercial side, Mercury (HP) LoadRunner owns something like 40% of that market.
- The authors only measure (or report in the article) two load levels: N = 25 and N = 50, which is hardly useful for generating a scalability plot.
- They commit the grievous sin of changing more than one thing at once across tests, viz., the user load (N) and the payload size (KB).
They should also have exhibited the throughput data, since the discussion is about application scalability.
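To make the one-thing-at-a-time point concrete, here is a small Python sketch of a test plan that varies only the user load N and holds the payload size fixed, plus a check that flags any plan in which more than one setting changes between runs. The function names, the dictionary layout, the 10 KB payload, and the extra load levels beyond N = 25 and N = 50 are all hypothetical values chosen for illustration.

```python
def load_test_plan(user_loads, payload_kb, hold_minutes=15):
    """Build one run spec per load level, holding everything else constant."""
    return [
        {"users": n, "payload_kb": payload_kb, "hold_minutes": hold_minutes}
        for n in sorted(set(user_loads))
    ]


def varied_factors(plan):
    """Return the names of the settings that differ between runs in a plan."""
    keys = plan[0].keys() if plan else []
    return sorted(k for k in keys if len({run[k] for run in plan}) > 1)


if __name__ == "__main__":
    # Several load points rather than two, with a fixed (assumed) 10 KB payload.
    plan = load_test_plan([1, 5, 10, 25, 50], payload_kb=10)
    print(varied_factors(plan))  # only the user load should vary

    # A plan that changes N and the payload together fails the check.
    confounded = [
        {"users": 25, "payload_kb": 10, "hold_minutes": 15},
        {"users": 50, "payload_kb": 100, "hold_minutes": 15},
    ]
    print(varied_factors(confounded))
```

If you do want to study the effect of payload size, run a separate sweep over N for each payload size, so that every scalability curve has a single independent variable.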