The Pith of Performance: Crib Sheet for Emulating Web Traffic

Saturday, October 8, 2016

Our paper, How to Emulate Web Traffic Using Standard Load Testing Tools (PDF), is now available online and will be presented at the upcoming CMG conference in November.

Presenter: James Brady (co-author: Neil Gunther)
Session Number: 436
Subject Area: APM
Session Date: Wed, November 9, 2016
Session Time: 1:00 PM - 2:00 PM
Session Room: PortofinoB

The motivation for this work harks back to a Guerrilla forum in 2014 that essentially centered on the same topic as the title of our paper. It was clear from that discussion that commenters were talking at cross purposes because of misunderstandings on many levels. I had already written a much earlier blog post on the key queue-theoretic concept, viz., holding the $N/Z$ ratio constant as the load $N$ is increased, but I was unable to describe how that concept should be implemented in a real load-testing environment.

On the other hand, I knew that Jim Brady had presented a similar implementation in his 2012 CMG paper, based on a statistical analysis of the load-generation traffic. There were a few details that I couldn't quite reconcile in Jim's paper but, at the CMG 2015 conference in San Antonio, I suggested that we should combine our separate approaches and aim at a definitive work on the subject. After nine months' gestation (ugh!), this 30-page paper is the result.

Although our paper doesn't contain any new invention, per se, the novelty lies in how we needed to bring together so many disparate and subtle concepts in precisely the correct way to reach a complete and consistent methodology. The complexity of this task was far greater than either of us had imagined at the outset. The hyperlinked Glossary should help with the terminology, but because there are so many interrelated parts, I've put together the following crib notes in an effort to help performance engineers get through it (since they're the ones who stand to benefit most). The key results of our paper are indicated by ◀.

Standard load-test tools only allow a finite number of virtual users
Web traffic is characterized by an unlimited number of real users
Attention usually focused on SUT (system under test) performance
We focus on the GEN (load generator) characteristics
Measure distribution of web requests and their mean arrival rate
Web traffic should be a Poisson process (cf. A.K. Erlang 1909) ◀
Requires statistically independent arrivals, i.e., no correlations
Independent requests should arrive asynchronously into the SUT
But virtual-user requests are synchronized (correlated) in SUT queues ◀
De-correlate arrivals by shrinking SUT queues (Principle A): ◀
- Shrink by increasing think delay $Z$ in proportion to user load $N$, so the ratio $N/Z$ stays fixed
- Traffic rate $\lambda$ approaches a constant $\lambda = N/Z$ as SUT queues shrink
Check requests are Poisson by measuring the coefficient of variation ($CoV$)
Require $CoV \approx 1$ for a Poisson process (Principle B) ◀
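
As a rough illustration of Principles A and B (a sketch under stated assumptions, not code from our paper), the following Python snippet emulates $N$ independent virtual users with exponentially distributed think delays of mean $Z$. It assumes the SUT response time is zero, i.e., the limit in which SUT queues have shrunk away, and it assumes the $CoV$ is taken over the gaps between successive requests. The function names, parameter values, and the use of the standard closed-system relation $\lambda = N/(R+Z)$ for response time $R$ are illustrative choices.

```python
import random
import statistics


def closed_system_rate(n_users, think_time, response_time):
    """Mean request rate of N users who each cycle through think + response time."""
    return n_users / (think_time + response_time)


def emulate_arrivals(n_users, think_time, duration, rng):
    """Merged request times of N independent users over [0, duration].

    Each user waits an exponentially distributed think delay (mean Z) between
    requests. SUT response time is assumed to be zero, i.e. the limit in
    which SUT queues have shrunk away.
    """
    arrivals = []
    for _ in range(n_users):
        t = rng.expovariate(1.0 / think_time)
        while t <= duration:
            arrivals.append(t)
            t += rng.expovariate(1.0 / think_time)
    return sorted(arrivals)


def interarrival_cov(arrival_times):
    """Coefficient of variation (std dev / mean) of the gaps between arrivals."""
    times = sorted(arrival_times)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return statistics.pstdev(gaps) / statistics.mean(gaps)


if __name__ == "__main__":
    rng = random.Random(2016)  # fixed seed so runs are repeatable
    duration = 50.0
    # Principle A: scale Z with N so that N/Z (here 100 per unit time) is fixed.
    for n_users, think_time in [(100, 1.0), (1000, 10.0)]:
        arrivals = emulate_arrivals(n_users, think_time, duration, rng)
        rate = len(arrivals) / duration
        # Principle B: CoV should come out close to 1 for Poisson arrivals.
        print(n_users, think_time, round(rate, 1),
              round(interarrival_cov(arrivals), 2))
    # Contrast: perfectly paced requests have no variation, so CoV = 0.
    print(interarrival_cov([i * 0.5 for i in range(100)]))
```

In both emulated cases the measured rate should sit near $N/Z$ and the $CoV$ near 1, whereas the evenly paced stream gives a $CoV$ of 0. Calling `closed_system_rate` with a nonzero response time shows how the rate falls below $N/Z$ when requests spend time in the SUT.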

Originally, I assumed the paper would be no more than a third of its current length. Wrong! My only defense is: it's all there, you just need to read it. (tl;dr doesn't apply.) Apologies in advance, but hopefully, these crib notes will help you.

4 comments:

test said...: Very interesting.
Have you thought about issues emulating IoT traffic?; Friday, July 28, 2017 at 8:35:00 PM PDT
Neil Gunther said...: Interesting question; never even occurred to me.

Can you provide some details on how IoT traffic differs from web traffic?; Saturday, July 29, 2017 at 9:01:00 AM PDT
Neil Gunther said...: FYI: Just saw this on Twitter ... How performance testing the IoT is different.; Wednesday, August 2, 2017 at 11:55:00 AM PDT
ks said...: Thanks for the article.

Indeed, I have also seen articles and papers on trying to improve benchmarking and performance testing for IoT. Some others that may be of interest:

• IoTAbench: an Internet of Things Analytics benchmark: http://www.hpl.hp.com/techreports/2014/HPL-2014-75.pdf
• RIoTBench: A Real-time IoT Benchmark for Distributed Stream Processing Platforms: https://arxiv.org/pdf/1701.08530.pdf
• A Model to Evaluate the Performance of IoT Applications http://www.iaeng.org/publication/IMECS2017/IMECS2017_pp147-150.pdf
• TPCx-IoT: http://www.tpc.org/tpc_documents_current_versions/pdf/tpcx-iot_v1.5.x.pdf

For emulating IoT traffic, I believe most are using non-standard tooling to emulate the devices. For example, a cluster that runs containers with device logic in them. So, they may indeed test up to the actual number of emulated devices vs. simulating with a combination of virtual users and think time. So this may be a difference in approach.

I think a potentially large difference between IoT solutions and web apps is that they:
1. Are messaging based: data flow of message sources and sinks
2. Implement hot, warm, and cold paths
3. Leverage real-time analytics (CEP, stream processing), in-memory computing (Spark, etc.), indexed storage (Solr, etc.), and MapReduce (Spark, Hadoop)
4. Can be thought of as a data flow of sources and sinks

(1) Event Production -> (2) Event Queueing & Stream Ingestion -> (3) Stream Analytics -> (4) Storage & Batch Analysis -> (5) Presentation and Action

And technologies below:

1) Devices and Gateways
2) Azure Event Hubs, IoT Hub ; or Kafka
3) Azure Stream Analytics; or Spark Streaming
4) Azure Data Lake, CosmosDB, SQL Database, SQL Data Warehouse; or Spark, Hadoop [*]
5) Microsoft Power BI; or Tableau

[*] http://perfdynamics.blogspot.com/2015/03/hadoop-scalability-challenges.html. Big Data is usually part of an IoT solution.

With this type of complexity, it seems critical to have the telemetry needed to do performance and scalability analysis.

1. Ideally, you would capture telemetry for each message with timestamps at key points in the data flow, including a correlation ID for end-to-end visibility
2. Or, if that is not possible (because development has not been done, or throughput is too high), just capture the metrics that are needed for performance and scalability analysis:
At important points along the message data flow, capture: time, count, rate. What metrics would you recommend capturing in a messaging system like this? For example, in (#2), in the stream ingestion code, this may be a cluster of N VMs mapped 1:1 to partitions in IoT Hub, each processing messages in batches of X messages. Then sending for further processing to (#3) Stream Analytics or for storage, analytics to (#4) Data; Sunday, August 6, 2017 at 7:31:00 AM PDT