Leading up to the release of Load Tester 5.0, the Web Performance development team focused heavily on improving our capability to run massive load tests. Today, Load Tester 5.0 is specifically engineered to deliver as many as 1 million virtual users while controlling 500 remote load engines.
This is a “how to” article for Load Tester 5.0 users wanting to run their own massive load tests.
There are a few things you absolutely need. First and foremost is a modern workstation for the controller. By modern, I mean a 64-bit architecture with at least 7 GB of working memory; this is an absolute must. To benefit from the 64-bit architecture, you will also need to run the 64-bit version of the controller. CPU power is less critical: we ran successful tests on an in-house Intel i7-860 and an Amazon EC2 High-CPU Extra Large Instance (c1.xlarge). The controller can take advantage of up to three hardware threads.
You may also save time during the engine-initialization phase by having plenty of upload bandwidth, especially if you have a testcase that uploads a lot of data or if you have a very large dataset that is marked both re-usable and sharable. Most testcases don’t fit this description, but if this does describe you, consider installing Load Tester on an Amazon cloud instance where it will have oodles of bidirectional bandwidth.
(Remember, these specifications are for those wanting to run load tests to 1 million users — for a typical 1-hour test running between 100 and 10,000 users, system requirements are largely a non-issue.)
Finally, you’ll need a lot of cloud instances. Load Tester is integrated with Amazon EC2, so you can launch arbitrary numbers of cloud instances from within Load Tester at the push of a button. You’ll need to estimate the number of cloud instances required to successfully execute your test. For example, if you run 10 load engines to 20,000 virtual users before they cap out, then you’ll need a bare minimum of 500 load engines to run to 1 million users. Not all test cases will scale this efficiently, especially those that have very short think times. Load engines self-monitor and will stop adding virtual users when they are at capacity.
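The extrapolation above can be sketched in a few lines. The helper name and the optional headroom factor are illustrative, not part of Load Tester; the headroom pads the estimate for test cases that scale less efficiently, such as those with very short think times.

```python
import math

def engines_needed(target_users, engines_tested, users_reached, headroom=1.0):
    """Estimate the load engines required, extrapolating from a trial run.

    headroom > 1.0 pads the estimate for test cases that scale less
    efficiently than the trial did.
    """
    users_per_engine = users_reached / engines_tested
    return math.ceil(target_users * headroom / users_per_engine)

# The example from the text: 10 engines capped out at 20,000 users,
# so reaching 1 million users takes a bare minimum of 500 engines.
print(engines_needed(1_000_000, 10, 20_000))       # -> 500

# With 20% headroom for a less efficient test case:
print(engines_needed(1_000_000, 10, 20_000, 1.2))  # -> 600
```

Because engines self-monitor and stop adding users at capacity, treat the result as a floor, not a guarantee.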
Be aware that Amazon imposes a quota on the maximum number of cloud instances that can be run by any customer at any given time. The default quota at this time is 20 engines per region. It usually takes a few days for the quota increase to be approved, so plan accordingly. (Amazon has a contact form for this specific purpose at: http://aws.amazon.com/contact-us/ec2-request/.)
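The quota arithmetic is worth doing before you file the request. A quick sketch, assuming you spread the engines evenly across a chosen number of regions (the function name and the four-region split are my illustration, not a recommendation from Amazon or Web Performance):

```python
import math

DEFAULT_QUOTA_PER_REGION = 20  # Amazon's default at the time of writing

def quota_needed_per_region(total_engines, regions):
    """Per-region instance quota required to fit total_engines
    when they are spread evenly across the given regions."""
    return math.ceil(total_engines / regions)

# 500 engines across 4 regions needs a quota of 125 per region,
# far above the default of 20 -- so request the increase early.
print(quota_needed_per_region(500, 4))  # -> 125
```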
This probably goes without saying, but your website will need to be reachable from the Amazon cloud.
You’ll need to allocate extra memory to Load Tester, and you’ll need to do this up-front so that Load Tester’s pre-flight self-checks don’t report a shortage. Edit the “webperformance.ini” file in Load Tester’s installation directory. The two lines “-Xms1000m” and “-Xmx1000m” (the values you see might be different, but the flags are the same) control the initial and maximum size of Load Tester’s memory heap. If you set this number too low, Load Tester will run out of memory. If you set it too high, you might starve processes that live outside of the heap. About two-thirds of available hardware memory is a good rule of thumb. For example, on an 8-gigabyte workstation, I would set both values to about 5500m.
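The two-thirds rule of thumb can be sketched as a small calculation (the helper is mine; the fraction is the article's rule of thumb, not a hard limit):

```python
def heap_size_mb(physical_gb, fraction=2/3):
    """Suggested -Xms/-Xmx value in megabytes: roughly two-thirds of
    physical memory, rounded to the nearest 100 MB."""
    return round(physical_gb * 1024 * fraction / 100) * 100

# 8 GB workstation -> roughly the 5500m suggested in the text.
print(f"-Xms{heap_size_mb(8)}m")  # -> -Xms5500m
print(f"-Xmx{heap_size_mb(8)}m")  # -> -Xmx5500m
```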
In the load configuration, disable the options named “Detailed Page Durations” and “Individual URL Metrics.” These options involve features of Load Tester that don’t scale well to hundreds of thousands of users; “Detailed Page Durations” in particular collects a data sample for every page load. Also, under “Error Recording”, don’t raise the “Number of descriptions” or “Number of pages” settings above their default values, since these counts already apply per-engine.
Best practice is to consider the failure of even one load engine to invalidate the test results from the moment of failure forward. When testing the high-performance features of Load Tester, we never encountered a load engine failure mid-test; however, as with all engineered systems, increasing the number of components increases the probability of a failure. Evidence of a lost load engine will first manifest when composite live statistics (such as the yellow average page duration chart) stop updating. These statistics are only updated after all load engines have reported. Eventually, either the backlog will clear or Load Tester’s built-in self-checks will determine that a material failure has occurred and report it explicitly.
Use the Engines View to monitor the health of the load engines as they are running. This view monitors CPU utilization, memory, upstream and downstream bandwidth, and ping time (round-trip latency). Excessive values for any of these numbers can be considered evidence of a problem.
The Status View monitors the health of the controller. The “Memory” sub-heading of this view can report memory pressure. You can allocate more memory to Load Tester’s heap (using the procedure described above) if this value seems problematic. Using the “Cleanup…” tool before each load test to remove unneeded materials from the repository can also reduce memory pressure. The previous advice regarding “Detailed Page Durations” and “Individual URL Metrics” also applies here, as those options strongly impact memory utilization.
The “Diagnostic” sub-heading monitors Load Tester’s two data-processing pipelines and will only show activity during massive load tests. The visualization queue handles chart updates; you can clear any backlog on this queue by disabling unneeded charts. The metric storage queue performs a critical function, and its backlog should stay at zero, although momentary bursts are acceptable. If this queue is persistently overloaded, consider moving the controller to a machine with more CPU and memory.